vnadgir / dkpro-core-asl

Automatically exported from code.google.com/p/dkpro-core-asl

Integrate AraNLP #566

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Integrate AraNLP.

---

https://sites.google.com/site/mahajalthobaiti/resources

The AraNLP library is a Java-based toolkit for processing Arabic text. It
supports the most important preprocessing steps, such as diacritic and
punctuation removal, tokenization, sentence segmentation, part-of-speech
tagging, root stemming, light stemming, and word segmentation. These tools are
usually required to prepare text for more advanced NLP tasks.
The goal of AraNLP is to gather most of the vital Arabic text preprocessing
tools into one library that can be accessed easily. To that end, we
incorporated missing tools and included existing algorithmic resources.
AraNLP has already been used in many experiments to prepare Arabic text, and
it preprocessed the corpus successfully.
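
For anyone scoping the integration, here is a rough sketch of what two of the listed steps (diacritic removal and punctuation removal, followed by a naive whitespace tokenization) amount to. It uses only the JDK and does not call AraNLP; the class name `ArabicPreprocessSketch`, its method names, and the chosen code point range are assumptions made for illustration, and AraNLP's actual API and behavior may differ.

```java
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of AraNLP or DKPro Core.
final class ArabicPreprocessSketch {

    // Assumption: treat U+064B..U+0652 (tanween, short vowels, shadda, sukun)
    // and U+0670 (superscript alef) as the diacritics to strip.
    private static final Pattern DIACRITICS =
            Pattern.compile("[\\u064B-\\u0652\\u0670]");

    // \p{P} is the Unicode "punctuation" general category, which also covers
    // the Arabic comma (U+060C) and the Arabic question mark (U+061F).
    private static final Pattern PUNCTUATION = Pattern.compile("\\p{P}+");

    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private ArabicPreprocessSketch() {
    }

    /** Deletes the diacritic marks listed above; all other characters are kept. */
    static String removeDiacritics(String text) {
        return DIACRITICS.matcher(text).replaceAll("");
    }

    /** Replaces punctuation runs with a space, then collapses and trims whitespace. */
    static String removePunctuation(String text) {
        String spaced = PUNCTUATION.matcher(text).replaceAll(" ");
        return WHITESPACE.matcher(spaced).replaceAll(" ").trim();
    }

    /** Naive tokenization: cleans the text and splits it on whitespace. */
    static String[] tokenize(String text) {
        String cleaned = removePunctuation(removeDiacritics(text));
        return cleaned.isEmpty() ? new String[0] : WHITESPACE.split(cleaned);
    }
}
```

A DKPro Core wrapper would presumably delegate to AraNLP's own classes rather than to regular expressions like these; the sketch only illustrates the input and output shape such components would need to handle.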

Please cite our paper in any published work using this resource: 
@inproceedings{Althobaiti14AraNLP,
  title={{AraNLP: a Java-Based Library for the Processing of Arabic Text}},
  author={M. Althobaiti and U. Kruschwitz and M. Poesio},
  booktitle={Proceedings of the 9th Language Resources and Evaluation Conference (LREC)},
  year={2014},
  address={Reykjavik}
}

Original issue reported on code.google.com by richard.eckart on 17 Dec 2014 at 9:56