d40cht / Careers

Wikipedia-dump based natural language processing for named entity recognition

Careers

This project is inspired primarily by the paper "Large-Scale Named Entity Disambiguation Based on Wikipedia Data" (Silviu Cucerzan, Microsoft Research).

The Wikipedia dump is processed with Hadoop; the disambiguation step is implemented in Scala.
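
To give a rough feel for what a disambiguation step can look like, here is a minimal Scala sketch. It is not this repository's code and it is much simpler than the method in Cucerzan's paper: it assumes each candidate entity carries a set of context terms (as might be extracted from a Wikipedia dump) and picks the candidate whose terms overlap most with the document. The names `Entity`, `tokenize`, `score`, `disambiguate`, and the sample data are all hypothetical.

```scala
// Hypothetical, simplified sketch: names and data are illustrative only
// and are not taken from this repository.
object DisambiguationSketch {
  // A candidate Wikipedia entity with context terms associated with its page.
  final case class Entity(title: String, contextTerms: Set[String])

  // Split a document into a set of lowercase alphanumeric tokens.
  def tokenize(text: String): Set[String] =
    text.toLowerCase.split("[^a-z0-9]+").filter(_.nonEmpty).toSet

  // Score a candidate by how many of its context terms occur in the document.
  def score(entity: Entity, docTerms: Set[String]): Int =
    entity.contextTerms.count(docTerms.contains)

  // Return the highest-scoring candidate for a surface form, or None if the
  // surface form has no known candidates. Ties keep the earlier candidate.
  def disambiguate(
      surfaceForm: String,
      candidates: Map[String, Seq[Entity]],
      document: String
  ): Option[Entity] = {
    val docTerms = tokenize(document)
    candidates
      .getOrElse(surfaceForm, Seq.empty)
      .reduceOption((a, b) => if (score(b, docTerms) > score(a, docTerms)) b else a)
  }

  // Assumed sample data: two entities sharing the surface form "Columbia".
  val sampleCandidates: Map[String, Seq[Entity]] = Map(
    "Columbia" -> Seq(
      Entity("Columbia University", Set("university", "campus", "students", "york")),
      Entity("Space Shuttle Columbia", Set("nasa", "shuttle", "orbiter", "mission"))
    )
  )

  def main(args: Array[String]): Unit = {
    val doc = "The shuttle Columbia completed its NASA mission."
    println(disambiguate("Columbia", sampleCandidates, doc).map(_.title))
  }
}
```

A real system would weight terms and use richer signals such as category information; plain set overlap is used here only to keep the sketch short.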