freme-project / freme-ner

FREME-NER datasets for training different classifier implementations #179

Open reckart opened 7 years ago

reckart commented 7 years ago

Are the FREME-NER datasets available to train alternative classifier implementations, e.g. Apache OpenNLP NER?

m1ci commented 7 years ago

Hi, yes. We trained on the DBpedia abstracts dataset (http://wiki.dbpedia.org/nif-abstract-datasets), which is in the NIF format, so you'll need to write a small script that reads NIF and creates the input for the learner. That is what we did for Stanford NER: we wrote a script that converts NIF to Stanford NER input. It would be great to have a generic NIF2Any converter for learner input.
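To illustrate the kind of conversion described above, here is a minimal sketch in Python. It is not the FREME script. It assumes the NIF has already been parsed (for example with an RDF library) into the abstract text (`nif:isString`) and a list of character-offset annotations (`nif:beginIndex`, `nif:endIndex`). The function name `spans_to_token_lines`, the regex tokenizer, and the `LOCATION` labels are hypothetical; the DBpedia abstracts link to resources, so an entity type label would have to be looked up separately. The output is a two-column, tab-separated `token<TAB>label` layout, which Stanford NER can be configured to read as training data.

```python
import re
from typing import Iterable, List, Tuple

# (begin, end, label): character offsets in the style of
# nif:beginIndex / nif:endIndex, plus a label chosen by the caller.
Span = Tuple[int, int, str]

# Naive tokenizer (an assumption): runs of word characters,
# or single punctuation characters.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def spans_to_token_lines(
    text: str, spans: Iterable[Span], outside: str = "O"
) -> List[str]:
    """Return one 'token<TAB>label' line per token in text.

    A token receives a span's label only if it lies entirely inside
    that span; every other token receives the background label.
    """
    ordered = sorted(spans)
    lines = []
    for match in TOKEN_RE.finditer(text):
        label = outside
        for begin, end, span_label in ordered:
            if begin <= match.start() and match.end() <= end:
                label = span_label
                break
        lines.append(f"{match.group()}\t{label}")
    return lines


# Hypothetical example data, not taken from the DBpedia dumps.
example_text = "Berlin is the capital of Germany."
example_spans = [(0, 6, "LOCATION"), (25, 32, "LOCATION")]
example_lines = spans_to_token_lines(example_text, example_spans)
```

Writing `"\n".join(example_lines)` to a file, with a blank line between documents, gives one training file per batch of abstracts. A real converter would also need a proper tokenizer and a mapping from linked resources to entity types.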

reckart commented 7 years ago

DKPro Core might help you out here :)

m1ci commented 7 years ago

glad to hear that! will definitely look into it.

m1ci commented 7 years ago

@reckart we have just released the latest version of the DBpedia abstracts for several languages. They are a nice source for training NER; see http://downloads.dbpedia.org/2016-10/core-i18n/

Let us know if you have any questions.

Best, Milan

reckart commented 7 years ago

@m1ci phew, these files are huge! I was kind of hoping for a ZIP with one ttl file per article. How do you work with such large files? Would you recommend an RDF store?