EuropeanaNewspapers / ner-corpora

Named Entity Recognition data for Europeana Newspapers
http://www.europeana-newspapers.eu/

date of the news #49

Open jwijffels opened 4 years ago

jwijffels commented 4 years ago

Interesting resource! Thanks for sharing. I am looking to set up a NER model for 18th-19th century Dutch. Is there anywhere I can find the date of the news for each transcribed newspaper? I've looked in the ALTO files and the dates are not there either.

cneud commented 4 years ago

Hi and sorry for the late reply! I have been a bit more active with this on my fork here https://github.com/cneud/ner-corpora.

About the date of issue: unfortunately this information is missing in both BIO and ALTO files. I originally came up with a matching table and procedure to manually correct this, but haven't followed through for the KB data. Also since theeuropeanlibrary.org was taken down, only Delpher is left to search these pages :/
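For anyone who wants to attempt the matching themselves, here is a minimal sketch of what such a matching table could look like in practice. It is not the actual table or procedure mentioned above: the column names `page_id` and `issue_date`, the identifiers, and the dates are all made up for illustration. The idea is only that a hand-maintained table maps each page identifier to a date of issue, and pages without an entry are flagged for manual lookup.

```python
import csv
import io
from datetime import date


def load_date_table(csv_text):
    """Parse a hypothetical two-column table: page identifier, issue date (ISO 8601)."""
    table = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[row["page_id"]] = date.fromisoformat(row["issue_date"])
    return table


def attach_dates(page_ids, table):
    """Pair each page identifier with its issue date, or None if it is not in the table."""
    return [(pid, table.get(pid)) for pid in page_ids]


# Made-up identifiers and dates, for illustration only.
EXAMPLE_TABLE = """page_id,issue_date
page_0001,1865-03-14
page_0002,1790-11-02
"""

if __name__ == "__main__":
    table = load_date_table(EXAMPLE_TABLE)
    for pid, issued in attach_dates(["page_0001", "page_0003"], table):
        # Pages missing from the table still need a manual search, e.g. in Delpher.
        print(pid, issued.isoformat() if issued else "unknown")
```

Keeping the table as a plain CSV means it can be corrected by hand and versioned alongside the BIO files.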