JuliaText / CorpusLoaders.jl

A variety of loaders for various NLP corpora.

NER Datasets #27

Open Ayushk4 opened 5 years ago

Ayushk4 commented 5 years ago

I will be testing the NER API, which is currently a work in progress, on a number of datasets. I could add support for these in CorpusLoaders as well. Some of them could be among the following:

Should I add corpora/datasets like these to CorpusLoaders?

oxinabox commented 5 years ago

Sounds like a good idea to me

Ayushk4 commented 5 years ago

While adding the Groningen Meaning Bank (via DataDeps), I am encountering this error:

IOError(MbedTLS error code -9984: X509 - Certificate verification failed, e.g. CRL, CA or signature check failed during request(https://gmb.let.rug.nl/releases/gmb-1.0.0.zip))

The same error occurs on JuliaBox. Is this something that has been added to prevent bots from downloading the data? Kaggle does not allow downloading the data directly either. I also tried using DataDepsGenerators.