piskvorky / gensim-data

Data repository for pretrained NLP models and NLP corpora.
https://rare-technologies.com/new-api-for-pretrained-nlp-models-and-datasets-in-gensim/
GNU Lesser General Public License v2.1

Incorrect outdated link #21

Closed by aneesh-joshi 6 years ago

aneesh-joshi commented 6 years ago

There seems to be an outdated link here:

https://github.com/RaRe-Technologies/gensim-data/blob/fcc89c2d6832cbf19fdc1b896451dc3432631833/list.json#L105

The Wikimedia website seems to have stopped hosting it. I couldn't figure out the JSON formatting, so I didn't make a PR.

aneesh-joshi commented 6 years ago

I'm referring to this link:

https://dumps.wikimedia.org/enwiki/20171001

menshikh-iv commented 6 years ago

@aneesh-joshi this is OK, because that link is where we picked up the data. Yes, Wikimedia no longer hosts that dump, but the entry only records where the data came from.