facebookresearch / fastText

Library for fast text representation and classification.
https://fasttext.cc/
MIT License
25.87k stars 4.71k forks

🔥 Any plans to update the pre-trained model for Language Identification? #1323

Open lord-alfred opened 1 year ago

lord-alfred commented 1 year ago

I don't know exactly when lid.176.bin was released, but according to the Web Archive (https://web.archive.org/web/20180104060303/https://fasttext.cc/docs/en/language-identification.html), it has been over 5 years.

Could the developers at Facebook schedule a new training run on newer data (so much has appeared on the web in 5 years!) and release an updated model?

I think a lot of people would benefit from an update. I'm sure accuracy would improve, because the amount of text on Wikipedia and Tatoeba alone has increased tremendously.

advnpzn commented 1 year ago

I would love the same.