mayabot / mynlp

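A production-grade, high-performance, modular, and extensible Chinese NLP toolkit. (Chinese word segmentation, averaged perceptron, fastText, pinyin, new word discovery, segmentation correction, BM25, person name recognition, named entity recognition, custom dictionaries)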
https://mynlp.mayabot.com/
Apache License 2.0

Cannot load pre-trained word vectors #22

Closed liefra closed 4 years ago

liefra commented 4 years ago

I tried to load pre-trained word vectors, but received the following error: Caused by: java.lang.IllegalArgumentException: Unknown LossName enum second :774911284

I load the model with: val model = FastText.loadModelFromSingleFile(File("/Users/liefra/crawl-300d-2M.vec"))

Is this an issue, or just me doing it wrong?

jimichan commented 4 years ago

The single-file format is a Java model file used by fasttext4j. Maybe you can use FastText.loadCppModel instead to load a .bin model file from fastText, for example https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz (unzip it before loading).
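
One plausible reading of the error, sketched below: the number 774911284 is 0x2E303534, and its four bytes are all printable ASCII. That suggests a binary loader interpreted text from the .vec file as an integer field. The helper names here are hypothetical and not part of mynlp, and the magic number is the value I believe the fastText C++ source writes at the start of .bin models (FASTTEXT_FILEFORMAT_MAGIC_INT32), so treat it as an assumption to verify. The .vec header line in main is illustrative, not taken from the actual file.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Assumption: magic number at the start of fastText .bin models.
const val FASTTEXT_MAGIC: Int = 793712314

// Hypothetical helper: show the four bytes of an Int as ASCII characters
// in little-endian order, replacing non-printable bytes with '?'.
fun intAsAscii(value: Int): String {
    val bytes = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array()
    return bytes.map { b ->
        val c = b.toInt() and 0xFF
        if (c in 0x20..0x7E) c.toChar() else '?'
    }.joinToString("")
}

// Hypothetical helper: true if the first four bytes, read as a
// little-endian Int, match the assumed fastText magic number.
fun looksLikeFastTextBin(header: ByteArray): Boolean {
    if (header.size < 4) return false
    val first = ByteBuffer.wrap(header, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt()
    return first == FASTTEXT_MAGIC
}

fun main() {
    // The value from the error message decodes to digits and a dot.
    println(intAsAscii(774911284)) // prints "450."

    // Illustrative text header of the kind a .vec file starts with.
    val vecHeader = "1999995 300\n".toByteArray(Charsets.US_ASCII)
    println(looksLikeFastTextBin(vecHeader)) // prints "false"
}
```

A check like this, run on the first bytes of a file before choosing a loader, would catch a text .vec file early instead of failing deep inside a binary parser.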

liefra commented 4 years ago

Thank you very much for your super fast reply :)

Yes, it works when I load it with the bin format: val model = FastText.loadCppModel(File("/Users/liefra/cc.en.300.bin"))