Support NLLB's LID model

huggingface / transformers

Support NLLB's LID model #18294

Closed: xia0nan closed this issue 2 years ago

xia0nan commented 2 years ago

Model description

Thanks for supporting NLLB and closing https://github.com/huggingface/transformers/issues/18043. Could Hugging Face also support NLLB's language identification model? It is described as: "LID (Language IDentification) model to predict the language of the input text."

Open source status

[X] The model implementation is available
[X] The model weights are available

Provide useful links for the implementation

https://github.com/facebookresearch/fairseq/tree/nllb#lid-model

xia0nan commented 2 years ago

I figured out how to use it. No issue now.

xia0nan commented 2 years ago

First, download the model:

wget https://dl.fbaipublicfiles.com/nllb/lid/lid218e.bin

Then use it for inference:

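import fasttext

# Path to the LID model downloaded with wget above
pretrained_lang_model = "lid218e.bin"
model = fasttext.load_model(pretrained_lang_model)

text = "これ、浅草に、行きますか"
# k=1 returns only the top prediction, as a (labels, probabilities) tuple;
# each label carries fastText's "__label__" prefix
predictions = model.predict(text, k=1)
print(predictions)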
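
The raw output of predict() is a (labels, probabilities) pair, so a little post-processing makes it easier to work with. Below is a minimal sketch, not part of the original answer: the helper names clean_text, parse_prediction, and identify_language are hypothetical, and it assumes the labels use fastText's default "__label__" prefix. It also collapses newlines first, since fastText's predict() handles one line at a time and raises an error on text containing "\n".

```python
from typing import List, Sequence, Tuple

# fastText's default label prefix (assumed to be what lid218e.bin uses)
LABEL_PREFIX = "__label__"


def clean_text(text: str) -> str:
    """Collapse newlines and repeated whitespace into single spaces."""
    return " ".join(text.split())


def parse_prediction(
    labels: Sequence[str], probs: Sequence[float]
) -> List[Tuple[str, float]]:
    """Convert predict()'s (labels, probabilities) pair into (code, score) tuples."""
    return [
        (label[len(LABEL_PREFIX):] if label.startswith(LABEL_PREFIX) else label, float(p))
        for label, p in zip(labels, probs)
    ]


def identify_language(model, text: str, k: int = 1) -> List[Tuple[str, float]]:
    """Run LID on one text; `model` is the object returned by fasttext.load_model()."""
    labels, probs = model.predict(clean_text(text), k=k)
    return parse_prediction(labels, probs)


# Usage (requires the fasttext package and the downloaded lid218e.bin):
#   import fasttext
#   model = fasttext.load_model("lid218e.bin")
#   print(identify_language(model, "これ、浅草に、行きますか", k=3))
```

Passing the model in as an argument keeps the helpers independent of how and where the model file is loaded, so the same functions work for single strings or inside a loop over many texts.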