Closed samhuang1991 closed 1 month ago
Hi @samhuang1991, I'll reply here to both comments you made. When I was evaluating which models to use for translation I also tested Opus-MT, which, as you described, has a separate model for each language pair; the problem is that, as you suggested, the quality is not good enough. With Opus-MT Big the quality is close to NLLB, but so is the per-pair model size (232M vs 600M parameters), so in the end the reduction in RAM consumption and download size would not be large (keep in mind that translating between two languages other than English requires two models). So while I don't rule out using them in the future (especially since Opus-MT is completely open-source, without restrictions), for now I prefer to focus on optimizing NLLB, either to increase its quality or to make it smaller and more efficient. OnnxRuntime now supports 4-bit quantization, so if everything goes well I could halve the size of the current model and also offer the option of choosing bigger models.
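To make the trade-off concrete, here is a back-of-the-envelope size comparison based on the parameter counts above. The bits-per-parameter values are my assumptions for common quantization levels (int8 for the current models, 4-bit for the hypothetical quantized NLLB), not measured file sizes:

```python
def model_size_mb(params_millions: float, bits_per_param: float) -> float:
    """Approximate on-disk/RAM size in MB for a given parameter count."""
    return params_millions * 1e6 * bits_per_param / 8 / 1e6

nllb_int8 = model_size_mb(600, 8)   # current NLLB, assuming int8 weights
nllb_4bit = model_size_mb(600, 4)   # hypothetical 4-bit-quantized NLLB
opus_big  = model_size_mb(232, 8)   # one Opus-MT Big direction, assuming int8

# Translating between two non-English languages with Opus-MT needs
# two models (X -> en, then en -> Y), so the pairwise cost doubles:
opus_pair = 2 * opus_big

print(f"NLLB int8:        {nllb_int8:.0f} MB")  # 600 MB
print(f"NLLB 4-bit:       {nllb_4bit:.0f} MB")  # 300 MB
print(f"Opus-MT Big pair: {opus_pair:.0f} MB")  # 464 MB
```

So under these assumptions a 4-bit NLLB would actually be smaller than a pair of Opus-MT Big models, which is why quantization looks like the more promising direction.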
A small model for each language would improve download speeds, reduce the load on the phone, and should also improve translation speed. When the user selects a language, the ONNX model file would be downloaded on demand. Is there a small translation model?
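The on-demand download flow described above could be sketched roughly like this. The base URL is a placeholder and the file naming is an assumption, not the app's actual layout:

```python
import urllib.request
from pathlib import Path

# Placeholder host; the real app would point at its own model repository.
MODEL_BASE_URL = "https://example.com/models"

def ensure_model(lang_pair: str, cache_dir: Path = Path("models")) -> Path:
    """Return the cached ONNX model for a language pair, downloading it on first use."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{lang_pair}.onnx"
    if not target.exists():
        urllib.request.urlretrieve(f"{MODEL_BASE_URL}/{lang_pair}.onnx", target)
    return target
```

With per-pair models this keeps the initial install small, at the cost of a download the first time each new language is selected.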