pluiez / NLLB-inference


Average time for prediction #5

Open Okohedeki opened 2 years ago

Okohedeki commented 2 years ago

Hello,

I was curious how long your predictions take to run. It took a couple of seconds for me, so I was wondering whether that is just due to the NLLB model itself, or whether you experienced something different, which would suggest my setup is misconfigured somewhere.

pluiez commented 2 years ago

That depends largely on which device and which checkpoint you use. The smallest checkpoint has 600M parameters, which is already quite large compared to many commonly used pretrained models, so inference with a model of this size is expected to take some time.

fatjoni commented 2 years ago

Is there some way to reduce the time, for example by keeping the model loaded, or some other approach @pluiez? I noticed there is no Docker version and I could probably help with one, but I would like to know how to keep the model preloaded so a translation takes 1-2 seconds rather than the 30+ seconds it currently takes.

Okohedeki commented 2 years ago

If you want to preload it, you could host it on a server and call it from there. But due to its size, a lot of cloud providers won't let you run the largest version. I just downgraded to M2M-100 until Meta optimizes it.
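A minimal sketch of the "host it on a server" idea, not from this repo: it assumes the Hugging Face transformers port of NLLB (facebook/nllb-200-distilled-600M) rather than the fairseq checkpoints that translate.sh uses, and the route name and port are arbitrary. The point is that the model is loaded once at startup, so each request only pays for generation.

```python
# Hypothetical translation server: load the model once at startup,
# then serve translations over HTTP so nothing is reloaded per request.
from flask import Flask, request, jsonify
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL = "facebook/nllb-200-distilled-600M"  # smallest public NLLB checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

app = Flask(__name__)

@app.route("/translate", methods=["POST"])
def translate():
    data = request.get_json()
    inputs = tokenizer(data["text"], return_tensors="pt")
    # NLLB selects the target language via the forced BOS token.
    out = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(
            data.get("tgt_lang", "fra_Latn")
        ),
        max_length=512,
    )
    return jsonify(
        {"translation": tokenizer.batch_decode(out, skip_special_tokens=True)[0]}
    )

if __name__ == "__main__":
    app.run(port=5000)
```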

fatjoni commented 2 years ago

I am calling it locally from the server, but the problem is that for each translation I have to call translate.sh, which reloads the model into memory every time.
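If an HTTP server is more than you need, an even simpler pattern is a long-lived process that replaces the per-call translate.sh: load the model once, then translate each line read from stdin. This is a sketch under the same assumptions as above (transformers port of the 600M checkpoint, hard-coded source and target languages).

```python
# Hypothetical long-lived worker: load the model once, then translate
# every line arriving on stdin instead of re-invoking translate.sh.
import sys

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(MODEL, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)
bos = tokenizer.convert_tokens_to_ids("fra_Latn")  # assumed target language

for line in sys.stdin:
    inputs = tokenizer(line.strip(), return_tensors="pt")
    out = model.generate(**inputs, forced_bos_token_id=bos, max_length=512)
    print(tokenizer.batch_decode(out, skip_special_tokens=True)[0], flush=True)
```

With this running, the model-loading cost is paid once per process rather than once per sentence, which is where the 30+ seconds per call were going.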