Closed jamie0725 closed 2 months ago
In terms of speed, translating 30k documents of about 300 words each currently takes 10+ hours on a single GPU. Is this expected?
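For reference, the figures above imply the following throughput. This is only back-of-the-envelope arithmetic on the numbers quoted in the post (taking "10+ hours" as exactly 10 hours, so it is an upper bound on the actual speed); the variable names are made up for illustration.

```python
# Throughput implied by "30k documents of ~300 words in 10+ hours".
# Assumption: exactly 10 hours, so the real speed is at most this.
num_docs = 30_000
words_per_doc = 300
hours = 10

total_words = num_docs * words_per_doc   # total words to translate
total_seconds = hours * 3600             # wall-clock time in seconds

words_per_second = total_words / total_seconds
seconds_per_doc = total_seconds / num_docs

print(f"{total_words} words in {total_seconds} s")
print(f"{words_per_second:.0f} words/s, {seconds_per_doc:.1f} s per document")
```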
There was a good answer to that on Stack Overflow: the Helsinki-NLP models were originally trained in Marian and then converted to Hugging Face Transformers. Marian is a specialized tool for MT and is very fast. If you do not need the internals of the models and only need the translations, it should be a better choice.
I have been able to get Transformers working, and it takes 2-3 seconds per paragraph on a new 2022 laptop with 64 GB of RAM and an A2000 GPU.
I am trying the dockerized version of Opus-MT, but so far it gives me incomplete or garbage responses when I run it:
docker build -f Dockerfile.gpu . -t opus-mt-gpu
nvidia-docker run -p 8888:8888 opus-mt-gpu:latest
~/git/Opus-MT$ echo "I am a dog" | ./opusMT-client.py -H 172.17.0.2 -P 10001 -s en -t es
Soy un
I am trying to translate company profiles for 80k stock symbols into European languages. Just venting. I don't want to start the bulk translations until I can get it running at <1 second.
For batch translation it would be better to run directly through the marian-decoder and not through the server/client setup. Also note that the opusMT server/client implementation does not do batching and, therefore, does not really use the full power of a GPU.
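A rough sketch of what calling the decoder directly might look like from Python. The config file name `decoder.yml`, the batch sizes, and the helper names `build_marian_command` and `translate_file` are assumptions for illustration, and the input file is assumed to be already preprocessed the way the model expects (one segmented sentence per line). `--mini-batch`, `--maxi-batch`, `--maxi-batch-sort` and `--devices` are marian-decoder options, but check `marian-decoder --help` for your build.

```python
import shlex
import subprocess


def build_marian_command(config="decoder.yml", mini_batch=64,
                         maxi_batch=100, device=0):
    """Build a marian-decoder command line with batching enabled.

    All default values here are illustrative assumptions, not tuned settings.
    """
    return [
        "marian-decoder",
        "-c", config,                     # model/vocab settings from a config file
        "--mini-batch", str(mini_batch),  # sentences translated together per step
        "--maxi-batch", str(maxi_batch),  # mini-batches preloaded for length sorting
        "--maxi-batch-sort", "src",       # sort preloaded sentences by source length
        "--devices", str(device),         # GPU id
    ]


def translate_file(src_path, out_path, **kwargs):
    """Stream a preprocessed file through marian-decoder (stdin -> stdout)."""
    cmd = build_marian_command(**kwargs)
    with open(src_path, "rb") as src, open(out_path, "wb") as out:
        subprocess.run(cmd, stdin=src, stdout=out, check=True)


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(shlex.join(build_marian_command()))
```

The point of the sketch is that the whole file goes through one decoder process, so the GPU sees full batches instead of one sentence per request.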
Hi,
Firstly, thanks for making your translation models publicly available. It is really helpful for the industry.
I have a question, though, related to this question: if I am going to translate a large amount of text, what is the best way to use your models? Currently I am using the transformers library, but it is pretty slow even on a GPU, which is not satisfactory.
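If staying with the transformers library, one thing that usually helps is translating in batches of similar-length sentences rather than one text at a time. Below is a minimal sketch of that idea, not a benchmarked recipe. The helper names `translate_all` and `make_marian_translator`, the batch size, and the model name `Helsinki-NLP/opus-mt-en-es` are assumptions for illustration, and the texts are assumed to be already split into sentences.

```python
def translate_all(texts, translate_batch, batch_size=32):
    """Translate texts in length-sorted batches and return them in input order.

    translate_batch is any callable mapping a list of strings to a list of
    strings of the same length.
    """
    # Group similar lengths together so each batch needs little padding.
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]))
    results = [""] * len(texts)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        outputs = translate_batch([texts[i] for i in idx])
        for i, out in zip(idx, outputs):
            results[i] = out  # put each translation back at its original position
    return results


def make_marian_translator(model_name="Helsinki-NLP/opus-mt-en-es", device="cuda"):
    """Return a translate_batch callable backed by a Marian model in transformers.

    Imports are local so that translate_all can be used without torch installed.
    """
    import torch
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name).to(device).eval()

    def translate_batch(batch):
        inputs = tokenizer(batch, return_tensors="pt",
                           padding=True, truncation=True).to(device)
        with torch.no_grad():
            generated = model.generate(**inputs)
        return tokenizer.batch_decode(generated, skip_special_tokens=True)

    return translate_batch
```

Usage would be something like `translate_all(sentences, make_marian_translator())`. A suitable batch size depends on GPU memory and sentence length, so treat 32 as a starting point only.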