marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io

CPU batched translation performance? #349

Open BigBorg opened 3 years ago

BigBorg commented 3 years ago

I have deployed a model with mini-batch: 2, maxi-batch: 2 and cpu-threads: 8. In a performance test, I observed that translating one 60-token sentence took 1.7 seconds, while translating 2 sentences took 2.4 seconds. Each additional sentence added about 0.1 seconds. Why is translating 2 sentences so much slower than translating 1? How are sentences distributed across the CPU cores? Is it reasonable to expect a batch to take about as long as translating the longest sentence in it?
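To make the reported numbers easier to compare, here is a minimal sketch that encodes them as a piecewise function. It is only a fit to the three observations in this post and not a model of how Marian schedules work. It assumes the cost of about 0.1 seconds per extra sentence stays linear beyond two sentences and that all sentences are of similar length. The function names are hypothetical.

```python
def observed_latency(n_sentences: int) -> float:
    """Latency in seconds, fitted to the timings reported in this post.

    Assumption: the ~0.1 s cost per extra sentence stays linear
    beyond two sentences; this was only observed, not derived.
    """
    if n_sentences < 1:
        raise ValueError("need at least one sentence")
    if n_sentences == 1:
        return 1.7  # one 60-token sentence
    # two sentences took 2.4 s, each further one about 0.1 s more
    return 2.4 + 0.1 * (n_sentences - 2)


def per_sentence_cost(n_sentences: int) -> float:
    """Average seconds per sentence at a given request size."""
    return observed_latency(n_sentences) / n_sentences


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(observed_latency(n), 2), round(per_sentence_cost(n), 3))
```

Under these assumptions, the jump from 1 to 2 sentences (0.7 seconds) is about seven times the cost of each later sentence, which is the discrepancy the question is about.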