Unbabel / COMET

A Neural Framework for MT Evaluation
https://unbabel.github.io/COMET/html/index.html
Apache License 2.0

Segment order incorrect when running on >1 GPU #101

Closed thompsonb closed 1 year ago

thompsonb commented 1 year ago

🐛 Bug

Segment order is incorrect when using more than one GPU.

To Reproduce

comet-score -s src.de -t junk.en -r ref.en --model wmt21-comet-mqm --gpus 0

junk.en Segment 0 score: 0.0036
junk.en Segment 1 score: -0.0050
junk.en Segment 2 score: 0.0029
junk.en Segment 3 score: 0.0426
junk.en Segment 4 score: 0.0481
junk.en Segment 5 score: 0.0511
junk.en Segment 6 score: 0.0501
junk.en Segment 7 score: 0.0325
junk.en Segment 8 score: 0.0434
junk.en Segment 9 score: 0.0652
junk.en Segment 10 score: 0.0538
junk.en Segment 11 score: 0.0465
junk.en Segment 12 score: 0.0490
junk.en Segment 13 score: 0.0564
junk.en Segment 14 score: 0.0448
junk.en Segment 15 score: 0.0564
junk.en Segment 16 score: 0.0549
junk.en Segment 17 score: 0.0461
junk.en Segment 18 score: 0.0577
junk.en Segment 19 score: 0.0573
junk.en score: 0.0429

comet-score -s src.de -t junk.en -r ref.en --model wmt21-comet-mqm --gpus 4

junk.en Segment 0 score: 0.0036
junk.en Segment 1 score: 0.0481
junk.en Segment 2 score: 0.0434
junk.en Segment 3 score: 0.0490
junk.en Segment 4 score: 0.0549
junk.en Segment 5 score: -0.0049
junk.en Segment 6 score: 0.0511
junk.en Segment 7 score: 0.0650
junk.en Segment 8 score: 0.0563
junk.en Segment 9 score: 0.0460
junk.en Segment 10 score: 0.0029
junk.en Segment 11 score: 0.0501
junk.en Segment 12 score: 0.0538
junk.en Segment 13 score: 0.0449
junk.en Segment 14 score: 0.0577
junk.en Segment 15 score: 0.0426
junk.en Segment 16 score: 0.0325
junk.en Segment 17 score: 0.0465
junk.en Segment 18 score: 0.0564
junk.en Segment 19 score: 0.0573
junk.en score: 0.0429

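In this case, the score for segment 0 stays at position 0, the score for segment 1 moves to position 5, the score for segment 2 moves to position 10, etc. The system-level score (0.0429) is the same in both runs; only the order of the segment-level scores changes.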
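The pattern in the two outputs above matches what you would get if each GPU took every 4th segment and the per-GPU results were then concatenated in rank order. That sharding scheme is my assumption from the numbers, not something I have confirmed in the COMET code. Here is a minimal sketch of that permutation and of a workaround that undoes it. The function names `sharded_order` and `restore_order` are hypothetical helpers, not part of COMET, and the sketch assumes the number of segments divides evenly across GPUs (no padding).

```python
def sharded_order(num_segments, num_gpus):
    """Original segment indices in the order they would be printed,
    assuming rank r scores segments r, r + num_gpus, r + 2 * num_gpus, ...
    and the per-rank outputs are concatenated by rank."""
    return [
        index
        for rank in range(num_gpus)
        for index in range(rank, num_segments, num_gpus)
    ]


def restore_order(scores, num_gpus):
    """Put scores printed in sharded order back into original segment order."""
    order = sharded_order(len(scores), num_gpus)
    restored = [None] * len(scores)
    for position, original_index in enumerate(order):
        restored[original_index] = scores[position]
    return restored


if __name__ == "__main__":
    order = sharded_order(20, 4)
    # Position at which each original segment ends up in the 4-GPU output.
    for original_index in (0, 1, 2):
        print(original_index, "->", order.index(original_index))
```

For 20 segments and 4 GPUs this gives 0 -> 0, 1 -> 5 and 2 -> 10, as in the report. Applying `restore_order` to the 4-GPU scores recovers the single-GPU order, up to the small numerical differences visible in the fourth decimal place.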

See also: https://github.com/amazon-science/doc-mt-metrics/issues/8

ricardorei commented 1 year ago

Thanks @thompsonb

I have to refactor the code to use a CustomWriter from PyTorch Lightning.

This was not supported a few PyTorch Lightning versions ago.

I'll try to work on it after EMNLP.

ricardorei commented 1 year ago

I made some significant changes to multi-GPU inference. The segment order issue is solved, and multi-GPU inference is also faster now.

This will be included in the next release.