[QUESTION] Keep getting scores of '0' no matter what input used

Unbabel / COMET

A Neural Framework for MT Evaluation

https://unbabel.github.io/COMET/html/index.html

Apache License 2.0


[QUESTION] Keep getting scores of '0' no matter what input used #190

Closed Brecony76 closed 6 months ago

Brecony76 commented 10 months ago

What is your question?

I keep getting scores of 0 no matter what input I give the model.

Code

```python
from comet import download_model, load_from_checkpoint

# Download the checkpoint (cached after the first call) and load it.
model_path = download_model("Unbabel/wmt22-comet-da")
model = load_from_checkpoint(model_path)

# Each sample has a source ("src"), a machine translation ("mt"), and a reference ("ref").
data = [
    {
        "src": "10 到 15 分钟可以送到吗",
        "mt": "Can I receive my food in 10 to 15 minutes?",
        "ref": "Can it be delivered between 10 to 15 minutes?",
    },
    {
        "src": "Pode ser entregue dentro de 10 a 15 minutos?",
        "mt": "Can you send it for 10 to 15 minutes?",
        "ref": "Can it be delivered between 10 to 15 minutes?",
    },
]

if __name__ == "__main__":
    model_output = model.predict(data, batch_size=8, gpus=1)
    print(model_output)
    print(model_output["scores"])  # sentence-level scores
    print(model_output["system_score"])  # system-level score
```

Output:

Prediction([('scores', [0.0, 0.0]), ('system_score', 0.0)])
[0.0, 0.0]
0.0

What's your environment?

OS: Windows 10
Packaging Pip 23.3.1
Version Comet 2.2.0

ricardorei commented 9 months ago

Hey @Brecony76. I am not able to replicate this error. I just tried it and I get the following scores:

Prediction([('scores', [0.8417137265205383, 0.7745385766029358]), ('system_score', 0.8081261515617371)])
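For anyone scripting evaluations while this is unresolved, the failing and the healthy outputs quoted above can be told apart mechanically. The following is only a minimal sketch: `looks_degenerate` is a hypothetical helper, not part of COMET, and it assumes that an exact 0.0 for every segment and for the system score is never a legitimate result for this model.

```python
def looks_degenerate(scores, system_score):
    """Return True when every sentence score and the system score are exactly 0.0.

    Assumption: a trained regression metric does not output exactly 0.0 for
    every segment, so this pattern indicates the failure discussed here.
    """
    return len(scores) > 0 and all(s == 0.0 for s in scores) and system_score == 0.0


# Values copied from the two outputs quoted in this thread.
failing = {"scores": [0.0, 0.0], "system_score": 0.0}
healthy = {
    "scores": [0.8417137265205383, 0.7745385766029358],
    "system_score": 0.8081261515617371,
}

print(looks_degenerate(failing["scores"], failing["system_score"]))  # True
print(looks_degenerate(healthy["scores"], healthy["system_score"]))  # False
```

With a real run, the same check could be applied to `model_output["scores"]` and `model_output["system_score"]` before the numbers are written anywhere.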

clang88 commented 8 months ago

Hi @Brecony76 I'm observing the same issue.

OS: Windows 10
unbabel-comet 2.2.1
pip 23.3.1
Python 3.10.13
torch 2.1.2+cu121
GeForce MX250 (Driver Version: 537.79 CUDA Version: 12.2) (Yeah... it's my work laptop)

The behavior is particularly odd because it sometimes does return a score, with no change in code or data. I have not found a way to reproduce either the 0.0 scores or the proper scores: sometimes it just works, sometimes it doesn't. I will retest this tomorrow to see if I can make any sense of it. For now I have completed my task of evaluating some translations with COMET (thanks to the devs and researchers for making this so intuitive!)

BramVanroy commented 7 months ago

I can confirm that this issue exists on Windows. It might be related to this CUDA warning:

[W CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

But I am not sure and do not have time to dig into this deeper. It is a shame though, as this makes COMET unfortunately unreliable on Windows.

BramVanroy commented 7 months ago

I've done some digging but haven't found a solution, although I have pinpointed the place in the PyTorch Lightning (PL) Trainer where something goes wrong. The model weights are set to zero, but I do not know why.
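One way to observe the zeroed weights yourself is to list the parameters that contain no non-zero entries, once before and once after calling `predict`. This is a rough diagnostic sketch, not the method used to find the problem: `find_zeroed_parameters` is a hypothetical helper, and it only assumes that each tensor offers `count_nonzero()`, which `torch.Tensor` does.

```python
def find_zeroed_parameters(named_params):
    """Return the names of parameters whose entries are all exactly zero.

    `named_params` is any iterable of (name, tensor) pairs, for example
    `model.named_parameters()` on a torch.nn.Module. Each tensor only needs
    a `count_nonzero()` method whose result can be converted with int().
    """
    zeroed = []
    for name, param in named_params:
        # A parameter with zero non-zero entries has been wiped (or was never set).
        if int(param.count_nonzero()) == 0:
            zeroed.append(name)
    return zeroed
```

Calling `find_zeroed_parameters(model.named_parameters())` right after loading the checkpoint and again after `model.predict(...)` should show whether the list grows during prediction. A few parameters may legitimately be all zero, so the comparison between the two calls matters more than either list alone.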

To help raise its priority, feel free to comment on the issue that I opened over at PyTorch Lightning to indicate that you are also experiencing this problem. https://github.com/Lightning-AI/pytorch-lightning/issues/19537

awaelchli commented 7 months ago

I left a reply in https://github.com/Lightning-AI/pytorch-lightning/issues/19537#issuecomment-1974787881 with a suggestion. I hope it provides some useful insights.