SefaZeng opened this issue 1 year ago
🐛 Bug

`comet-score` crashes with a segmentation fault after prediction finishes, and the system-level score it reports does not match the score in the officially released WMT22 file.

To Reproduce

```
pip install unbabel-comet
comet-score -s ../mt-metrics-eval-v2/wmt22/sources/en-zh.txt \
    -t ../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt \
    -r ../mt-metrics-eval-v2/wmt22/references/en-zh.refA.txt > log.comet
```

```
Global seed set to 1
Fetching 5 files: 100%|█████████████████████████| 5/5 [00:00<00:00, 88487.43it/s]
Lightning automatically upgraded your loaded checkpoint from v1.8.3.post1 to v1.9.5. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file ../../../../../root/.cache/huggingface/hub/models--Unbabel--wmt22-comet-da/snapshots/371e9839ca4e213dde891b066cf3080f75ec7e72/checkpoints/model.ckpt`
Encoder model frozen.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Predicting DataLoader 0: 100%|████████████████████| 128/128 [01:00<00:00, 2.10it/s]
[1] 13312 segmentation fault  comet-score -s ../mt-metrics-eval-v2/wmt22/sources/en-zh.txt -t -r >
```

And the result in log.comet is like:

```
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 0 score: 0.8275
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 1 score: 0.8833
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 2 score: 0.7753
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 3 score: 0.9103
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 4 score: 0.8103
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 5 score: 0.9792
...
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 2033 score: 0.9494
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 2034 score: 0.9332
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 2035 score: 0.9397
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt Segment 2036 score: 0.9048
../mt-metrics-eval-v2/wmt22/system-outputs/en-zh/HuaweiTSC.txt score: 0.8622
```

The results for wmt22 en-zh HuaweiTSC.txt do not match those in the WMT22 released COMET-22-refA.sys.score: the score in that file is 0.47647532625047895.

Expected behaviour

Output the system score for candidates.

Screenshots

If applicable, add screenshots to help explain your problem.

Environment

- OS: [e.g. iOS, Linux, Win]
- Packaging: pip (py39)
- Version: 2.0.1

Additional context
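For what it's worth, the final line of log.comet appears to be just the plain average of the per-segment scores, so the 0.8622 can be recovered from the log itself. A minimal sketch (the line format is assumed from the log excerpt above; `sample` is a made-up three-line example, not real data):

```python
# Sketch: recompute the system-level score from a comet-score log,
# assuming it is the mean of the per-segment scores.
import re

def system_score(log_lines):
    """Average the per-segment scores found in comet-score output lines."""
    scores = [
        float(m.group(1))
        for line in log_lines
        # Matches lines like "<path> Segment 0 score: 0.8275"
        if (m := re.search(r"Segment \d+ score: ([0-9.]+)", line))
    ]
    return sum(scores) / len(scores)

sample = [
    "sys.txt Segment 0 score: 0.8275",
    "sys.txt Segment 1 score: 0.8833",
    "sys.txt Segment 2 score: 0.7753",
]
print(round(system_score(sample), 4))  # → 0.8287
```

This only reproduces what comet-score already prints; it does not explain the gap to the released 0.47647532625047895, which suggests the released file was produced with a different model or a post-processing step.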