Unbabel / COMET

A Neural Framework for MT Evaluation
https://unbabel.github.io/COMET/html/index.html
Apache License 2.0

`comet-compare` is not working #109

Closed foksly closed 1 year ago

foksly commented 1 year ago

🐛 Bug

`comet-compare` fails on the main branch when comparing the translations of two models.

To Reproduce

  1. The COMET model method `is_referenceless`, which is called in comet/cli/compare.py, is not defined.
  2. After fixing the bug above, a second error appears:
    wmt20-comet-da is already in cache.
    Global seed set to 12
    Some weights of the model checkpoint at xlm-roberta-large were not used when initializing XLMRobertaModel: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.layer_norm.bias']
    - This IS expected if you are initializing XLMRobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
    - This IS NOT expected if you are initializing XLMRobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Encoder model frozen.
    Some weights of the model checkpoint at xlm-roberta-large were not used when initializing XLMRobertaModel: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.layer_norm.bias']
    - This IS expected if you are initializing XLMRobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
    - This IS NOT expected if you are initializing XLMRobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Encoder model frozen.
    GPU available: True (cuda), used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [3]
    Predicting DataLoader 0: 100%|████████████████| 176/176 [00:40<00:00,  4.39it/s]
    GPU available: True (cuda), used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [3]
    Predicting DataLoader 0: 100%|████████████████| 176/176 [00:40<00:00,  4.31it/s]
    Traceback (most recent call last):
    File "/home/foksly/.pyenv/versions/3.8.7/envs/ml-metrics-server/bin/comet-compare", line 6, in <module>
    ...
    sys.exit(compare_command())
    File "/home/foksly/metrics_service/COMET/comet/cli/compare.py", line 455, in compare_command
    population_size = seg_scores.shape[1]
    IndexError: tuple index out of range

    The shape of seg_scores is (0,)
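
The `IndexError` in the traceback follows directly from the reported shape: a shape tuple of `(0,)` has only one entry, so index 1 does not exist. Below is a minimal sketch of that failure and of a defensive check, using a plain tuple in place of the real array's `.shape`. The helper name `population_size_from_shape` is hypothetical and not part of COMET, and the two-dimensional layout it expects is an assumption based on the `shape[1]` access in the traceback.

```python
def population_size_from_shape(shape):
    """Hypothetical helper (not COMET code): return the second dimension
    of a 2-D shape tuple, failing with a descriptive error otherwise."""
    if len(shape) != 2:
        raise ValueError(f"expected 2-D segment scores, got shape {shape}")
    return shape[1]


# The shape reported in this issue has a single entry, so index 1 is missing.
reported_shape = (0,)
try:
    reported_shape[1]
except IndexError as err:
    print(f"IndexError: {err}")  # prints "IndexError: tuple index out of range"

# A 2-D shape (assumed layout) yields its second dimension.
print(population_size_from_shape((2, 5)))  # prints 5
```

A check like this would not fix the underlying cause, which is that no segment scores were collected, but it would turn the opaque `IndexError` into a message that points at the empty input.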

Expected behaviour

I expect a comparison result after running `comet-compare -s {src} -t {baseline} {experiment} -r {ref}`.

Environment

OS: Linux

COMET was installed by cloning this repo and running poetry install.

Additional context

ricardorei commented 1 year ago

I'll investigate it.

In the meantime, you can use version 1.1.3, which is stable; the command should work there.

ricardorei commented 1 year ago

This is now fixed, and we are planning a new release, v2.0.

YZhou0413 commented 1 year ago

Hello Ricardo, I'm currently having the same problem with comet-compare in COMET 2.0.0 on Windows:

    Traceback (most recent call last):
      File "c:\users\username\appdata\local\programs\python\python38\lib\runpy.py", line 192, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "c:\users\username\appdata\local\programs\python\python38\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "C:\Users\username\AppData\Local\Programs\Python\Python38\Scripts\comet-compare.exe\__main__.py", line 7, in <module>
      File "c:\users\username\appdata\local\programs\python\python38\lib\site-packages\comet\cli\compare.py", line 461, in compare_command
        population_size = seg_scores.shape[1]
    IndexError: tuple index out of range