Closed juliafalcao closed 1 year ago
Yep, this is a bit confusing. The name choice was not the best... The `wmt22-comet-da` model is the DA model we used in that ensemble. The multitask model is actually not available atm.
We are working on a new and improved metric that will closely follow the multitask model described in the paper, and I hope to release it soon.
To give a bit more context: that metric seemed to work well for very high-quality MT, especially for `zh-en`, `en-de`, and `en-ru`, but it was blind to really bad-quality MT, and correlations were not good outside those language pairs.
Thank you for clarifying!
Rei et al. 2022 proposed "COMET-22" as a new model which is "an ensemble between a COMET estimator model trained with DA and a newly proposed multitask model trained to predict sentence-level scores along with OK/BAD word-level tags derived from MQM error annotations." The COMET documentation, however, lists `wmt22-comet-da` as the current default model and says it was trained only on DA data from WMT 2017-2020, which matches what is listed in its `hparams.yaml` file.

So I just wanted to clarify, regarding `wmt22-comet-da`, the version that is available on HuggingFace Hub: is it a regular COMET-DA model trained only on WMT17-20 DA data, or was it fine-tuned on MQM scores as well?