Unbabel / COMET

A Neural Framework for MT Evaluation
https://unbabel.github.io/COMET/html/index.html
Apache License 2.0

[QUESTION] How to finetune `wmt22-comet-da` and have results scaled to 0-1 range #131

Closed juliafalcao closed 1 year ago

juliafalcao commented 1 year ago

I am trying to fine-tune the wmt22-comet-da model on new DA data, using its checkpoint and the configs available in the configs/ folder here. I know that this model scales the result scores between 0 and 1, but after fine-tuning, the new fine-tuned model generates negative scores as well, similarly to the older COMET models.

My main question is, where exactly is the configuration I should change for the model to scale the results like wmt22-comet-da does? I haven't been able to find that in the code. My configs are exactly the same as those in configs/ except for filepaths and the checkpoint to load.

ricardorei commented 1 year ago

Yep, Julia, you are right. You first need to rescale the data to be between 0 and 1.

The rescaling I used was actually borrowed from BLEURT. It works as follows:

1) Find a reasonable "min value". Take all segments with more than 1 annotator where all annotators agreed that the score was 0. Your "min_value" is the average z-score of those segments.
2) Find a reasonable "max value". Take all segments with more than 1 annotator where all annotators agreed that the translation is perfect (score of 100). Your "max_value" is the average z-score of those segments.
3) Apply a min-max scaler to your data using those two values, and truncate every score above 1 or below 0.
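A minimal sketch of the three steps in plain Python, for illustration only. It is not taken from the COMET or BLEURT code. The segment layout (a `raw_scores` list with one raw DA score per annotator, plus a `z_score` field holding the segment's z-normalized score) and the helper names `find_anchor` and `rescale` are assumptions made for this example, and the numbers are toy data.

```python
from statistics import mean


def find_anchor(segments, target):
    """Average z-score of segments where more than one annotator
    scored the segment and every annotator gave exactly `target`."""
    agreed = [
        seg["z_score"]
        for seg in segments
        if len(seg["raw_scores"]) > 1
        and all(raw == target for raw in seg["raw_scores"])
    ]
    if not agreed:
        raise ValueError(f"no multi-annotator segments agree on {target}")
    return mean(agreed)


def rescale(segments, raw_min=0, raw_max=100):
    """Min-max scale z-scores using the two anchors, then clip to [0, 1]."""
    min_value = find_anchor(segments, raw_min)  # step 1
    max_value = find_anchor(segments, raw_max)  # step 2
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    span = max_value - min_value
    # Step 3: scale, then truncate anything outside [0, 1].
    return [
        min(1.0, max(0.0, (seg["z_score"] - min_value) / span))
        for seg in segments
    ]


# Toy data (hypothetical field names and values).
segments = [
    {"raw_scores": [0, 0], "z_score": -2.0},
    {"raw_scores": [0, 0, 0], "z_score": -1.0},
    {"raw_scores": [100, 100], "z_score": 1.0},
    {"raw_scores": [100, 100], "z_score": 1.5},
    {"raw_scores": [50, 70], "z_score": 0.0},
    {"raw_scores": [100], "z_score": 2.0},   # single annotator: not an anchor
    {"raw_scores": [10, 0], "z_score": -3.0},  # no agreement: not an anchor
]

scaled = rescale(segments)
```

With this toy data the anchors come out as `min_value = -1.5` and `max_value = 1.25`, so any z-score outside that interval is clipped to 0 or 1. The rescaled values would then replace the original scores in the training data before fine-tuning.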

This rescaling won't impact your model correlations and your model will output scores in that range.