Closed GabrielLin closed 2 years ago
Thank you for your work. Could you please tell us the corresponding description for each value range? For example, [-1, -3]: similar; [-3, -6]: normal; [-6, -100]: not similar, or something along those lines. Thanks.
Hi, we don't set clear boundaries for similarity and dissimilarity.
Hi @yyy-Apple, thank you for your reply. That is exactly my concern: I used BARTScore and got a result, but I do not know what the score stands for. Could you please give some examples if possible?
Hi @GabrielLin , I unfortunately don't think this is possible, as the good and bad values of BARTScore will depend on the experimental setting. However, this is a common problem with evaluation metrics, not just with BARTScore. For example, BLEU doesn't really have a consistent "good" or "bad" value either; a translation system with a BLEU score of 20 could be either very bad or quite good depending on the evaluation dataset. So you should probably look at the outputs that you're evaluating and form an idea of what values are good or bad for your particular dataset.
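For anyone wanting to do this kind of calibration on their own data, below is a minimal sketch using the BARTScorer class from bart_score.py in this repo with the facebook/bart-large-cnn checkpoint; the sentence pairs are made up purely for illustration.

```python
# Minimal sketch: score a few hand-picked similar/dissimilar pairs from your
# own dataset to get a feel for what BARTScore values mean in your setting.
# Assumes bart_score.py from this repo is on the Python path.
from bart_score import BARTScorer

scorer = BARTScorer(device='cuda:0', checkpoint='facebook/bart-large-cnn')

srcs = ["The cat sat on the mat.",
        "The cat sat on the mat."]
tgts = ["A cat was sitting on the mat.",      # clearly similar
        "Stock prices fell sharply today."]   # clearly dissimilar

# score() returns average log-probabilities, so values are negative;
# higher (closer to 0) means the target is more likely given the source.
scores = scorer.score(srcs, tgts, batch_size=2)
for tgt, s in zip(tgts, scores):
    print(f"{s:.2f}\t{tgt}")
```

Comparing the scores of a few pairs you already judge as good or bad gives you a rough reference range for your dataset, in the same spirit as calibrating BLEU against known systems.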
Hi @neubig, you are right. The reason I asked this question is that I am not familiar with BARTScore. I agree with you: as a metric, BARTScore should be treated the same way as other metrics like BLEU. I will work through some examples myself. Thank you.