Rain9876 / Unsupervised-crosslingual-Compound-Method-For-MT


SentSim metric #1

Closed joannekim0420 closed 2 years ago

joannekim0420 commented 3 years ago

Hi, I read the thesis you wrote and followed the code in your GitHub repo, but I have a few questions about the new metric you proposed. The paper says your final SentSim equation is SentSim(A, B) = Σ_{n=1}^{∞} (w1·A^n + w2·B^n)/n!, where n starts at 1, but your GitHub code computes exp(A) + exp(B). Which one should I follow? Also, whichever equation I use, from your code or your paper, the SentSim score doesn't match what I expect: the score range exceeds 1, which I thought was the maximum output. To confirm I wasn't doing anything wrong, I also recalculated scores from your paper, such as example 1 from Table 6, where BERTScore, SSS, and SentSim are 0.954, 0.778, and 0.821 respectively. The 1.22482 I get with the equation from your paper doesn't match 0.821. Could you tell me what the problem is here and how to fix it?

Rain9876 commented 3 years ago

Hi,

For your questions

Q1: You should use exp(A) + exp(B), which is equivalent to SentSim(A, B) = Σ_{n=1}^{∞} (w1·A^n + w2·B^n)/n!.
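With equal weights w1 = w2 = 1, the series sums in closed form to (exp(A) − 1) + (exp(B) − 1), which agrees with exp(A) + exp(B) up to an additive constant, so rankings are unchanged. A quick numerical sketch (the equal weights and truncation depth here are illustrative choices, not taken from the paper):

```python
import math

def sentsim_series(a, b, w1=1.0, w2=1.0, terms=30):
    """Partial sum of the paper's series: sum_{n>=1} (w1*a^n + w2*b^n) / n!."""
    return sum((w1 * a**n + w2 * b**n) / math.factorial(n)
               for n in range(1, terms + 1))

a, b = 0.954, 0.778  # BERTScore and SSS from Table 6, example 1
series = sentsim_series(a, b)
closed_form = (math.exp(a) - 1) + (math.exp(b) - 1)
print(series, closed_form)  # both equal exp(a) + exp(b) - 2
```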

Q2: The score range could exceed 1 if you don't normalize.

Q3: Both A and B are normalized over the dataset, which means you need to give a specific dataset before calculation, or you need to specify the ranges of BERTScore and SSS to normalize against. Although BERTScore is bounded in [0, 1], its values typically sit much closer to 1 than to 0, so it needs to be normalized before the combination; the same holds for SSS. The score will differ slightly depending on the range you give BERTScore or on the dataset, but the final correlation with human scores is better.
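The dataset-level normalization described above can be sketched with a simple min-max rescale. Note this is one common choice and the sample scores are made up for illustration; the paper's exact normalization scheme may differ:

```python
import math

def min_max(scores):
    """Min-max normalize raw metric scores over a dataset into [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

# Hypothetical raw scores over a small dataset; BERTScore clusters near 1.
bert_scores = [0.91, 0.954, 0.97, 0.88]
sss_scores  = [0.60, 0.778, 0.81, 0.55]

bert_n = min_max(bert_scores)
sss_n  = min_max(sss_scores)

# Combine the normalized scores as in the repository: exp(A) + exp(B).
sentsim = [math.exp(a) + math.exp(b) for a, b in zip(bert_n, sss_n)]
print(sentsim)
```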

Hope these explanations answer your question.

Yurun


joannekim0420 commented 3 years ago

Thanks for the quick reply. I just have one more question. In this example from your paper, 0.954, 0.778, 0.821 (BERTScore, SSS, SentSim), are 0.954 and 0.778 the outputs of BERTScore and SSS before normalization, and 0.821 the output of the combination after normalization?

Rain9876 commented 3 years ago

I remember all of them are normalized, but I can't remember whether it relies on the de-en or the zh-en distribution, so I'm not sure.

But even if you don't use normalization, you can still see from the un-normalized BERTScore F1 that BERTScore is not sensitive to negation, which is what the example is meant to show.

Yurun


joannekim0420 commented 3 years ago

Hi, after another experiment I still find it difficult to get scores in the range [0, 1]. I normalized my BERTScore and SSS outputs before feeding them into the combination metric, but the combination's output falls roughly in the range 2.5–5.2, which is understandable because it uses an exponential. For example, one pair of my normalized scores for BERTScore and SSS was 0.60979676 and 0.14238505349698224, and calculating SentSim = exp(A) + exp(B) with these numbers using your code gives 2.9930779107079846. I don't see how feeding numbers in [0, 1] into an exponential could produce an output in [0, 1] as well. So it made me wonder: should the SentSim result also be normalized? Or are you supposed to use a negative exponential, as in the papers you cite (Kilickaya et al., 2017; Clark et al., 2019)? To me, a negative exponential seems more appropriate for getting outputs in the range [0, 1].
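To make the bounded-output question concrete, here is a sketch of two ways to map the exp-combination back into [0, 1]. Neither is taken from the paper; the affine rescale and the negative-exponential form are just the suggestions in this comment made explicit:

```python
import math

# Normalized BERTScore and SSS from the example above.
a, b = 0.60979676, 0.14238505349698224

raw = math.exp(a) + math.exp(b)  # ~2.993, outside [0, 1]

# With a, b in [0, 1], exp(a) + exp(b) lies in [2, 2e], so an
# affine rescale maps it into [0, 1] while preserving the ordering.
rescaled = (raw - 2) / (2 * math.e - 2)

# The negative-exponential alternative is bounded in (0, 1] by construction:
# it decays from 1 (perfect scores) as a and b drop below 1.
neg_exp = math.exp(-((1 - a) + (1 - b)))

print(raw, rescaled, neg_exp)
```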