tingofurro / summac

Codebase, data and models for the SummaC paper in TACL
https://arxiv.org/abs/2111.09525
Apache License 2.0

How to interpret negative scores produced by SummaCzs #7

Open chris-opendata opened 1 year ago

chris-opendata commented 1 year ago

I got negative scores from the SummaCzs model when following the example usage with my own data. For example, the scores look like this:

        "scores": {
            "for_reference_summary": [
                -0.193,
                -0.746
            ],
            "for_generated_summary": [
                -0.259,
                -0.402
            ]
        },

Is it normal to have negative scores? How can I interpret the negative scores if so?

Thanks

tingofurro commented 1 year ago

Hello @chris-opendata,

Yes. In the default setting, SummaC-ZS scores lie in the range [-1, 1], because they are calculated as:

P(entail) - P(contradict)

(so the score is negative whenever the contradiction probability exceeds the entailment probability). Optionally, you can set SummaCZS([...], use_con=False), which removes the second component in the equation above, so the score is never negative.
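To make the arithmetic concrete, here is a minimal sketch of the formula above. It is not the library's code: `zs_score` is a hypothetical helper written for illustration, and the probabilities are made up rather than produced by a model.

```python
def zs_score(p_entail: float, p_contradict: float, use_con: bool = True) -> float:
    """Hypothetical helper mirroring the formula in this thread.

    With use_con=True the score is P(entail) - P(contradict), which lies in [-1, 1].
    With use_con=False only P(entail) is kept, which lies in [0, 1].
    """
    if use_con:
        return p_entail - p_contradict
    return p_entail


# Made-up probabilities, for illustration only.
mostly_contradicted = zs_score(p_entail=0.1, p_contradict=0.8)   # negative
mostly_entailed = zs_score(p_entail=0.9, p_contradict=0.05)      # positive
entail_only = zs_score(p_entail=0.1, p_contradict=0.8, use_con=False)

print(mostly_contradicted, mostly_entailed, entail_only)
```

Read this way, a negative score means the NLI model found more evidence of contradiction than of entailment, and a positive score means the opposite.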

Experiments in the paper confirmed that results improve slightly when using the difference rather than either component on its own, which is why we chose this setting as the default.

I hope this helps,

Philippe

chris-opendata commented 1 year ago

Hi Tingofurro,

Thank you for your quick reply and clarification. I will do as you suggested.

Chris