csebuetnlp / xl-sum

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
https://aclanthology.org/2021.findings-acl.413/

How to calculate ROUGE scores for all languages in the XL-Sum dataset? #12

Closed lixnvege closed 1 year ago

lixnvege commented 1 year ago

Hi,

I am writing to seek clarification regarding the calculation of Rouge scores for various languages in the XL-Sum dataset. While reviewing the provided toolkit, I noticed that it appears to only support a limited list of languages. However, I also observed that Rouge scores have been reported for all languages, which has led to some confusion on my part.

Could you kindly provide me with further details on how the Rouge scores were computed for languages not supported by the toolkit?

For example, for Igbo the code looks like this:

calculate_rouge(
    ["Akụkọ kachasị n'abalị: Obasanjo akpọpụtala Buhari ọzọ", "Nwoke agbaala nwaanyị na ezinaụlọ ya ọkụ maka ịjụ ya enyi"],
    ["Akụkọ ndị kachasị mkpa mere taa n'ichafu:", "Akụkọ sepụtara isi n'ụtụtụ a"],
    rouge_lang="igbo",
)

and the output is:

UserWarning: -----unknown stemmer language-> igbo-----
  warnings.warn(
{'rouge1': 18.75, 'rouge2': 0.0, 'rougeL': 18.75, 'rougeLsum': 18.75}

Did I misunderstand something?

I really appreciate any clarification or guidance.

abhik1505040 commented 1 year ago

Hi @lixnvege,

For the languages without a corresponding stemmer implementation, we simply calculated the ROUGE scores without any stemming (i.e., on the raw tokenized n-grams).
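
For readers who want to see what "ROUGE without stemming" amounts to, here is a minimal sketch of ROUGE-N F1 computed directly on raw tokens. It is not the scoring code from this repository: the function names `ngrams` and `rouge_n_f1` are hypothetical, and lowercased whitespace splitting is assumed as a stand-in for the toolkit's tokenizer, so the numbers it produces will not necessarily match the toolkit's output.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, prediction, n=1):
    """ROUGE-N F1 on raw tokens, with no stemming applied.

    Assumption: lowercased whitespace tokenization stands in for the
    real tokenizer, which may split punctuation differently.
    """
    ref = ngrams(reference.lower().split(), n)
    pred = ngrams(prediction.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & pred).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Igbo example pair taken from the question above; no stemmer is involved.
    a = "Akụkọ kachasị n'abalị: Obasanjo akpọpụtala Buhari ọzọ"
    b = "Akụkọ ndị kachasị mkpa mere taa n'ichafu:"
    print(round(100 * rouge_n_f1(a, b, n=1), 2))
```

Because no stemmer is applied, inflected forms of the same word count as different tokens, so unstemmed scores are not directly comparable with stemmed scores reported for other languages.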

lixnvege commented 1 year ago

@abhik1505040 Thank you!

abhik1505040 commented 1 year ago

Closing the issue. Please feel free to reopen if you have any further questions.