pltrdy / files2rouge

Calculating ROUGE score between two files (line-by-line)
MIT License

Does not work with non-English words (Russian) #32

Closed DenisOgr closed 5 years ago

DenisOgr commented 5 years ago

This library does not work with Russian words. I tried it with identical reference and candidate texts:

я иду на работу

And I got these logs:

root@00046ff67f96:/etc/rouge# ./run  ./data/summary ./data/reference  -a "-c 96 -v  -n 2 -a"
Preparing documents... 0 line(s) ignored
Running ROUGE...
@Eval (1)
***P /tmp/tmpdgp6q634/system/s.0.txt

_cn_|0

_cn_|0
***M /tmp/tmpdgp6q634/model/m.A.0.txt
total 1-gram model count: 0
total 1-gram peer count: 0
total 1-gram hit: 0
total ROUGE-1-R: 0.00000
total ROUGE-1-P: 0.00000
total ROUGE-1-F: 0.00000
1.1
---------------------------------------------
1 ROUGE-1 Average_R: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-1 Average_P: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-1 Average_F: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
@Eval (1)
***P /tmp/tmpdgp6q634/system/s.0.txt

_cn_|0

_cn_|0
***M /tmp/tmpdgp6q634/model/m.A.0.txt
total 2-gram model count: 0
total 2-gram peer count: 0
total 2-gram hit: 0
total ROUGE-2-R: 0.00000
total ROUGE-2-P: 0.00000
total ROUGE-2-F: 0.00000
1.1
---------------------------------------------
1 ROUGE-2 Average_R: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-2 Average_P: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-2 Average_F: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
***P /tmp/tmpdgp6q634/system/s.0.txt

0:
LCS: -
***M /tmp/tmpdgp6q634/model/m.A.0.txt

0:
total ROUGE-L model count: 0
total ROUGE-L peer count: 0
total ROUGE-L hit: 0
total ROUGE-L-R score: 0.00000
total ROUGE-L-P: 0.00000
total ROUGE-L-F: 0.00000
1.1
---------------------------------------------
1 ROUGE-L Average_R: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-L Average_P: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-L Average_F: 0.00000 (96%-conf.int. 0.00000 - 0.00000)

Is this a known issue, or did I do something wrong?

pltrdy commented 5 years ago

Hi,

I have never actually experimented with non-Latin characters. We should run the ROUGE command directly to see whether the problem comes from ROUGE-1.5.5 or from the files2rouge wrapper.

DenisOgr commented 5 years ago

@pltrdy I checked it (using the logs from pyrouge.Rouge155), and I think the bug is in the Perl script (the generated files look normal):

root@1d0ed63a3429:/etc/rouge# cat ./data/reference
я иду на работу. я пришел на работу.
root@1d0ed63a3429:/etc/rouge# clear
root@1d0ed63a3429:/etc/rouge# cat ./data/reference
я иду на работу. я пришел на работу.
root@1d0ed63a3429:/etc/rouge# cat ./data/summary
я иду на работу. я пришел на работу.
root@1d0ed63a3429:/etc/rouge# ./run --ignore_empty  ./data/summary ./data/reference  -a "-c 96  -n 2 -a"
Preparing documents... 0 line(s) ignored
Running ROUGE...
10
2019-10-21 13:46:10,696 [MainThread  ] [INFO ]  Set ROUGE home directory to /root/.files2rouge.
2019-10-21 13:46:10,696 [MainThread  ] [INFO ]  Writing summaries.
2019-10-21 13:46:10,697 [MainThread  ] [INFO ]  Processing summaries. Saving system files to /tmp/tmp0qppezqq/system and model files to /tmp/tmp0qppezqq/model.
2019-10-21 13:46:10,697 [MainThread  ] [INFO ]  Processing files in /tmp/tmppprkkohf/system.
2019-10-21 13:46:10,697 [MainThread  ] [INFO ]  Processing s.0.txt.
2019-10-21 13:46:10,698 [MainThread  ] [INFO ]  Saved processed files to /tmp/tmp0qppezqq/system.
2019-10-21 13:46:10,698 [MainThread  ] [INFO ]  Processing files in /tmp/tmppprkkohf/model.
2019-10-21 13:46:10,698 [MainThread  ] [INFO ]  Processing m.A.0.txt.
2019-10-21 13:46:10,699 [MainThread  ] [INFO ]  Saved processed files to /tmp/tmp0qppezqq/model.
2019-10-21 13:46:10,700 [MainThread  ] [INFO ]  Written ROUGE configuration to /tmp/tmpdvvnebb9/rouge_conf.xml
2019-10-21 13:46:10,700 [MainThread  ] [INFO ]  Running ROUGE with command /root/.files2rouge/ROUGE-1.5.5.pl -e /root/.files2rouge/data -c 96 -n 2 -a -m /tmp/tmpdvvnebb9/rouge_conf.xml
---------------------------------------------
1 ROUGE-1 Average_R: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-1 Average_P: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-1 Average_F: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
1 ROUGE-2 Average_R: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-2 Average_P: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-2 Average_F: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
1 ROUGE-L Average_R: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-L Average_P: 0.00000 (96%-conf.int. 0.00000 - 0.00000)
1 ROUGE-L Average_F: 0.00000 (96%-conf.int. 0.00000 - 0.00000)

Elapsed time: 0.091 seconds
root@1d0ed63a3429:/etc/rouge# cat /tmp/tmpdvvnebb9/rouge_conf.xml
<ROUGE-EVAL version="1.55">
    <EVAL ID="1">
        <MODEL-ROOT>/tmp/tmp0qppezqq/model</MODEL-ROOT>
        <PEER-ROOT>/tmp/tmp0qppezqq/system</PEER-ROOT>
        <INPUT-FORMAT TYPE="SEE">
        </INPUT-FORMAT>
        <PEERS>
            <P ID="1">s.0.txt</P>
        </PEERS>
        <MODELS>
            <M ID="A">m.A.0.txt</M>
        </MODELS>
    </EVAL>
</ROUGE-EVAL>
root@1d0ed63a3429:/etc/rouge# cat /tmp/tmp0qppezqq/model/m.A.0.txt
<html>
<head>
<title>dummy title</title>
</head>
<body bgcolor="white">
<a name="1">[1]</a> <a href="#1" id=1>я иду на работу. я пришел на работу.</a>
<a name="2">[2]</a> <a href="#2" id=2></a>
</body>
</html>
root@1d0ed63a3429:/etc/rouge# cat /tmp/tmp0qppezqq/system/s.0.txt
<html>
<head>
<title>dummy title</title>
</head>
<body bgcolor="white">
<a name="1">[1]</a> <a href="#1" id=1>я иду на работу. я пришел на работу.</a>
<a name="2">[2]</a> <a href="#2" id=2></a>
</body>
</html>
root@1d0ed63a3429:/etc/rouge#

pltrdy commented 5 years ago

Similar 0 scores may also happen if you have some kind of markup tags in your data.

DenisOgr commented 5 years ago

This is my original data (there are no tags):

root@1d0ed63a3429:/etc/rouge# cat ./data/reference
я иду на работу. я пришел на работу.
root@1d0ed63a3429:/etc/rouge# cat ./data/summary
я иду на работу. я пришел на работу.

pltrdy commented 5 years ago

I've tried another ROUGE wrapper and the result is the same, so the problem probably does come from ROUGE itself. Unfortunately, there is not much I can do.
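
The zero n-gram counts in the verbose log above ("total 1-gram model count: 0") are consistent with a tokenizer that keeps only ASCII letters and digits. The sketch below imitates that behaviour in Python to show the effect on Cyrillic input. The function name `ascii_only_tokens` and the regular expression are assumptions made for illustration; this is not code taken from ROUGE-1.5.5.pl.

```python
import re


def ascii_only_tokens(text):
    """Hypothetical stand-in for an ASCII-only tokenizer (an assumption,
    not the actual Perl implementation)."""
    # Replace every character that is not an ASCII letter or digit with a space.
    cleaned = re.sub(r"[^a-z0-9]", " ", text.lower())
    return cleaned.split()


print(ascii_only_tokens("I go to work."))      # ['i', 'go', 'to', 'work']
print(ascii_only_tokens("я иду на работу."))  # []
```

If the real tokenizer behaves like this, both the reference and the summary end up with no tokens at all, which would explain a score of 0 even for identical texts.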

santaonchair commented 3 years ago

I had the same problem when processing Korean. I solved it by replacing tokens with numeric ids:

reference = "я иду на работу. я пришел на работу."
summary = "я иду на работу. я пришел на работу."

# Build one shared vocabulary from both texts.
tokens = list(set(reference.split() + summary.split()))
# Map each token to a numeric id (stored as a string).
token2ids = {token: str(ids) for ids, token in enumerate(tokens)}

# Replace every token with its id before running ROUGE.
reference = " ".join([token2ids[token] for token in reference.split()])
summary = " ".join([token2ids[token] for token in summary.split()])
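
The same idea can be applied to whole files before handing them to files2rouge. This is a minimal sketch, assuming one text per line and plain whitespace tokenization. The helper name `encode_lines` is hypothetical, and ids are assigned in order of first appearance so the mapping is reproducible between runs.

```python
def encode_lines(reference_lines, summary_lines):
    """Map every whitespace token to a numeric id shared by both sides.

    Hypothetical helper for illustration; not part of files2rouge.
    """
    token2id = {}

    def encode(line):
        # setdefault assigns the next free id the first time a token is seen.
        return " ".join(
            str(token2id.setdefault(tok, len(token2id))) for tok in line.split()
        )

    encoded_refs = [encode(line) for line in reference_lines]
    encoded_sums = [encode(line) for line in summary_lines]
    return encoded_refs, encoded_sums, token2id


refs, sums, vocab = encode_lines(
    ["я иду на работу. я пришел на работу."],
    ["я иду на работу. я пришел на работу."],
)
print(refs[0])  # 0 1 2 3 0 4 2 3
print(sums[0])  # 0 1 2 3 0 4 2 3
```

Because punctuation stays attached to its token, "работу." and "работу" receive different ids; strip or separate punctuation first if that matters for your evaluation.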