Open pltrdy opened 7 years ago
Hi! Not the repo owner, but I make heavy use of ROUGE and I would be really interested in a fast, python implementation of ROUGE (in fact I was considering writing it myself). I am willing to help in benchmarking or any other way.
TL;DR: I'm looking for the best ROUGE scoring in Python, i.e. the best function `(sentence1, sentence2) -> set of ROUGE scores`.
I am currently using pltrdy/pythonrouge as a ROUGE wrapper. It is forked from tagucci/pythonrouge, but the two repos are no longer close to each other.
Additionally, I score files using my own implementation (pltrdy/files2rouge). I built it from scratch: it reads the two files and, for each line pair, launches a scoring using pltrdy/pythonrouge.
Scoring is done in parallel using multithreading.
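To make the line-pair approach concrete, here is a minimal sketch of parallel file scoring. It is not the files2rouge code: `score_files`, `toy_score`, and the `max_workers` default are hypothetical names chosen for illustration, and `toy_score` is only a stand-in so the example runs without any ROUGE backend.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_score(hypothesis: str, reference: str) -> float:
    """Placeholder scorer (NOT ROUGE): share of reference tokens found in the hypothesis."""
    ref_tokens = reference.split()
    if not ref_tokens:
        return 0.0
    hyp_tokens = set(hypothesis.split())
    return sum(tok in hyp_tokens for tok in ref_tokens) / len(ref_tokens)


def score_files(hyp_path, ref_path, score_fn=toy_score, max_workers=4):
    """Score line i of hyp_path against line i of ref_path using a thread pool."""
    with open(hyp_path, encoding="utf-8") as hyp_file, \
            open(ref_path, encoding="utf-8") as ref_file:
        hyp_lines = [line.rstrip("\n") for line in hyp_file]
        ref_lines = [line.rstrip("\n") for line in ref_file]
    if len(hyp_lines) != len(ref_lines):
        raise ValueError("both files must have the same number of lines")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map keeps results in the same order as the input lines.
        return list(pool.map(score_fn, hyp_lines, ref_lines))
```

Threads mainly pay off when the scorer spends its time waiting on I/O or on an external process; a CPU-bound pure-Python scorer would be limited by the GIL and would likely need processes instead.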
Now I plan to kinda "merge" pythonrouge and files2rouge in order to build a single library to
My current bottleneck is the evaluation itself, i.e. the function `(s1, s2) -> ROUGE scores`. I would then just need to plug the optimal evaluation into my higher-level solutions.
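For reference, the kind of `(s1, s2) -> scores` function in question could look like the sketch below, which computes ROUGE-1 only. It assumes whitespace tokenization and lowercasing, with no stemming or stopword removal, so its numbers will not match the official Perl script exactly. The function name `rouge_1` and the `p`/`r`/`f` keys are illustrative choices, not the API of any of the libraries mentioned here.

```python
from collections import Counter


def rouge_1(hypothesis: str, reference: str) -> dict:
    """Unigram-overlap precision, recall and F1 between two sentences."""
    hyp_counts = Counter(hypothesis.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection clips each token's count to the minimum of both sides.
    overlap = sum((hyp_counts & ref_counts).values())
    hyp_total = sum(hyp_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / hyp_total if hyp_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"p": precision, "r": recall, "f": f1}
```

Because it only counts tokens, a function of this shape has no subprocess or file overhead per pair, which is where most of the wrapper cost tends to come from.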
We certainly could work together on it, on both a benchmark and a solution to ROUGE-related problems.
pltrdy
My first step will be to compare my files2rouge vs https://pypi.python.org/pypi/pyrouge/0.1.0
I've built a pure-Python ROUGE implementation: https://github.com/pltrdy/rouge
It is way faster than wrapping the Perl script.
Hi,
Would you be interested in benchmarking Python ROUGE wrappers?
I worked on my own solution for ROUGE evaluation over files (using two files `f1`, `f2`, reading each line `l1_i in f1`, `l2_i in f2` and scoring `rouge(l1_i, l2_i)`). My system uses multithreading to speed up the process (good gain in practice as the wrapping requires I/O).
Do you have any information about processing speed using your solution? We could run experiments using a common process & data to compare. Note that I don't want my solution to "win", I just want the fastest Python ROUGE scorer.
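A common benchmarking process could be as simple as the sketch below: every scorer gets the same list of sentence pairs and is timed with a wall clock. The names `benchmark` and `compare` and the `repeats` default are hypothetical, and each scorer is assumed to be a plain callable taking `(hypothesis, reference)`.

```python
import time


def benchmark(score_fn, pairs, repeats=3):
    """Best wall-clock time in seconds over `repeats` passes across all pairs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for hypothesis, reference in pairs:
            score_fn(hypothesis, reference)
        best = min(best, time.perf_counter() - start)
    return best


def compare(scorers, pairs, repeats=3):
    """Map each scorer name to its throughput in pairs scored per second."""
    results = {}
    for name, score_fn in scorers.items():
        elapsed = benchmark(score_fn, pairs, repeats)
        results[name] = len(pairs) / elapsed if elapsed > 0 else float("inf")
    return results
```

Taking the best of several passes reduces noise from caching and background load; the scores themselves should also be compared across scorers, since a fast scorer that disagrees with the reference ROUGE values is of little use.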
Regards