huggingface / evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
https://huggingface.co/docs/evaluate
Apache License 2.0
1.9k stars 235 forks

[Question] Shall we add a faster BLEU score calculator? #586

Open shenxiangzhuang opened 2 months ago

shenxiangzhuang commented 2 months ago

Shall we?

Hi, I found the performance request about the BLEU score. I'm wondering whether we should add a fast-bleu metric or replace the current bleu implementation with a faster one, such as bleuscore, which aims for better performance on large data sizes.

About the bleuscore library

Correctness

I implemented the bleuscore library in exactly the same way as the evaluate library and tested it thoroughly with hypothesis to make sure both return the same results.
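To make the idea of an equivalence check concrete, here is a minimal sketch of a randomized differential test between two BLEU implementations. It is not the actual bleuscore test suite: it swaps hypothesis for the standard library's `random` module so that it runs without extra dependencies, and every name in it (`bleu_reference`, `bleu_candidate`, `random_corpus`, `differential_test`) is hypothetical. The scoring code is a simplified corpus-level BLEU written for illustration, not a copy of either library.

```python
import math
import random
from collections import Counter


def ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stats(predictions, references, max_order=4):
    """Clipped n-gram matches, candidate n-gram totals, and corpus lengths."""
    matches, totals = [0] * max_order, [0] * max_order
    pred_len = ref_len = 0
    for pred, refs in zip(predictions, references):
        pred_len += len(pred)
        ref_len += min(len(r) for r in refs)
        ref_counts = Counter()
        for r in refs:
            ref_counts |= ngrams(r, max_order)  # keep the max count over references
        for gram, c in (ngrams(pred, max_order) & ref_counts).items():
            matches[len(gram) - 1] += c
        for n in range(1, max_order + 1):
            totals[n - 1] += max(len(pred) - n + 1, 0)
    return matches, totals, pred_len, ref_len


def brevity_penalty(pred_len, ref_len):
    if pred_len == 0:
        return 0.0
    return 1.0 if pred_len > ref_len else math.exp(1 - ref_len / pred_len)


def bleu_reference(predictions, references, max_order=4):
    """Baseline (assumed): geometric mean of precisions computed in log space."""
    matches, totals, pred_len, ref_len = stats(predictions, references, max_order)
    if min(matches) == 0:
        return 0.0
    log_mean = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_order
    return brevity_penalty(pred_len, ref_len) * math.exp(log_mean)


def bleu_candidate(predictions, references, max_order=4):
    """Hypothetical stand-in for a faster implementation: direct product form."""
    matches, totals, pred_len, ref_len = stats(predictions, references, max_order)
    product = 1.0
    for m, t in zip(matches, totals):
        product *= m / t if t else 0.0
    return brevity_penalty(pred_len, ref_len) * product ** (1 / max_order)


def random_corpus(rng, vocab=("a", "b", "c", "d"), max_sents=5, max_len=8, max_refs=3):
    """Small random corpus; a tiny vocabulary makes n-gram overlaps likely."""
    preds, refs = [], []
    for _ in range(rng.randint(1, max_sents)):
        preds.append([rng.choice(vocab) for _ in range(rng.randint(0, max_len))])
        refs.append([[rng.choice(vocab) for _ in range(rng.randint(1, max_len))]
                     for _ in range(rng.randint(1, max_refs))])
    return preds, refs


def differential_test(impl_a, impl_b, trials=500, seed=0):
    """Return the first corpus on which the implementations disagree, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        preds, refs = random_corpus(rng)
        if not math.isclose(impl_a(preds, refs), impl_b(preds, refs), abs_tol=1e-9):
            return preds, refs
    return None


if __name__ == "__main__":
    print(differential_test(bleu_reference, bleu_candidate))
```

In a real comparison the two callables would wrap the two libraries under test rather than these toy functions. A property-based framework such as hypothesis would additionally shrink any failing corpus to a minimal counterexample.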

Performance

I wrote the library in Rust and used multi-threading for parallel execution, so it performs better as the data size gets larger. You can check the benchmark section for more details.
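One reason corpus-level BLEU lends itself to parallel execution is that its inputs (clipped n-gram matches, n-gram totals, and lengths) are plain sums over sentences, so chunks can be processed independently and merged by addition. The sketch below illustrates that structure only. How bleuscore actually splits the work is not stated here, so the chunking scheme and all names (`chunk_stats`, `merge`, `parallel_stats`) are assumptions. Note also that in standard CPython, threads running pure-Python code are limited by the global interpreter lock, so this version shows the shape of the computation rather than a real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def ngram_counts(tokens, max_order=4):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def chunk_stats(chunk, max_order=4):
    """Additive statistics for one chunk of (prediction, references) pairs."""
    matches, totals = [0] * max_order, [0] * max_order
    pred_len = ref_len = 0
    for pred, refs in chunk:
        pred_len += len(pred)
        ref_len += min(len(r) for r in refs)
        ref_counts = Counter()
        for r in refs:
            ref_counts |= ngram_counts(r, max_order)  # max count over references
        for gram, c in (ngram_counts(pred, max_order) & ref_counts).items():
            matches[len(gram) - 1] += c
        for n in range(1, max_order + 1):
            totals[n - 1] += max(len(pred) - n + 1, 0)
    return matches, totals, pred_len, ref_len


def merge(a, b):
    """Merging two chunks is element-wise addition of their statistics."""
    return ([x + y for x, y in zip(a[0], b[0])],
            [x + y for x, y in zip(a[1], b[1])],
            a[2] + b[2],
            a[3] + b[3])


def parallel_stats(predictions, references, workers=4, max_order=4):
    """Split the corpus into roughly equal chunks and sum the per-chunk statistics."""
    pairs = list(zip(predictions, references))
    size = max(1, -(-len(pairs) // workers))  # ceiling division
    chunks = [pairs[i:i + size] for i in range(0, len(pairs), size)]
    total = ([0] * max_order, [0] * max_order, 0, 0)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(lambda c: chunk_stats(c, max_order), chunks):
            total = merge(total, part)
    return total
```

Because the merge step is associative and order-independent, the result does not depend on how many workers are used or on which chunk finishes first. The final score is then computed once from the merged totals.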

Benchmark