huggingface/evaluate
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
https://huggingface.co/docs/evaluate
Apache License 2.0 · 1.89k stars · 234 forks

Issues
Sorted by: Newest
#605 Update MAUVE's readme and citations (krishnap25, opened 5 days ago, 0 comments)
#604 Support tuple format in `combine` evaluations (shunk031, opened 1 week ago, 0 comments)
#603 AttributeError: 'CombinedEvaluations' object has no attribute 'evaluation_modules' (shunk031, opened 2 weeks ago, 2 comments)
#602 Gradio dependency issue (bnaman50, opened 2 weeks ago, 1 comment)
#601 Evaluation of form feed symbol with BLEU results in error (lowlypalace, opened 3 weeks ago, 0 comments)
#600 feat(ci): add trufflehog secrets detection (McPatate, closed 4 weeks ago, 0 comments)
#599 Addition of SummEval Metric to `evaluate` Library (penguinwang96825, opened 1 month ago, 0 comments)
#598 New JSON encoder for results containing numpy types (jpodivin, opened 1 month ago, 0 comments)
#597 LocalModuleTest.test_load_metric_code_eval fails with "The "code_eval" metric executes untrusted model-generated code in Python." (jpodivin, opened 1 month ago, 0 comments)
#596 Enhancing `TextClassificationEvaluator` to Support Averaged Metrics (ilyesdjerfaf, opened 1 month ago, 0 comments)
#595 Add tokenizer initialization to the documented example (jpodivin, opened 1 month ago, 0 comments)
#594 Execution of example from the Using the evaluator docs fails due to unspecified tokenizer (jpodivin, opened 1 month ago, 0 comments)
#593 Evaluation of empty strings with MAUVE results in error (lowlypalace, opened 1 month ago, 0 comments)
#592 SyntaxError: closing parenthesis '}' (wangxiuwen, opened 1 month ago, 3 comments)
#591 Silence NLTK download info (bobbywlindsey, opened 2 months ago, 0 comments)
#590 Problems during run initial step (simplelifetime, opened 2 months ago, 12 comments)
#589 Can't load exist dataset for evaluation (IsmaelMousa, opened 2 months ago, 1 comment)
#588 Super tiny fix typo (fzyzcjy, opened 2 months ago, 0 comments)
#587 set dev version (lhoestq, opened 2 months ago, 0 comments)
#586 [Question]Shall we adding a faster BLEU score calculator? (shenxiangzhuang, opened 2 months ago, 0 comments)
#585 Metrics for multilabel problems don't match the expected format. (adamamer20, closed 2 months ago, 2 comments)
#584 Add raw pyarrow type check message for `EvaluationModule` (shenxiangzhuang, opened 2 months ago, 0 comments)
#583 fix: install cmd for extra pkg (shenxiangzhuang, opened 2 months ago, 0 comments)
#582 How to pass generation_kwargs to the TextGeneration evaluator ? (swarnava112, opened 2 months ago, 0 comments)
#581 [FR] Confidence intervals for metrics (NightMachinery, opened 2 months ago, 0 comments)
#580 Apply deprecated `evaluation_strategy` (muellerzr, opened 2 months ago, 1 comment)
#579 Release: 0.4.2 (lhoestq, closed 2 months ago, 3 comments)
#578 Fix FileFreeLock (lhoestq, closed 2 months ago, 0 comments)
#577 Fix wrong lib name in ImportError mesage (milistu, opened 2 months ago, 0 comments)
#576 Unable to run pip install evaluate[template] (saicharan2804, opened 2 months ago, 1 comment)
#575 Fix example doc in load function (alexrs, closed 2 months ago, 0 comments)
#574 Module 'glue' doesn't exist on the Hugging Face Hub either. (enori, closed 2 months ago, 1 comment)
#573 Perplexity metric does not apply batching correctly to tokenization (ChengSashankh, opened 2 months ago, 1 comment)
#572 METEOR has no option to return unaggregated results (ashtonomy, opened 2 months ago, 0 comments)
#571 Update python to 3.8 (qubvel, closed 2 months ago, 5 comments)
#570 [Question] How to have no preset values sent into `.compute()` (alvations, opened 3 months ago, 0 comments)
#569 Speeding up mean_iou metric computation (qubvel, closed 2 months ago, 0 comments)
#568 Allow for specify coda device in perplexity evaluation (manuelbrack, opened 3 months ago, 0 comments)
#567 Cannot use it offline! (SirryChen, opened 3 months ago, 1 comment)
#566 Shouldn't perplexity range from [1 to inf)? (ivanmkc, closed 3 months ago, 2 comments)
#565 Can't use the BLEU offline. (Zhuxing01, opened 3 months ago, 3 comments)
#564 Does Rouge score support the multilingual language? (sanjeev-bhandari, closed 3 months ago, 1 comment)
#563 ValueError: Predictions and/or references don't match the expected format. (antopost, opened 3 months ago, 1 comment)
#562 ImportError: To be able to use evaluate-metric/rouge, you need to install the following dependencies['nltk'] using 'pip install # Here to have a nice missing dependency error message early on' for instance' (BAEK26, closed 3 months ago, 2 comments)
#561 It seems like evaluate.load doesnt use (anhq-nguyen, opened 3 months ago, 0 comments)
#560 Is perplexity correctly computed? (halixness, opened 3 months ago, 4 comments)
#559 evaluate consuming Memory and slow down the process (Redix8, closed 4 months ago, 0 comments)
#558 the difference of your bleu and sacrebleu (cooper12121, opened 4 months ago, 1 comment)
#557 Add Diarization Error Rate (DER) metric (MedAhmedKrichen, opened 4 months ago, 0 comments)
#556 Add Frechet Inception Distance (FID) Score (MedAhmedKrichen, opened 4 months ago, 0 comments)