huggingface / evaluation-guidebook

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

Fix typos detected by `codespell` #16

Closed by tancnle 6 days ago

tancnle commented 6 days ago

Fix typos

❯ codespell --skip "*.csv,*.html,*.css,*.js,*.json,*.txt,node_modules"
./README.md:51: Thinkg ==> Think, Thing, Things
./contents/Troubleshooting/Troubleshooting reproducibility.md:58: infering ==> inferring
./contents/Troubleshooting/Troubleshooting reproducibility.md:88: weigths ==> weights
./contents/Model as a judge/Basics.md:21: reproduciblity ==> reproducibility
./contents/Model as a judge/Evaluating your evaluator.md:15: comparision ==> comparison
./contents/Model as a judge/Evaluating your evaluator.md:24: comparision ==> comparison
./contents/Model as a judge/Evaluating your evaluator.md:24: litterature ==> literature
./contents/Automated benchmarks/Some evaluation datasets.md:38: arithmetics ==> arithmetic
./contents/Automated benchmarks/Some evaluation datasets.md:38: substraction ==> subtraction
./contents/Automated benchmarks/Some evaluation datasets.md:49: LSAT ==> LAST, SLAT, SAT
./contents/Automated benchmarks/Some evaluation datasets.md:50: mispellings ==> misspellings
./contents/Automated benchmarks/Some evaluation datasets.md:50: liekly ==> likely
./contents/Automated benchmarks/Some evaluation datasets.md:86: ROUGE ==> ROGUE
./contents/Automated benchmarks/Some evaluation datasets.md:101: interprate ==> interpret
./contents/Automated benchmarks/Basics.md:27: ROUGE ==> ROGUE
./contents/Automated benchmarks/Designing your automatic evaluation.md:34: LSAT ==> LAST, SLAT, SAT
./contents/Automated benchmarks/Designing your automatic evaluation.md:101: ROUGE ==> ROGUE
./contents/Automated benchmarks/Designing your automatic evaluation.md:101: comparisions ==> comparisons
./contents/Automated benchmarks/Tips and tricks.md:30: comparision ==> comparison
./contents/Automated benchmarks/Tips and tricks.md:59: succintly ==> succinctly
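Not every line in the report above is a real typo: LSAT is the name of an exam and ROUGE is a text-overlap metric, so the suggested "fixes" for those two should not be applied. As a minimal sketch of how such a report could be triaged, the Python below parses lines in the `path:line: word ==> suggestions` shape shown above and drops an allow-list of domain terms. The names `parse_report`, `real_typos`, and `ALLOWED_TERMS` are hypothetical helpers written for this illustration, not part of `codespell`, and the allow-list is an assumption based on reading the output.

```python
import re

# Matches report lines shaped like the output above:
#   <path>:<line>: <word> ==> <suggestion>[, <suggestion>...]
# Paths may contain spaces, so the path group is matched lazily up to ":<digits>: ".
REPORT_LINE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+): (?P<word>\S+) ==> (?P<fixes>.+)$"
)

# Assumption: these flagged words are domain terms rather than typos.
ALLOWED_TERMS = {"lsat", "rouge"}


def parse_report(text):
    """Turn a codespell-style report into a list of finding dicts."""
    findings = []
    for raw in text.splitlines():
        match = REPORT_LINE.match(raw.strip())
        if match is None:
            continue  # skip blank lines, prompts, and anything unrecognized
        findings.append(
            {
                "path": match["path"],
                "line": int(match["line"]),
                "word": match["word"],
                "fixes": [s.strip() for s in match["fixes"].split(",")],
            }
        )
    return findings


def real_typos(findings, allowed=ALLOWED_TERMS):
    """Keep only findings whose flagged word is not an allowed domain term."""
    return [f for f in findings if f["word"].lower() not in allowed]


# A few lines copied from the report above.
sample = """
./README.md:51: Thinkg ==> Think, Thing, Things
./contents/Automated benchmarks/Some evaluation datasets.md:49: LSAT ==> LAST, SLAT, SAT
./contents/Automated benchmarks/Basics.md:27: ROUGE ==> ROGUE
./contents/Troubleshooting/Troubleshooting reproducibility.md:88: weigths ==> weights
"""

for finding in real_typos(parse_report(sample)):
    print(f'{finding["path"]}:{finding["line"]}: {finding["word"]}')
```

`codespell` also accepts an `--ignore-words-list` (`-L`) option, which would avoid reporting such terms in the first place; the post-processing above is only useful when the report already exists.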