DCGM / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

Add czechbench #5

Closed: DavidAdamczyk closed 2 months ago

DavidAdamczyk commented 3 months ago

I would like to add new benchmarks from czechbench:

Could you please review and merge it?

DavidAdamczyk commented 3 months ago

@MFajcik The datasets were shuffled. I hope that is correct.

DavidAdamczyk commented 3 months ago

@MFajcik Can you please do a review?

MFajcik commented 2 months ago

Looks good to me! Thank you, @DavidAdamczyk!