UKGovernmentBEIS / inspect_ai

Inspect: A framework for large language model evaluations
https://inspect.ai-safety-institute.org.uk/
MIT License

Migrate MATH benchmark to Inspect Evals #604

Closed: dragonstyle closed this issue 2 days ago

jjallaire-aisi commented 2 days ago

Reminder to make the imports for the benchmark's requirements truly optional.
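
For illustration, one common way to keep a dependency optional is to defer its import until the code path that needs it actually runs, and to raise an actionable error if it is missing. The sketch below is a generic pattern, not the code used in this repository; `optional_import` and `dump_record` are hypothetical names, and the standard-library `json` module stands in for a real third-party requirement.

```python
import importlib
from types import ModuleType


def optional_import(module_name: str, feature: str) -> ModuleType:
    """Import `module_name` on demand (hypothetical helper).

    Raises ImportError with an install hint if the module is missing.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as ex:
        # Assumption: the pip package name matches the module name,
        # which is not true for every package.
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'. "
            f"Install it with: pip install {module_name}"
        ) from ex


def dump_record(record: dict) -> str:
    # The import happens inside the function, so importing this file
    # never fails when the optional dependency is absent.
    # `json` is a stand-in for a real optional requirement.
    json = optional_import("json", "dump_record")
    return json.dumps(record, sort_keys=True)
```

With this layout, a missing package only surfaces when the feature that depends on it is called, not when the containing module is imported.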