KempnerInstitute / tatm

Python Package for the Kempner AI Testbed
https://kempnerinstitute.github.io/tatm/
MIT License

Benchmarking infrastructure #73

Open Naeemkh opened 1 month ago

Naeemkh commented 1 month ago

It would be useful to add benchmarking infrastructure so that users can evaluate the models they develop on the cluster against a predefined, agreed-upon set of benchmarks.

mbsabath commented 1 month ago

Can you say more about what you mean by benchmarking? I agree that this would be a good enhancement, but are you thinking of benchmarking model performance with standard LLM evaluations, benchmarking data serving speed or tokenization speed, or measuring something else?