Open Naeemkh opened 1 month ago
Can you say more about what you mean by benchmarking? I agree that this is a good enhancement, but do you have in mind model performance on standard LLM evaluations, data-serving speed, tokenization speed, or a different metric?
It would be beneficial to add benchmarking infrastructure so that users can test the models they develop on the cluster against predefined, agreed-upon benchmarks.