LTTLabsOSS / markbench-tests

Home of test harnesses used in LTT Labs MarkBench
GNU General Public License v3.0
634 stars 28 forks

Feature Request: Add a tokens per second evaluation with Local Large Language Models #21

Closed ntindle closed 11 months ago

ntindle commented 11 months ago

As recent Threadripper videos have mentioned, some chips and GPUs are targeted at AI engineers.

It would be nice to have a standardized test harness for evaluating hardware for LLM usage.
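To make the request concrete, here is a rough sketch of what a tokens-per-second measurement could look like. It is not tied to any particular runtime: `generate` is a hypothetical callable that streams tokens for a prompt, and `measure_tokens_per_second` and the `clock` parameter are names made up for this illustration. It reports overall throughput, time to first token, and decode throughput (tokens after the first one).

```python
import time
from typing import Callable, Iterable, Optional


def measure_tokens_per_second(
    generate: Callable[[str], Iterable[str]],  # hypothetical streaming backend
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Time one streamed generation and return simple throughput numbers."""
    start = clock()
    first_token_at: Optional[float] = None
    count = 0

    for _token in generate(prompt):
        now = clock()
        if first_token_at is None:
            first_token_at = now  # end of prompt processing (prefill)
        count += 1

    end = clock()
    total = end - start

    ttft = None
    decode_tps = None
    if first_token_at is not None:
        ttft = first_token_at - start
        decode_time = end - first_token_at
        # Decode throughput ignores the first token, which includes prefill time.
        if count > 1 and decode_time > 0:
            decode_tps = (count - 1) / decode_time

    return {
        "tokens": count,
        "total_seconds": total,
        "tokens_per_second": count / total if total > 0 else None,
        "time_to_first_token": ttft,
        "decode_tokens_per_second": decode_tps,
    }


if __name__ == "__main__":
    # Stand-in generator so the sketch runs without a model installed.
    def fake_generate(prompt: str):
        for word in prompt.split():
            time.sleep(0.01)  # pretend each token takes 10 ms
            yield word

    print(measure_tokens_per_second(fake_generate, "a b c d e f g h"))
```

A real harness would swap `fake_generate` for a local inference backend and would likely fix the model, prompt, and output length so results are comparable across hardware.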

Here's a bit of background on the goals and objectives of evaluating an LLM: https://www.baseten.co/blog/llm-transformer-inference-guide/

I will note that this is a complex and ever-evolving field, and I fully understand if this is closed without comment.

I also acknowledge that this may not be something you want to be wholly responsible for, and that you may prefer to call out to a service that does this instead.

nharris-lmg commented 11 months ago

We are looking at adding machine learning tests to our suite, but are not quite ready yet :)