fixie-ai / thefastest.ai

Website with current metrics on the fastest AI models.
MIT License

999.00 TPS #6

Closed: cailloux2 closed this issue 3 months ago

cailloux2 commented 4 months ago

Thanks for putting together this live dashboard. Brilliant and very useful.

I'm wondering if the 999.00 TPS is real or a display/measurement bug. I doubt that gpt-3.5-turbo-1106 is in that class of performer 😂 Also, since the other benchmarks show distinct, precise values (like 654.74, etc.), having a whole group sitting at exactly 999 seems off. Just wanted to check and confirm.

Thanks again!!!

juberti commented 4 months ago

999 is the cap we apply when all the tokens show up at once, to prevent implausible values like 12847 from being displayed. It doesn't necessarily mean the model is fast if TTFT is also high, since a provider can get a high TPS score just by buffering the tokens and sending them in one burst.
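For illustration, here is a minimal sketch of how a capped TPS figure could be computed; this is not the repo's actual implementation, and the function and stream shape are assumptions:

```python
import time

MAX_TPS = 999.0  # display cap: avoids implausible values when tokens arrive in one burst


def measure_tps(stream):
    """Hypothetical sketch: time a token stream and compute tokens per second.

    `stream` is assumed to be an iterable that yields tokens as the provider
    streams them back. TPS is measured from the first token to the last, so a
    response that is buffered and dumped all at once hits the cap.
    """
    start = time.time()
    first_token_time = None
    num_tokens = 0
    for _ in stream:
        if first_token_time is None:
            first_token_time = time.time()  # reference point for TTFT
        num_tokens += 1
    elapsed = time.time() - (first_token_time or start)
    if elapsed <= 0:
        return MAX_TPS  # everything arrived in a single burst
    return min(num_tokens / elapsed, MAX_TPS)
```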

However, I see that there are multiple 999 TPS scores today, which suggests that there may be some measurement bug occurring. I'll take a look.

juberti commented 4 months ago

Updated our benchmark runners to higher-performance machines, and I'm not seeing this issue after the last data run (except for models on Azure known to dump tokens all at once). Please reopen if you see this issue again.

cailloux2 commented 4 months ago

Makes sense, thanks for looking into this. I confirm I see Azure with no streaming, but I also see Llama3 via Groq (they are very fast, but not that fast, right? 😛). Thanks again for putting this together!

juberti commented 4 months ago

Groq is extremely fast in generation (often 500+ tps) and could definitely be hitting our cap for a smaller model like llama-3-8b.

juberti commented 3 months ago

Made some changes here to spread the LLM calls over a longer time window to prevent any slowness from local processing creeping into the results. As of now, the only 999.00 TPS values seem legit.
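As a rough sketch of what "spreading calls over a window" can look like (names and window size are assumptions, not the project's actual code), each model's benchmark can be launched with a staggered delay so the runs don't compete for local CPU or network:

```python
import asyncio
import random


async def run_benchmark(model: str) -> None:
    """Placeholder for a single model's benchmark call."""
    print(f"benchmarking {model}")


async def run_all(models: list[str], window_seconds: float = 300.0) -> None:
    """Spread benchmark calls across a time window instead of firing them all at
    once, so local processing overhead doesn't depress the measured TTFT/TPS."""
    spacing = window_seconds / max(len(models), 1)
    tasks = []
    for i, model in enumerate(models):
        async def delayed(m: str = model, delay: float = i * spacing) -> None:
            # small jitter on top of the fixed offset to avoid synchronized starts
            await asyncio.sleep(delay + random.uniform(0, spacing * 0.1))
            await run_benchmark(m)
        tasks.append(asyncio.create_task(delayed()))
    await asyncio.gather(*tasks)


if __name__ == "__main__":
    asyncio.run(run_all(["gpt-3.5-turbo-1106", "llama-3-8b"], window_seconds=10.0))
```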