ray-project / llmperf-leaderboard

Apache License 2.0

Throughput of llama2 70b higher than llama2 7b #11

Open debraj135 opened 9 months ago

debraj135 commented 9 months ago

I was wondering how to understand this. I would expect llama2 70b to have a lower throughput than llama2 7b.

Is the configuration different between the table for llama2 70b and the table for llama2 7b?

xieus commented 8 months ago

The performance of a model depends on several factors, including model size, compute configuration (GPU model and count), and model-, system-, or algorithm-level optimizations. An API provider may use a different serving strategy for each model, so a larger model on a larger or better-optimized deployment can outperform a smaller one.
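To make the point above concrete, here is a minimal sketch (with hypothetical numbers, not taken from the leaderboard) showing how per-request output throughput is computed and why a 70B model on a heavier deployment can report higher tokens/sec than a 7B model on a smaller one:

```python
def output_throughput(num_output_tokens: int, e2e_seconds: float) -> float:
    """Output tokens per second for a single request, end to end."""
    return num_output_tokens / e2e_seconds

# Hypothetical measurements: the 70B model is served on more/faster GPUs
# with heavier batching, so its per-request throughput can still win.
llama2_70b_tps = output_throughput(150, 2.0)  # 75.0 tok/s
llama2_7b_tps = output_throughput(120, 2.0)   # 60.0 tok/s
```

The ranking here is driven entirely by the serving configuration behind the API, not by parameter count alone.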