Closed: dbabokin closed this 5 months ago
With temperature not set to 0, the result will vary from run to run. A better way to report performance is in tokens/sec.
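To illustrate the metric, here is a minimal sketch of a tokens/sec measurement. It is not taken from test_llm.py or query_llm.py: `tokens_per_second` and the `generate` callable are hypothetical names, and `generate` is assumed to return the list of generated tokens for a prompt.

```python
import time
from typing import Callable, List, Tuple


def tokens_per_second(
    generate: Callable[[str], List[str]],  # hypothetical: returns generated tokens
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float]:
    """Return (token_count, tokens_per_second) for one generation call."""
    start = clock()
    tokens = generate(prompt)
    elapsed = clock() - start
    if elapsed <= 0:
        # Guard against a zero-length interval on very coarse clocks.
        return len(tokens), float("inf")
    return len(tokens), len(tokens) / elapsed
```

Because the rate is normalized by the number of tokens produced, it stays comparable across runs even when sampling with a non-zero temperature yields outputs of different lengths.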
Ah, I checked test_llm.py, didn't see tokens/s there, and reported this. Now I see that query_llm.py reports this metric, so it's not a problem.