empirical-run / empirical

Test and evaluate LLMs and model configurations, across all the scenarios that matter for your application
https://docs.empirical.run
MIT License
148 stars 13 forks

Model latency numbers include time taken to retry #166

Closed (saikatmitra91 closed 4 months ago)

saikatmitra91 commented 5 months ago

📝 Description

Model calls are wrapped by the SDK, and the SDK retries internally. As a result, the reported latency covers the total time taken to get a response, including any failed attempts and the waits between them, rather than the time taken by the final request alone.
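To make the two numbers concrete, here is a minimal sketch of how they differ once per-attempt timings are available. The `AttemptTiming` and `LatencySummary` types and the `summarizeLatency` function are hypothetical names made up for illustration. They are not part of empirical or of any SDK, and this is not necessarily how the issue was fixed.

```typescript
// Hypothetical types for illustration; not taken from empirical or any SDK.
interface AttemptTiming {
  startMs: number; // when the attempt was sent
  endMs: number;   // when the attempt resolved or failed
  ok: boolean;     // true only for the attempt that returned a response
}

interface LatencySummary {
  totalMs: number;        // first attempt start to final attempt end (retries and waits included)
  finalAttemptMs: number; // duration of the successful attempt only
  retries: number;        // number of failed attempts before the successful one
}

// Assumes the attempts are in chronological order and the last one succeeded.
function summarizeLatency(attempts: AttemptTiming[]): LatencySummary {
  if (attempts.length === 0) {
    throw new Error("no attempts recorded");
  }
  const first = attempts[0];
  const last = attempts[attempts.length - 1];
  if (!last.ok) {
    throw new Error("final attempt did not succeed");
  }
  return {
    totalMs: last.endMs - first.startMs,
    finalAttemptMs: last.endMs - last.startMs,
    retries: attempts.length - 1,
  };
}

// Example: the first attempt fails after 800 ms, the caller waits 500 ms,
// and the second attempt succeeds after 900 ms.
const summary = summarizeLatency([
  { startMs: 0, endMs: 800, ok: false },
  { startMs: 1300, endMs: 2200, ok: true },
]);
console.log(summary); // { totalMs: 2200, finalAttemptMs: 900, retries: 1 }
```

In this example the wrapped call would report 2200 ms, while the request that actually produced the response took 900 ms. Getting the second number requires timing each attempt individually, which in turn means either handling retries outside the SDK or using whatever per-request hooks the SDK exposes.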

📸 Screenshots / Code Snippets

[Image: Screenshot 2024-04-21 at 11 41 36 AM]

🛠 Proposed Solution

sumitd94 commented 5 months ago

@saikatmitra91 I would like to work on this. Could you please assign it to me?

arjunattam commented 4 months ago

Fixed with #216. Closing.