Closed saikatmitra91 closed 4 months ago
Since the model calls are wrapped by SDK and the SDK internally retries, the latency time calculation includes the total time taken to get the response and not the time taken by the final request to resolve.
@saikatmitra91 I would like to work on this, can you please assign this to me?
Fixed with #216. Closing.
📝 Description
Since the model calls are wrapped by SDK and the SDK internally retries, the latency time calculation includes the total time taken to get the response and not the time taken by the final request to resolve.
📸 Screenshots / Code Snippets
🛠 Proposed Solution