cognos-io / chat.cognos.io


Rate limit completions endpoint #83

Closed kisamoto closed 1 month ago

kisamoto commented 1 month ago

This PR adds rate limiting to the chat completions endpoint and shows an error message when the limit is reached. The message does not tell the user the current limit, since that may change; it is currently set to:

Rate:      60.0 / 3600.0,    // 60 requests per 3600 seconds
Burst:     30,               // Allows a portion of the requests to be used in bursts

Screencast from 2024-05-28 14-31-48.webm