Cerlancism / chatgpt-subtitle-translator

Efficient translation tool based on ChatGPT API
https://cerlancism.github.io/chatgpt-subtitle-translator/
MIT License
145 stars 16 forks

Rate limit #9

Closed haitranvua closed 5 months ago

haitranvua commented 10 months ago

I am using your function to translate an .srt file (about 100 lines of text) with the free OpenAI API, and I often hit the limit right from the first translation. I wonder if there's any way to address this.

Cerlancism commented 10 months ago

The free-tier trial API has very restrictive rate limits: 3 requests per minute and 200 requests per day. https://platform.openai.com/docs/guides/rate-limits/what-are-the-rate-limits-for-our-api

Set OPENAI_API_RPM to 3 or less in your .env file:

OPENAI_API_RPM=3

If you still hit the rate limit, you may have exhausted the 200-requests-per-day quota. Wait until the next day or upgrade your plan.
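As a rough illustration (not the project's actual scheduler; `rpmToIntervalMs` and `throttledRequests` are hypothetical names), a requests-per-minute budget like `OPENAI_API_RPM=3` amounts to spacing requests at least 20 seconds apart:

```javascript
// Hypothetical sketch: derive a minimum spacing between requests from an
// RPM budget, e.g. the OPENAI_API_RPM value in .env.
function rpmToIntervalMs(rpm) {
  return Math.ceil(60_000 / rpm); // 3 RPM -> 20,000 ms between requests
}

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run request-producing async functions one at a time, waiting out the
// per-request window after each call.
async function throttledRequests(tasks, rpm = 3) {
  const intervalMs = rpmToIntervalMs(rpm);
  const results = [];
  for (const task of tasks) {
    results.push(await task());
    await delay(intervalMs);
  }
  return results;
}
```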

haitranvua commented 10 months ago

Thank you for your response.

This might be a silly idea, but I think there may be a more efficient way to use the rate limit.

The gpt-3.5-turbo model can handle a maximum of 4097 tokens per request. My .srt file works out to around 10,000 tokens, so it should be possible to process the .srt file with just 2 requests. Would that work?

Cerlancism commented 10 months ago

You mentioned that your .srt file has around 100 text lines, i.e. timestamp entries? This application already sends 100 timestamp entries per batch by default, which should complete the run within 2 queries, unless the batch size is reduced due to a mismatched output quantity causing timing alignment issues (this cannot be easily fixed, see https://github.com/Cerlancism/chatgpt-subtitle-translator/issues/1). The batch size can be increased to, say, 150. You can also try turning off the history prompt, the moderator, and line quantity matching, at the cost of a higher chance of timing alignment issues:

--batch-sizes "[150]" --history-prompt-length 0 --no-use-moderator --no-line-matching
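The batch-size arithmetic behind those flags can be sketched roughly (an illustration under my own assumptions; `toBatches` is a hypothetical helper, not the project's actual function):

```javascript
// Hypothetical sketch: split subtitle entries into fixed-size batches,
// one batch per API request.
function toBatches(entries, batchSize = 150) {
  const batches = [];
  for (let i = 0; i < entries.length; i += batchSize) {
    batches.push(entries.slice(i, i + batchSize));
  }
  return batches;
}

// A 100-entry file at batch size 150 fits in a single request; reducing
// the batch size (e.g. after a quantity mismatch) increases the request count.
```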

Cerlancism commented 10 months ago

Also, a correction to this:

The gpt-3.5-turbo model can send a maximum of 4097 tokens per request.

The 4097 tokens include both input and output tokens, not just input: https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them

Depending on the model used, requests can use up to 4097 tokens shared between prompt and completion. If your prompt is 4000 tokens, your completion can be 97 tokens at most.
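The shared budget can be expressed as simple arithmetic (a sketch; `maxCompletionTokens` is a hypothetical name, and 4097 is the gpt-3.5-turbo context limit quoted above):

```javascript
// The context window covers prompt + completion together, so the
// completion headroom shrinks as the prompt grows.
const CONTEXT_WINDOW = 4097; // gpt-3.5-turbo context limit

function maxCompletionTokens(promptTokens, window = CONTEXT_WINDOW) {
  return Math.max(0, window - promptTokens);
}

// A 4000-token prompt leaves at most 97 tokens for the completion.
```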