⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains.
This occurred right after I opened a .txt file to edit a couple of things. I wasn't using Continue or attempting any tab completions; in fact, I didn't even have the Continue side panel open.
Looking at the 'Continue: LLM...' output, it appears that Continue sent the contents of the TXT file I was reading (excerpted below) over and over until Groq's rate limit stopped it from sending any more. I received five HTTP 429 error messages (see screenshot below) all at once, so my guess is that it was firing the requests back to back without pausing.
> "The content of my txt..."
> "...then it would repeat from the beginning..."
![Screenshot 2024-09-21 195727](https://github.com/user-attachments/assets/f1b13c70-c78d-478e-afcc-4ab7b4ba0df3)
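The log below shows Continue retrying through `withExponentialBackoff`, but the 429s mean requests were still being fired faster than Groq's free-tier limit of 30 requests per minute. As an illustrative sketch (not Continue's actual code), a client-side sliding-window limiter shows why exactly 5 out of 35 rapid-fire requests would come back as 429s:

```typescript
// Hypothetical sketch: a sliding-window request counter. With Groq's
// "Limit 30" per 60-second window, any burst beyond 30 requests gets
// rejected locally instead of being answered with HTTP 429 by the server.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  constructor(private limit: number, private windowMs: number) {}

  // Returns true if a request may be sent at time `now` (in ms).
  tryAcquire(now: number): boolean {
    // Drop timestamps that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => now - t < this.windowMs);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}

// 30 RPM, matching the error message ("Limit 30, Used 30, Requested 1").
const limiter = new SlidingWindowLimiter(30, 60_000);
let sent = 0;
for (let i = 0; i < 35; i++) {
  // 35 back-to-back attempts, 100 ms apart — like the repeated prompt sends.
  if (limiter.tryAcquire(i * 100)) sent++;
}
console.log(sent); // 30 — the other 5 attempts are refused, matching the 5 429s I saw
```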
### To reproduce
_No response_
### Log output
```Shell
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.969s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.969s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.287s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.287s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 399ms. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 399ms. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.042s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.042s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.006s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.006s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 526ms. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 526ms. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
```
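For what it's worth, each 429 body already carries a machine-readable retry hint ("Please try again in 1.969s", "... in 399ms"). A small sketch (a hypothetical helper, not part of Continue's API) of parsing that hint, so a retry loop could wait the suggested time instead of re-firing immediately:

```typescript
// Hypothetical helper: extract Groq's "Please try again in ..." hint from a
// 429 response body. Groq reports the delay in seconds ("1.969s") or
// milliseconds ("399ms"); normalize both to milliseconds.
function parseRetryDelayMs(body: string): number | null {
  const match = body.match(/try again in (\d+(?:\.\d+)?)(ms|s)\b/);
  if (!match) return null;
  const value = parseFloat(match[1]);
  return match[2] === "s" ? Math.round(value * 1000) : Math.round(value);
}

// Sample body, abbreviated from the log above.
const body429 =
  '{"error":{"message":"Rate limit reached ... Please try again in 1.969s. ...","code":"rate_limit_exceeded"}}';
console.log(parseRetryDelayMs(body429)); // 1969
console.log(parseRetryDelayMs("Please try again in 399ms.")); // 399
```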
The settings block from the repeated prompt, as captured in the Continue log (the stop-token list appears garbled in the capture and is reproduced as-is):

````Shell
==========================================================================
Settings:
contextLength: 131072
model: llama3.1-8b
maxTokens: 2048
temperature: 0.01
stop: ,,,,<|endoftext|>, ,, , ,/src/,#- coding: utf-8,```, function, class, module, export, import
raw: true
log: undefined
############################################
````