Open tuoyuan123 opened 1 year ago
We can certainly use a paid account to raise the API rate limits, but can the frequency of API calls be optimized? Could you also document how many API calls per second this project needs to run? Thank you for your efforts.
Another thought: we could add a sleep-and-retry technique. When we hit the ChatGPT rate limit, the process could pause and retry after 20 s.
@iam153
import time

import openai

def get_embedding(text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    time.sleep(20)  # crude throttle: the free tier allows 3 requests/min
    if not text:
        text = "this is blank"
    return openai.Embedding.create(
        input=[text], model=model)['data'][0]['embedding']
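Sleeping 20 s before every call is wasteful when most requests would succeed. A common alternative is to retry only on failure, with exponentially growing waits. Below is a minimal sketch; `call_with_backoff` is a hypothetical helper (not part of this repo), and the `sleep` parameter is injectable so the backoff can be exercised without actually waiting.

```python
import time

def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry fn() with exponential backoff instead of a fixed 20 s sleep.

    fn is any zero-argument callable that may raise on a rate-limit error.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the original error
            # Wait 1 s, 2 s, 4 s, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))

# Demo with a fake flaky call that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limit")
    return "ok"

waits = []
result = call_with_backoff(flaky, sleep=waits.append)
```

In the real code, `fn` would wrap the `openai.Embedding.create` call, so successful requests pay no delay at all.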
But when I run "run 2", it prints "TOKEN LIMIT EXCEEDED", and I don't know how to remove this limitation. A slightly slower speed is acceptable; I just hope to view the demo normally.
Me too. The first two prompt-template calls produced proper output, but the rest all return "TOKEN LIMIT EXCEEDED":
Enter option: run 1
GNS FUNCTION: <generate_wake_up_hour>
TOKEN LIMIT EXCEEDED
TOKEN LIMIT EXCEEDED
TOKEN LIMIT EXCEEDED
TOKEN LIMIT EXCEEDED
TOKEN LIMIT EXCEEDED
=== persona/prompt_template/v2/wake_up_hour_v1.txt
~~~ persona ---------------------------------------------------
Isabella Rodriguez
~~~ gpt_param ----------------------------------------------------
{'engine': 'text-davinci-002', 'max_tokens': 5, 'temperature': 0.8, 'top_p': 1, 'stream': False, 'frequency_penalty': 0, 'presence_penalty': 0, 'stop': ['\n']}
~~~ prompt_input ----------------------------------------------
[...]
~~~ prompt ----------------------------------------------------
...
~~~ output ----------------------------------------------------
8
=== END ==========================================================
GNS FUNCTION: <generate_first_daily_plan>
TOKEN LIMIT EXCEEDED
=== persona/prompt_template/v2/daily_planning_v6.txt
~~~ persona ---------------------------------------------------
Isabella Rodriguez
~~~ gpt_param ----------------------------------------------------
{'engine': 'text-davinci-003', 'max_tokens': 500, 'temperature': 1, 'top_p': 1, 'stream': False, 'frequency_penalty': 0, 'presence_penalty': 0, 'stop': None}
~~~ prompt_input ----------------------------------------------
[...]
~~~ prompt ----------------------------------------------------
...
~~~ output ----------------------------------------------------
['wake up and complete the morning routine at 8:00 am']
=== END ==========================================================
GNS FUNCTION: <generate_hourly_schedule>
TOKEN LIMIT EXCEEDED
=== persona/prompt_template/v2/generate_hourly_schedule_v2.txt
~~~ persona ---------------------------------------------------
Isabella Rodriguez
~~~ gpt_param ----------------------------------------------------
{'engine': 'text-davinci-003', 'max_tokens': 50, 'temperature': 0.5, 'top_p': 1, 'stream': False, 'frequency_penalty': 0, 'presence_penalty': 0, 'stop': ['\n']}
~~~ prompt_input ----------------------------------------------
[...]
~~~ prompt ----------------------------------------------------
Hourly schedule format:
...
~~~ output ----------------------------------------------------
TOKEN LIMIT EXCEEDED
=== END ==========================================================
"TOKEN LIMIT EXCEEDED" is the default text returned by the exception handler in gpt_structure.py (line 245), so some underlying error may have occurred.
@tuoyuan123 if you're able to create multiple Azure embedding instances, you could use LiteLLM's OpenAI-compatible proxy server to load-balance across them:
Step 1: Put your instances in a config.yaml
model_list:
  - model_name: zephyr-beta
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8001
  - model_name: zephyr-beta
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8002
  - model_name: zephyr-beta
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8003
Step 2: Install LiteLLM
$ pip install litellm
Step 3: Start litellm proxy w/ config.yaml
$ litellm --config /path/to/config.yaml
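Once the proxy is running, clients talk to it exactly like the OpenAI API, using the `model_name` from config.yaml; the proxy picks one of the three `api_base` backends per request. Below is a sketch of the request body; the proxy address is an assumption (check the address litellm prints on startup), and the POST itself is left as a comment so the example stays self-contained.

```python
import json

# Assumed proxy address -- litellm prints the actual one on startup.
PROXY_BASE = "http://0.0.0.0:8000"

payload = {
    # The model_name from config.yaml; LiteLLM load-balances this name
    # across the three api_base entries.
    "model": "zephyr-beta",
    "messages": [{"role": "user", "content": "Hello"}],
}
body = json.dumps(payload)

# POST `body` to f"{PROXY_BASE}/chat/completions" with any HTTP client, e.g.:
# requests.post(f"{PROXY_BASE}/chat/completions", data=body,
#               headers={"Content-Type": "application/json"})
```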
I'm having the same issue, all output I get is "TOKEN LIMIT EXCEEDED", anyone found a solution for this?
That's the exact same issue I am having as well.
When running the command 'run 1', too many calls to the OpenAI API resulted in access being denied. The error is as follows:
openai.error.RateLimitError: Rate limit reached for default-text-embedding-ada-002 in organization org-3AW0AYkBXyFwNdJ9WLlqK5Mo on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.
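The error message says the free tier allows 3 requests per minute, i.e. one request every 20 s. Rather than reacting to the error, a client-side limiter can space requests out so the limit is never hit. This is a sketch; `MinIntervalLimiter` is a hypothetical helper, and the clock/sleep functions are injectable so it can be demonstrated without real waiting.

```python
class MinIntervalLimiter:
    """Spaces calls at least `interval` seconds apart.

    A limit of 3 requests/min maps to interval=20.
    """
    def __init__(self, interval, now, sleep):
        self.interval = interval
        self.now = now      # clock function, e.g. time.monotonic
        self.sleep = sleep  # sleep function, e.g. time.sleep
        self.last = None    # timestamp of the previous call

    def wait(self):
        """Block (via self.sleep) until the next call is allowed."""
        t = self.now()
        if self.last is not None:
            remaining = self.interval - (t - self.last)
            if remaining > 0:
                self.sleep(remaining)
                t += remaining
        self.last = t

# Simulated clock: three back-to-back calls get spread 20 s apart.
clock = {"t": 0.0}
def fake_now():
    return clock["t"]
def fake_sleep(s):
    clock["t"] += s

limiter = MinIntervalLimiter(20, fake_now, fake_sleep)
times = []
for _ in range(3):
    limiter.wait()
    times.append(clock["t"])
```

In real use you would construct it with `time.monotonic` and `time.sleep`, and call `limiter.wait()` immediately before each API request.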