stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License

Backing off.. #176

Open eyast opened 8 months ago

eyast commented 8 months ago

Hi - I am trying to build a simple program and whenever I try to compile it, I receive this output:

0%|          | 0/20 [00:00<?, ?it/s]
Backing off 0.4 seconds after 1 tries calling function <function GPT3.request at 0x0000021EB37F6EE0> with kwargs {}
Backing off 0.3 seconds after 2 tries calling function <function GPT3.request at 0x0000021EB37F6EE0> with kwargs {}
Backing off 0.4 seconds after 3 tries calling function <function GPT3.request at 0x0000021EB37F6EE0> with kwargs {}
Backing off 0.3 seconds after 4 tries calling function <function GPT3.request at 0x0000021EB37F6EE0> with kwargs {}

The program is:

from datasets import load_dataset
import dspy
import openai
import os
from dspy.teleprompt import BootstrapFewShot

openai_key = os.getenv('OPENAI_API_KEY')

lm = dspy.OpenAI(model='gpt-4', api_key=openai_key)
dspy.settings.configure(lm=lm)

dataset = load_dataset("ccdv/arxiv-summarization")
num_dev = 20
dev_set = []
for i in range(num_dev, 2 * num_dev):
    article = dataset["train"][i]["article"]
    abstract = dataset["train"][i]["abstract"]
    example = dspy.Example(article=article, abstract=abstract).with_inputs("article")
    dev_set.append(example)

class BasicSummary(dspy.Signature):
    """Summarize a long paper into its abstract"""
    article = dspy.InputField(prefix="the paper")
    abstract = dspy.OutputField(desc="the abstract section of the paper")

class Summarizer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_summary = dspy.Predict(BasicSummary)

    def forward(self, article):
        pred = self.generate_summary(article=article)
        return dspy.Prediction(abstract=pred.abstract)

def make_summaries_100_words_long(example, pred, trace=None):
    # check that the predicted abstract is exactly 100 words long
    return len(pred.abstract.split()) == 100

teleprompter = BootstrapFewShot(metric=make_summaries_100_words_long)

compiled_summarizer = teleprompter.compile(Summarizer(), trainset=dev_set)

Any thoughts?

okhat commented 8 months ago

Yeah, the issue is that GPT-4 has a very strict rate limit.

You can ask OpenAI to raise your rate limit, or you can just ignore these warnings and let DSPy retry the requests until GPT-4 is done. Have you tried just letting it finish?

Alternatively, use GPT-3.5-turbo-instruct for development, and switch to GPT-4 at the end.
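For example, something along these lines should work (a sketch; exact model names depend on what your account can access):

# Sketch: iterate with a cheaper, higher-rate-limit model...
dev_lm = dspy.OpenAI(model='gpt-3.5-turbo-instruct', api_key=openai_key)
dspy.settings.configure(lm=dev_lm)
# ... develop and compile your program here ...

# ...then switch to GPT-4 for the final compile and evaluation.
final_lm = dspy.OpenAI(model='gpt-4', api_key=openai_key)
dspy.settings.configure(lm=final_lm)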

(In other words, I don't think this is a DSPy issue, it's just an issue with the rate limit. If you wait on it, it'll finish.)

eyast commented 8 months ago

Clear, thanks Omar! Maybe changing the message printed to the console to say that would help others? i.e., pass through to the user whatever the API returns (rate limit hit, HTTP code, or whatever). Thanks!

thomasahle commented 4 months ago

+1 for changing the message. I often get

Backing off 0.7 seconds after 1 tries calling function <function GPT3.request at 0x1087d51c0> with kwargs {}
Backing off 1.4 seconds after 2 tries calling function <function GPT3.request at 0x1087d51c0> with kwargs {}
Backing off 3.6 seconds after 3 tries calling function <function GPT3.request at 0x1087d51c0> with kwargs {}

And only when I get impatient and press Ctrl+C do I see

openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, you requested 8897 tokens (4801 in the messages, 4096 in the completion). Please reduce the length of the messages or completion.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

If I had known this was the issue, I would have killed it much earlier, since there's no hope of a retry fixing the error.

(For this particular problem of output length, there might be an actual solution: moving the generated output to the input and making a new request, like ChatGPT's "continue generation". But that's a much more complicated task. For now, just some message other than "Backing off" would be great!)
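Very roughly, that continuation idea might look like this (a hypothetical sketch, not something DSPy does today; lm is a configured dspy.OpenAI instance, and chat APIs make this messier in practice):

# Hypothetical sketch of "continue generation": feed the truncated
# completion back into the prompt and ask the model to keep going.
prompt = "Summarize this paper: ..."    # the original (long) request
partial = lm(prompt)[0]                 # completion cut off at max_tokens
continuation = lm(prompt + partial)[0]  # request more text from where it stopped
full_output = partial + continuation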

okhat commented 4 months ago

@thomasahle Yeah this seems like two different problems!

First, the backoff shouldn't happen on errors like this... It's unclear to me why BadRequestError causes a backoff. Maybe it's a recent addition, or maybe it was added when the API changed? I've never seen this with OpenAI. Is this maybe coming from vLLM?

Second, the backoff message should indeed be more descriptive.

okhat commented 4 months ago

https://github.com/stanfordnlp/dspy/blob/9338ab3eb9f44e188ca9d1cd65fe551b2206b1f3/dsp/modules/gpt3.py#L41C5-L41C17

This is where things happen. Interestingly, I don't see a value for the error message, but I'm sure it's accessible in some way.

mrshu commented 4 months ago

Just for posterity and future reference (so that someone can perhaps find this a bit faster): I also ran into the openai.BadRequestError -- in my case it was caused by Azure OpenAI's content management policy.

mrshu commented 4 months ago

Unclear to me why BadRequestError causes a backoff.

If I am reading this correctly

https://github.com/stanfordnlp/dspy/blob/9338ab3eb9f44e188ca9d1cd65fe551b2206b1f3/dsp/modules/gpt3.py#L142-L147

It seems that the backing off happens on every exception whose parent is in the ERRORS list (see this snippet in the backoff module's code). This seems to include the APIError in all cases:

https://github.com/stanfordnlp/dspy/blob/9338ab3eb9f44e188ca9d1cd65fe551b2206b1f3/dsp/modules/gpt3.py#L27-L38

As per the OpenAI docs,

All errors inherit from openai.APIError.

It hence seems that including only more specific errors in the ERRORS list could fix this issue. Could removing openai.APIError altogether perhaps be a good start?
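For instance, the list could be narrowed to errors that a retry can plausibly fix (a sketch; the exact exception classes depend on the installed openai version):

import openai

# Sketch: back off only on retryable errors, instead of the
# catch-all openai.APIError that every API error inherits from.
ERRORS = (
    openai.RateLimitError,       # retry once the rate-limit window passes
    openai.APIConnectionError,   # transient network problems
    openai.InternalServerError,  # transient 5xx responses
)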

mrshu commented 4 months ago

Interestingly, I don't see a value for the error message, but I'm sure it's accessible in some way

Indeed, as per the backoff docs:

In the case of the on_exception decorator, all on_backoff and on_giveup handlers are called from within the except block for the exception being handled. Therefore exception info is available to the handler functions via the python standard library, specifically sys.exc_info() or the traceback module. The exception is also available at the exception key in the details dict passed to the handlers. -- https://github.com/litl/backoff/tree/master?tab=readme-ov-file#event-handlers

If you have a specific idea on how to improve the error message, do let me know, I would be happy to try to put together a PR.
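For example, a handler along these lines would surface the underlying exception in the console (a sketch built on the details dict described above):

def backoff_hdlr(details):
    # Sketch: include the triggering exception in the message, so users
    # can tell a rate limit from, say, a context-length error.
    print(
        "Backing off {wait:0.1f} seconds after {tries} tries "
        "calling function {target} with exception: {exception}".format(**details)
    )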

mrshu commented 4 months ago

@okhat just a friendly ping, I'd be happy to put something together if it made sense to you 🙂

okhat commented 4 months ago

@mrshu I just pushed an improvement to the backoff behavior to PyPI. It will now only back off on RateLimitError.

okhat commented 4 months ago

But improving the message via a PR would be really helpful and appreciated, @mrshu!

okhat commented 4 months ago

Then we can close this issue.

hitchcock-william commented 2 months ago

@okhat I'm using the most recent version and I get a backoff for a wide variety of errors: context length too long, hitting the TPM limit, even straight up not being connected to the internet, lol. It makes debugging completely impossible. Not sure if there was a regression or something.

pragyan019 commented 2 months ago

Hi @okhat, I replicated the example below with the latest version of dspy and got the same backing-off error:

import datasets
ds = datasets.load_dataset("b-mc2/sql-create-context")

import dspy
demos = [dspy.Example(**d).with_inputs("context", "question") for d in ds['train']]
train, test = demos[:100], demos[100:200]

import dotenv, os
dotenv.load_dotenv(os.path.expanduser("~/.env"))  # load OpenAI API key from .env file
lm = dspy.OpenAI(model="gpt-3.5-turbo", max_tokens=4000)
dspy.settings.configure(lm=lm)

from dspy.evaluate import Evaluate
exact_match = lambda ex, pred, _trace=None: ex.answer == pred.sql_query

evaluator = Evaluate(
    devset=test,
    num_threads=30,
    metric=exact_match,
    display_progress=True,
)
program = dspy.TypedPredictor("context, question -> sql_query")
print(evaluator(program))

from dspy.teleprompt.bootstrap import BootstrapFewShot
compiled = BootstrapFewShot(
    metric=exact_match,
    max_rounds=1,
).compile(
    program,
    trainset=train,
)
print(evaluator(compiled))

vdemchenko3 commented 2 months ago

Hi, I am also getting the same backing-off output, with this error:

AuthenticationError: Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com/), or have expired.'}

I'm calling OpenAI via Azure like this:

gpt35 = dspy.AzureOpenAI(
    deployment_id=deployment_id,
    api_key=openai_api_base,
    api_base=openai_api_base,
    api_version=openai_version,
    model_type="chat",
    max_tokens=4000,
)

The key, endpoint, version, etc. all work in other workflows so I'm not sure what's going wrong here. Any ideas?

Catrovitch commented 1 month ago

For me, I got the same message:

Backing off 0.7 seconds after 1 tries calling function <function GPT3.request at 0x1087d51c0> with kwargs {}
Backing off 1.4 seconds after 2 tries calling function <function GPT3.request at 0x1087d51c0> with kwargs {}
Backing off 3.6 seconds after 3 tries calling function <function GPT3.request at 0x1087d51c0> with kwargs {}
...

This message didn't say much about what the underlying problem was. I worked around it by changing the max_time field from 1000 to 1 in the backoff.on_exception decorator:

@backoff.on_exception(
    backoff.expo,
    ERRORS,
    max_time=1,  # was 1000; fail fast so the real exception surfaces
    on_backoff=backoff_hdlr,
)

This way I was able to see the actual error message, which for me was:

"raise self._make_status_error_from_response(err.response) from None openai.BadRequestError: Error code: 400 - {'error': {'code': 'OperationNotSupported', 'message': 'The completion operation does not work with the specified model, gpt-35-turbo. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993.'}}"

It wasn't immediately clear from this either what the problem was, since trying different models in the model field for dspy.AzureOpenAI didn't change the error. To my understanding (correct me if I am wrong), the deployment_id identifies the GPT instance that has been deployed specifically for your organization's needs on Azure. This deployment has a specific model tied to it, which means the model field for dspy.AzureOpenAI is overridden by the model tied to the deployment identified by deployment_id?

In the end, my issue was resolved by switching model_type from "text" to "chat", since the model used in my deployment couldn't handle model_type="text":

lm = dspy.AzureOpenAI(
    api_base=f'https://{azure_resource}.openai.azure.com/',
    api_version='2023-12-01-preview',
    api_key=azure_api_key,
    model="gpt-3.5-turbo",  # no effect when changing this
    model_type="chat",      # SOLUTION FOR ME: change this to "chat" when using gpt-35-turbo
    deployment_id=azure_deployment_id,
)

Please let me know if my deduction is correct or if I am still missing something! :)

vdemchenko3 commented 1 month ago

Hi @Catrovitch, I'm still unable to get the Azure client to work, although with a different error:

AuthenticationError: Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com/), or have expired.'}

even though my keys work with different libraries.

I also don't see any change after editing the backoff decorator, and I keep getting multiple "backing off" messages past the first one. I even changed max_tries to 0 and still keep seeing errors.

Any advice is much appreciated!

Catrovitch commented 1 month ago

@vdemchenko3, how are you configuring it?

vdemchenko3 commented 1 month ago

Found the mistake while copy/pasting to show you... lol. Thanks for engaging and helping me see it!