ChrisTimperley / RepairChain

AIxCC: automated vulnerability repair via LLMs, search, and static analysis
Apache License 2.0

Manage backup LLMs if one is rate limited or otherwise failing. #42

Open clegoues opened 1 month ago

rubengmartins commented 1 month ago

I am catching some exceptions and retrying in certain cases. The list of errors is available at: https://platform.openai.com/docs/guides/error-codes/python-library-error-types

I think in the case of OpenAIError (I know I should avoid catching such a general exception) we should just exit (and I will change that accordingly). Otherwise, it can happen that we get stuck sending a lot of requests to the proxy server -- it happened once when we had a wrong-key problem.

So in the case of the rate limit, we are waiting 30 seconds and trying again. If we get that error, they will block us for 60 seconds, so we can wait longer. I am trying 5 times and then giving up. We could also change the model but that would require some changes that we probably don't want to do right now.
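For reference, a minimal sketch of that retry loop, assuming the `openai>=1.0` Python client (the real code goes through the competition proxy, so the actual call site will differ):

```python
import time

import openai

MAX_ATTEMPTS = 5
WAIT_SECONDS = 60  # rate-limited keys are blocked for 60s, so wait at least that long


def query_with_retry(client: openai.OpenAI, **request) -> str:
    """Retry a chat completion on rate limits, giving up after MAX_ATTEMPTS."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            response = client.chat.completions.create(**request)
            return response.choices[0].message.content
        except openai.RateLimitError:
            if attempt == MAX_ATTEMPTS - 1:
                raise  # give up after MAX_ATTEMPTS tries
            time.sleep(WAIT_SECONDS)
        except openai.OpenAIError:
            # any other API error (e.g. a bad key): abort immediately instead
            # of flooding the proxy server with requests that cannot succeed
            raise
```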

Our current solution is just to query multiple LLMs. We are currently querying:

We could also consider other models that others are less likely to use, such as oai-gpt-3.5-turbo (TPM: 80k), oai-gpt-4 (TPM: 20k), or oai-gpt-4-turbo (TPM: 60k). Note that oai-gpt-4o has a TPM of 300k and claude-3.5-sonnet a TPM of 80k. I would probably avoid 3.5-turbo since I think the quality of its results is worse, but we could consider oai-gpt-4-turbo as another LLM that we can query.

If you have:

```python
llm = LLM.from_settings(diagnosis.project.settings)
```

then you just need to do:

```python
llm.model = "oai-gpt-4-turbo"
```
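Building on that, a hypothetical sketch of the backup-model rotation this issue asks for; `llm.query` and the model ordering are stand-ins for whatever the actual LLM interface looks like:

```python
import openai

# Illustrative ordering only: primary model first, backups after.
FALLBACK_MODELS = ["oai-gpt-4o", "claude-3.5-sonnet", "oai-gpt-4-turbo"]


def query_with_backups(llm, prompt: str) -> str:
    """Try each model in turn, moving on whenever one is rate limited."""
    for model in FALLBACK_MODELS:
        llm.model = model
        try:
            return llm.query(prompt)  # hypothetical query method
        except openai.RateLimitError:
            continue  # this model is rate limited; try the next one
    raise RuntimeError("all models were rate limited")
```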

I can write a function to set the model and check that the string matches one of the acceptable ones, so that we avoid typos or setting a non-existent model.
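Something along these lines, with the allowlist itself being illustrative:

```python
# Illustrative allowlist; the real set would come from whatever the proxy accepts.
ACCEPTABLE_MODELS = frozenset({
    "oai-gpt-4o",
    "oai-gpt-4-turbo",
    "claude-3.5-sonnet",
})


def set_model(llm, model: str) -> None:
    """Set llm.model, rejecting typos and non-existent models."""
    if model not in ACCEPTABLE_MODELS:
        raise ValueError(
            f"unknown model {model!r}; expected one of {sorted(ACCEPTABLE_MODELS)}"
        )
    llm.model = model
```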