monarch-initiative / curate-gpt

LLM-driven curation assist tool (pre-alpha)
https://monarch-initiative.github.io/curate-gpt/

Bypass OpenAI server overload #11

Closed: iQuxLE closed this 5 months ago

iQuxLE commented 7 months ago

When loading ontologies into CurateGPT, the insertion of the data into ChromaDB is very often interrupted by a server overload on the OpenAI API side.

openai.error.ServiceUnavailableError: The server is overloaded or not ready yet.

Implementing an exponential_backoff_request helped me bypass this by retrying with a small additional sleep each time the request failed. It's not a fancy solution, but it gets the job done.
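
Roughly the kind of wrapper I mean (just a simplified sketch, assuming the pre-1.0 openai client; the function name and parameters here are illustrative, not the exact CurateGPT code):

```python
import time
import openai

def exponential_backoff_request(texts, max_retries=6, base_sleep=10):
    # Retry the embedding request with an increasing sleep whenever the
    # OpenAI server reports that it is overloaded.
    for attempt in range(max_retries):
        try:
            return openai.Embedding.create(
                model="text-embedding-ada-002",
                input=texts,
            )
        except openai.error.ServiceUnavailableError:
            time.sleep(base_sleep * (2 ** attempt))
    raise RuntimeError("embedding request still failing after retries")
```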

Additionally, while experimenting I found it helped to use less sleep (reducing it from 60 s to 10 s) once the document length exceeds 3,000,000 characters.

Also, a batch size of 1000 worked better for me than a batch size of 100. Batch sizes smaller than 100 mostly ended in this: requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
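
For the batching, something along these lines (again only a sketch; the collection here stands in for whatever ChromaDB collection the ontology is being loaded into, and it reuses the retry wrapper above):

```python
BATCH_SIZE = 1000  # 1000 worked better for me than 100

def insert_in_batches(collection, docs, ids):
    # Embed and insert the documents in chunks so one failed request
    # does not abort the whole load.
    for start in range(0, len(docs), BATCH_SIZE):
        batch_docs = docs[start:start + BATCH_SIZE]
        batch_ids = ids[start:start + BATCH_SIZE]
        resp = exponential_backoff_request(batch_docs)
        embeddings = [d["embedding"] for d in resp["data"]]
        collection.add(documents=batch_docs, embeddings=embeddings, ids=batch_ids)
```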

This was all fairly experimental and may also depend on other factors.

@cmungall @justaddcoffee

justaddcoffee commented 7 months ago

+1, I ran into the issue that @iQuxLE mentions above:

openai.error.ServiceUnavailableError: The server is overloaded or not ready yet.

A fix here would help me a lot, as I can't get make all to complete because of this issue.

cmungall commented 6 months ago

Hi Carlo @iQuxLE!

It looks like the .idea changes are still part of the diff. Ideally a PR has a single concern; see the Monarch/BBOP best practice doc: https://berkeleybop.github.io/best_practice

If it's too much of a hassle to change, we can merge and then delete later, but I prefer to keep the history clear.

justaddcoffee commented 6 months ago

@iQuxLE could you remove the .idea/ files per @cmungall's comment above?

Also on this ticket: @iQuxLE is now observing a different error from OpenAI when retrieving LLM embeddings, a 500 server error. @iQuxLE, possibly we could catch this error too when we are doing the exponential backoff.
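
Just to sketch what I mean (assuming the pre-1.0 openai client, where 500s surface as openai.error.APIError; names are only illustrative), the retry could catch both error types:

```python
import time
import openai

# Retry on both "server overloaded" and 500-style API errors.
RETRYABLE = (openai.error.ServiceUnavailableError, openai.error.APIError)

def request_with_retry(texts, max_retries=6, base_sleep=10):
    for attempt in range(max_retries):
        try:
            return openai.Embedding.create(
                model="text-embedding-ada-002",
                input=texts,
            )
        except RETRYABLE:
            time.sleep(base_sleep * (2 ** attempt))
    raise RuntimeError("embedding request still failing after retries")
```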

iQuxLE commented 5 months ago

Hi @justaddcoffee and @cmungall,

This turned out to be fairly complicated. I thought I had figured it out, but I think I broke something.