instructlab / eval

Python library for Evaluation
Apache License 2.0

`test_branch_gen_answers` does not fail when no model is being served #77

Closed khaledsulayman closed 2 weeks ago

khaledsulayman commented 1 month ago

In trying to work on the eval CI I noticed that the library doesn't raise any errors when there is no model being hosted at the requested port.

As I understand it, our current OpenAI error handling prints exceptions to stdout instead of failing because some API failures are expected to be temporary and resolve themselves while we keep retrying. However, we are currently catching openai.OpenAIError, which is more general than openai.APIConnectionError, the error we actually see in this case (no model being served).

I believe we can fix this one of two ways:

  1. keep the general except clause, but re-raise the error if it is an openai.APIConnectionError. We could do this either immediately or after the max retries are exhausted.
  2. if applicable, catch only the specific error that requires the retry behavior. If I remember correctly this is a rate-limiting issue, so I'd imagine there is a separate native openai exception type for that scenario, but I'd need to do more digging to confirm that's the only scenario in which we'd want this retry behavior.
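A minimal sketch of option 1 might look like the following. The stand-in exception classes below substitute for openai.OpenAIError and openai.APIConnectionError so the example runs without the openai package; the function name and retry count are hypothetical, not taken from the library's actual code.

```python
class OpenAIError(Exception):
    """Stand-in for openai.OpenAIError (the general base class)."""

class APIConnectionError(OpenAIError):
    """Stand-in for openai.APIConnectionError (server unreachable)."""

MAX_RETRIES = 3

def query_with_retries(send_request):
    """Call send_request(), retrying on transient errors but failing
    fast when the API is simply unreachable."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return send_request()
        except APIConnectionError:
            # No model is being served at the requested port: retrying
            # will not help, so surface the error to the caller.
            raise
        except OpenAIError as exc:
            # Other (possibly transient) failure: log and retry.
            print(f"attempt {attempt}/{MAX_RETRIES} failed: {exc}")
    raise RuntimeError(f"request still failing after {MAX_RETRIES} retries")
```

With this shape, an unreachable server fails the run immediately instead of being silently swallowed by the broad except clause.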
khaledsulayman commented 1 month ago

Here are the OpenAI library error types: https://help.openai.com/en/articles/6897213-openai-library-error-types-guidance

It seems like we may be able to catch APIError, Timeout, RateLimitError, etc. specifically to match the cases where we'd want to wait and retry, but it might be worth further discussion to decide which of these are actually relevant here.
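Option 2 could then be sketched as below: retry only on an explicit allowlist of error types and let everything else, including connection failures, propagate. The exception classes are again stand-ins for the openai types named above, and the function name and retry count are illustrative only.

```python
class APIError(Exception):
    """Stand-in for openai.APIError."""

class Timeout(Exception):
    """Stand-in for openai.Timeout."""

class RateLimitError(Exception):
    """Stand-in for openai.RateLimitError."""

# Only these error types trigger a wait-and-retry; anything else
# (e.g. a connection error) propagates to the caller unchanged.
RETRYABLE = (APIError, Timeout, RateLimitError)

def query_retry_specific(send_request, max_retries=3):
    for attempt in range(1, max_retries + 1):
        try:
            return send_request()
        except RETRYABLE as exc:
            print(f"retryable error on attempt {attempt}: {exc}")
    raise RuntimeError(f"gave up after {max_retries} retries")
```

The upside of this shape is that new, unexpected error types fail loudly by default rather than being retried; the cost is keeping the allowlist in sync with the openai library's exception hierarchy.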