Open hmellor opened 5 months ago
@loubnabnl, if you have time I'd appreciate a review, thanks!
Seems like there is an issue with chat format:

```
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "'messages' is a required property", 'type': 'invalid_request_error', 'param': None, 'code': None}}
0%| | 0/164 [00:02<?, ?it/s]
Task exception was never retrieved
future: <Task finished name='Task-4' coro=<tqdm_asyncio.gather.<locals>.wrap_awaitable() done, defined at /opt/homebrew/anaconda3/envs/7diamond/lib/python3.10/site-packages/tqdm/asyncio.py:75> exception=BadRequestError('Error code: 400 - {\'error\': {\'message\': "\'messages\' is a required property", \'type\': \'invalid_request_error\', \'param\': None, \'code\': None}}')>
Traceback (most recent call last):
File "/opt/homebrew/anaconda3/envs/env_name/lib/python3.10/site-packages/tqdm/asyncio.py", line 76, in wrap_awaitable
return i, await f
File "/opt/homebrew/anaconda3/envs/env_name/lib/python3.10/site-packages/openai/resources/completions.py", line 1020, in create
return await self._post(
File "/opt/homebrew/anaconda3/envs/env_name/lib/python3.10/site-packages/openai/_base_client.py", line 1705, in post
return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
File "/opt/homebrew/anaconda3/envs/env_name/lib/python3.10/site-packages/openai/_base_client.py", line 1408, in request
return await self._request(
File "/opt/homebrew/anaconda3/envs/7diamond/lib/python3.10/site-packages/openai/_base_client.py", line 1499, in _request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "'messages' is a required property", 'type': 'invalid_request_error', 'param': None, 'code': None}}
```
@tshrjn you're going to need to provide more context; the word `chat` doesn't feature in my PR at all. In the PR description I explicitly state that I am not using the chat endpoint, so I don't know what you did to get a chat error.
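For what it's worth, here is a minimal sketch of the difference between the two endpoints, using the `openai` Python client (the model name and `base_url` here are placeholders for illustration, not anything from this PR):

```python
from openai import OpenAI

# Placeholder URL and model name, for illustration only.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Completions endpoint (what this PR uses): takes a raw text prompt,
# so no chat templating happens behind the API.
completion = client.completions.create(
    model="my-served-model",
    prompt="def fizz_buzz(n):",
    max_tokens=64,
)

# Chat endpoint (NOT used by this PR): requires a `messages` list.
# Hitting a route that validates the chat schema without sending
# `messages` is one way to get a "'messages' is a required property" error.
chat = client.chat.completions.create(
    model="my-served-model",
    messages=[{"role": "user", "content": "Write fizz buzz in Python."}],
)
```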
I tested this branch and it worked perfectly fine. The only caveat is that it really only works with completion models (e.g. babbage or davinci at OpenAI) and not with chat models! But this is expected due to the format of the benchmark.
Solves #161 and #148 and is an alternative to #179.
Employs the DRY principle by only changing the creation of the `Evaluator` class in `main.py` and the `generation.parallel_generations` function. Therefore, we won't need to maintain multiple `Evaluator` classes in parallel.

Using the `completions` endpoint instead of `chat.completions` was a design choice because it eliminates errors/confusion from additional chat templating taking place behind the API.

If you want to evaluate a model running behind an OpenAI compatible API, then you can use `base_url` to send any generation requests to that URL, as sketched after the lists below.

If you are hosting the model yourself:

- Set `base_url` to the URL you are hosting with (i.e. `http://localhost:8000/v1`).
- Set `model` to the served name of your model.

If you want to use OpenAI's API:

- Set `OPENAI_API_KEY`.
- Set `base_url` to `https://api.openai.com/v1`.
- Set `model` to the name of the OpenAI model you want to use (e.g. `gpt-3.5-turbo-1106`).
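To make the two configurations above concrete, here is a minimal sketch of how they map onto the `openai` Python client (the model names and local URL are placeholders; this illustrates the settings, not the harness's exact wiring):

```python
import os
from openai import OpenAI

# Self-hosted, OpenAI-compatible server.
local_client = OpenAI(
    base_url="http://localhost:8000/v1",  # the URL you are hosting with
    api_key="EMPTY",                      # many local servers ignore the key
)

# OpenAI's own API.
openai_client = OpenAI(
    base_url="https://api.openai.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)

# Either client is used the same way, through the completions endpoint.
response = openai_client.completions.create(
    model="gpt-3.5-turbo-instruct",  # must be a completion model, per the caveat above
    prompt="def add(a, b):",
    max_tokens=32,
)
print(response.choices[0].text)
```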