Closed PrashantDixit0 closed 1 month ago
I encountered the same error. I would like to run an evaluation on only the computer security subject out of the 57 MMLU subjects, but the gpt-4o model is not supported yet.
!oaieval gpt-4o match_mmlu_computer_security
[2024-05-16 13:13:17,116] [registry.py:271] Loading registry from /Users/jaesik/ai/evals/evals/registry/evals
[2024-05-16 13:13:17,379] [registry.py:271] Loading registry from /Users/jaesik/.evals/evals
[2024-05-16 13:13:17,750] [oaieval.py:215] Run started: 240516041317M4NX4QQM
[2024-05-16 13:13:17,989] [data.py:94] Fetching /Users/jaesik/ai/evals/examples/../evals/registry/data/mmlu/computer_security/few_shot.jsonl
[2024-05-16 13:13:17,990] [data.py:94] Fetching /Users/jaesik/ai/evals/examples/../evals/registry/data/mmlu/computer_security/samples.jsonl
[2024-05-16 13:13:17,990] [eval.py:36] Evaluating 100 samples
[2024-05-16 13:13:18,002] [eval.py:144] Running in threaded mode with 10 threads!
0%| | 0/100 [00:01<?, ?it/s]
Traceback (most recent call last):
File "/Users/jaesik/miniconda3/bin/oaieval", line 8, in <module>
sys.exit(main())
^^^^^^
File "/Users/jaesik/ai/evals/evals/cli/oaieval.py", line 304, in main
run(args)
File "/Users/jaesik/ai/evals/evals/cli/oaieval.py", line 226, in run
result = eval.run(recorder)
^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/ai/evals/evals/elsuite/basic/match.py", line 60, in run
self.eval_all_samples(recorder, samples)
File "/Users/jaesik/ai/evals/evals/eval.py", line 146, in eval_all_samples
idx_and_result = list(tqdm(iter, total=len(work_items), disable=not show_progress))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/miniconda3/lib/python3.11/site-packages/tqdm/std.py", line 1178, in __iter__
for obj in iterable:
File "/Users/jaesik/miniconda3/lib/python3.11/multiprocessing/pool.py", line 873, in next
raise value
File "/Users/jaesik/miniconda3/lib/python3.11/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/ai/evals/evals/eval.py", line 137, in eval_sample
return idx, self.eval_sample(sample, rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/ai/evals/evals/elsuite/basic/match.py", line 46, in eval_sample
result = self.completion_fn(
^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/ai/evals/evals/completion_fns/openai.py", line 118, in __call__
result = openai_completion_create_retrying(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/ai/evals/evals/completion_fns/openai.py", line 32, in openai_completion_create_retrying
result = create_retrying(
^^^^^^^^^^^^^^^^
File "/Users/jaesik/miniconda3/lib/python3.11/site-packages/backoff/_sync.py", line 48, in retry
ret = target(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/ai/evals/evals/utils/api_utils.py", line 20, in create_retrying
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/miniconda3/lib/python3.11/site-packages/openai/_utils/_utils.py", line 277, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/miniconda3/lib/python3.11/site-packages/openai/resources/completions.py", line 528, in create
return self._post(
^^^^^^^^^^^
File "/Users/jaesik/miniconda3/lib/python3.11/site-packages/openai/_base_client.py", line 1240, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jaesik/miniconda3/lib/python3.11/site-packages/openai/_base_client.py", line 921, in request
return self._request(
^^^^^^^^^^^^^^
File "/Users/jaesik/miniconda3/lib/python3.11/site-packages/openai/_base_client.py", line 1020, in _request
raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: Error code: 404 - {'error': {'message': 'This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?', 'type': 'invalid_request_error', 'param': 'model', 'code': None}}
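The 404 occurs because the harness sends the request to the legacy `v1/completions` endpoint, while gpt-4o is only served by `v1/chat/completions`. The evals framework chooses the endpoint based on whether it recognizes the model name as a chat model, so a model missing from that list falls through to the legacy path. A minimal sketch of that dispatch logic (the set contents and function name here are illustrative, not the actual evals source):

```python
# Illustrative sketch of endpoint routing in an eval harness.
# The set below is an example only; the real list lives inside the
# evals registry and, at the time of this issue, did not include "gpt-4o".

KNOWN_CHAT_MODELS = {
    "gpt-3.5-turbo",
    "gpt-4",
    "gpt-4-turbo",
    # "gpt-4o",  # missing entry: the cause of the 404 above
}

def endpoint_for(model: str) -> str:
    """Return the API path a request for `model` would be routed to."""
    if model in KNOWN_CHAT_MODELS:
        return "v1/chat/completions"
    # Unrecognized models fall back to the legacy completions endpoint,
    # which rejects chat-only models such as gpt-4o with a 404.
    return "v1/completions"

print(endpoint_for("gpt-4o"))         # falls through to v1/completions
print(endpoint_for("gpt-3.5-turbo"))  # routed to v1/chat/completions
```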
I have created pull request #1530 to add support. The change is very simple, so you can apply it manually instead of waiting for it to be merged.
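If you want to patch your local checkout, the fix amounts to adding the new model name wherever the framework enumerates chat models, so gpt-4o gets routed to the chat endpoint. A sketch of that one-line edit (the set name is an assumption for illustration, not the literal evals source; see the PR for the actual change):

```python
# Illustrative one-line fix: extend the chat-model set so the harness
# sends gpt-4o requests to v1/chat/completions instead of v1/completions.
CHAT_MODELS = {
    "gpt-3.5-turbo",
    "gpt-4",
    "gpt-4-turbo",
    "gpt-4o",  # newly added entry
}
```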
Thank you @androettop for adding it :+1:
Describe the feature or improvement you're requesting
Please add support for GPT-4o for evaluation.
Additional context
No response