openai / evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Cannot pass the check and got this error: KeyError: 'sample' #1012

Open 14H034160212 opened 1 year ago

14H034160212 commented 1 year ago

Describe the bug

Hi,

I got the following error when the checks ran on my PR. Here is the link to my PR. Does anyone know what is happening here? The error does not seem to be caused by my yaml.

Processing evals/registry/evals/positive-binary-operations.yaml
Eval Name: positive-binary-operations
[2023-05-22 15:35:02,481] [registry.py:249] Loading registry from /home/runner/work/evals/evals/evals/registry/evals
[2023-05-22 15:35:02,623] [registry.py:249] Loading registry from /home/runner/.evals/evals
[2023-05-22 15:35:02,624] [oaieval.py:110] Run started: 230522153502ICZXYBZ5
[2023-05-22 15:35:02,624] [data.py:75] Fetching positive-binary-operations/fewshot.jsonl
[2023-05-22 15:35:02,625] [data.py:75] Fetching positive-binary-operations/samples.jsonl
[2023-05-22 15:35:02,678] [eval.py:34] Evaluating 10 samples
[2023-05-22 15:35:02,683] [eval.py:153] Running in threaded mode with 10 threads!

  0%|          | 0/10 [00:00<?, ?it/s]
  0%|          | 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.9.16/x64/bin/oaieval", line 8, in <module>
    sys.exit(main())
  File "/home/runner/work/evals/evals/evals/cli/oaieval.py", line 164, in main
    run(args)
  File "/home/runner/work/evals/evals/evals/cli/oaieval.py", line 141, in run
    result = eval.run(recorder)
  File "/home/runner/work/evals/evals/evals/elsuite/basic/match.py", line 53, in run
    self.eval_all_samples(recorder, samples)
  File "/home/runner/work/evals/evals/evals/eval.py", line 155, in eval_all_samples
    idx_and_result = list(tqdm(iter, total=len(work_items), disable=not show_progress))
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/multiprocessing/pool.py", line 870, in next
    raise value
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/runner/work/evals/evals/evals/eval.py", line 143, in worker_thread
    result = future.result(timeout=timeout)
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/runner/work/evals/evals/evals/eval.py", line 133, in eval_sample
    return idx, self.eval_sample(sample, rng)
  File "/home/runner/work/evals/evals/evals/elsuite/basic/match.py", line 36, in eval_sample
    prompt += s["sample"]
KeyError: 'sample'
Error: Process completed with exit code 1.
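
The last frame of the traceback fails on `s["sample"]` in `match.py`, so at least one record being iterated there has no `sample` key. A minimal sketch for checking a JSONL file locally follows. It assumes, which this thread does not confirm, that the records in question come from the `fewshot.jsonl` file fetched in the log; the helper name `find_records_missing_key` is hypothetical and not part of evals.

```python
import json
from pathlib import Path


def find_records_missing_key(jsonl_path, key="sample"):
    """Return the 1-based line numbers of JSONL records that lack `key`.

    Hypothetical helper for local debugging; it is not part of evals.
    """
    missing = []
    with Path(jsonl_path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            if key not in record:
                missing.append(line_no)
    return missing
```

Calling `find_records_missing_key("fewshot.jsonl")` on a local copy of the file (the path is an assumption) lists the lines to inspect. An empty list would mean the missing key comes from somewhere else.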

I also got a red circle on my yaml file. Does anyone know what it means? There is no explanation for that indicator. Thanks in advance for any advice and help.

To Reproduce

The error should reproduce when the checks on the PR are re-run.

Code snippets

No response

OS

Linux

Python version

Python 3.9

Library version

openai-evals 0.1.1

pan93412 commented 1 year ago

The red circle means that there is no trailing newline in the file.
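
A small sketch for checking and fixing this locally, assuming the yaml file is an ordinary text file on disk. The helper names `ends_with_newline` and `ensure_trailing_newline` are hypothetical.

```python
from pathlib import Path


def ends_with_newline(path):
    """Return True if the file's last byte is a newline (False for an empty file)."""
    return Path(path).read_bytes().endswith(b"\n")


def ensure_trailing_newline(path):
    """Append a newline if a non-empty file lacks one. Return True if the file changed."""
    p = Path(path)
    data = p.read_bytes()
    if data and not data.endswith(b"\n"):
        p.write_bytes(data + b"\n")
        return True
    return False
```

Running `ensure_trailing_newline` on the yaml file and committing the result should make the marker disappear from the diff view. Most editors can also be set to add the final newline on save.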