Open briviere opened 2 years ago
I tried running:
evaluate_functional_correctness ./data/example_samples.jsonl
Getting the following error:
  File "/Users/brianriviere/projects/human-eval/human_eval/evaluation.py", line 65, in evaluate_functional_correctness
    args = (problems[task_id], completion, timeout, completion_id[task_id])
KeyError: 'test/0'
Is there something I'm not doing correctly?
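What about passing the example problem file explicitly?
evaluate_functional_correctness data/example_samples.jsonl --problem_file=data/example_problem.jsonl
The example samples use the task ID test/0, which is not in the default problem set, hence the KeyError.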
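For anyone hitting the same KeyError, a quick way to confirm the mismatch is to compare the task IDs in the two files before running the evaluation. The snippet below is only a rough sketch, not part of human-eval: the helper names read_task_ids and missing_task_ids are made up for illustration, and it assumes both files are plain (uncompressed) JSONL with a task_id field on every line.

```python
import json


def read_task_ids(path):
    """Collect the task_id field from every non-empty line of a JSONL file."""
    ids = set()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                ids.add(json.loads(line)["task_id"])
    return ids


def missing_task_ids(sample_file, problem_file):
    """Return the sample task_ids that have no entry in the problem file.

    Hypothetical helper for illustration; a non-empty result means the
    evaluation would raise KeyError for those IDs.
    """
    return sorted(read_task_ids(sample_file) - read_task_ids(problem_file))
```

Calling missing_task_ids("data/example_samples.jsonl", "data/example_problem.jsonl") from the repository root lists any sample task IDs the problem file does not contain. If the list is non-empty, the samples and the problem file do not belong together.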
EOFError
thanks Henry, your reply is really helpful