openai / human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
MIT License · 2.42k stars · 345 forks
Issues (newest first)
#49 · I ran the test program, but why does my "example_samples.jsonl_results.jsonl" file show "all passed"? · CHfaithwy · opened 2 months ago · 0 comments
#48 · Error running evaluate_functional_correctness samples.json · nextdoorUncleLiu · opened 2 months ago · 0 comments
#47 · About how to use · RIKI0913 · opened 2 months ago · 0 comments
#46 · Fix k parameter handling · LlamaEnjoyer · closed 3 months ago · 0 comments
#45 · Evaluation doesn't work on Windows · peter-ch · opened 5 months ago · 4 comments
#44 · Tests in task 67 are impossible to satisfy · geajack · closed 6 months ago · 0 comments
#43 · Task 145 makes no sense · geajack · opened 6 months ago · 0 comments
#42 · Removed deprecated fields · Youniqueli · opened 7 months ago · 0 comments
#41 · Error in tests for HumanEval/163 · mono-jiarui · opened 8 months ago · 0 comments
#40 · Evaluations timing out · antonkarlsson1 · closed 9 months ago · 0 comments
#39 · When running the code generated by the model, an error occurs: failed: No module named 'scipy' · HangXue-lab · opened 9 months ago · 0 comments
#38 · fix: python3.10 upgrade and entry point change · ayushkalani · closed 9 months ago · 0 comments
#37 · HE vAL · Youniqueli · opened 9 months ago · 1 comment
#36 · Why use ThreadPoolExecutor with the GIL in the background? · johnmclain · opened 10 months ago · 1 comment
#35 · Bug in estimate_pass_at_k · sidaw · opened 10 months ago · 0 comments
#34 · Why does the phi model output the same result for all samples at a temperature of 0.8? · Mrzhang-dada · opened 11 months ago · 0 comments
#33 · Where to find the leaderboard? · zhimin-z · opened 1 year ago · 0 comments
#32 · I do not understand how to run human eval · teknium1 · opened 1 year ago · 1 comment
#30 · fix: Can't pickle local object 'check_correctness.<locals>.unsafe_execute' · Fazziekey · opened 1 year ago · 1 comment
#29 · Fix 3 · marcusm117 · closed 1 year ago · 0 comments
#28 · Fix 2 · marcusm117 · closed 1 year ago · 0 comments
#27 · AttributeError: Can't pickle local object 'check_correctness.<locals>.unsafe_execute' · tianzhaotju · opened 1 year ago · 6 comments
#26 · Make adjustments to the architecture · D0yi · closed 1 year ago · 1 comment
#25 · fix: reset `os.getcwd` system call after sandbox execution · yaohui-wyh · opened 1 year ago · 0 comments
#24 · File missing · shuaiwang2022 · opened 1 year ago · 1 comment
#23 · Fix Mistakes in the Dataset · marcusm117 · opened 1 year ago · 1 comment
#22 · Error in canonical solution 95 check_dict_case · PootieT · opened 1 year ago · 0 comments
#20 · Error in canonical solution and tests for HumanEval/163 · bmosaicml · opened 1 year ago · 1 comment
#19 · Codex Training Data · zachares · opened 1 year ago · 1 comment
#18 · evaluate_functional_correctness can't run · BoyuanJackChen · closed 6 months ago · 9 comments
#17 · Finetuning With HumanEval · MT010104 · opened 2 years ago · 0 comments
#16 · Why is pass@k = 1.0 when using "evaluate_functional_correctness data/example_samples.jsonl --problem_file=data/example_problem.jsonl"? · Smithol · opened 2 years ago · 3 comments
#15 · Why not allow contributions? · rafidka · opened 2 years ago · 0 comments
#14 · Install fails if entry point is not provided · gaarutyunov · opened 2 years ago · 0 comments
#13 · pass@k on filtered samples · henryhungle · opened 2 years ago · 0 comments
#12 · Entry point error while installing the package · Ali1858 · closed 2 years ago · 1 comment
#11 · Prompt used in APPS · henryhungle · closed 2 years ago · 2 comments
#10 · Evaluation.py failing on KeyError: 'test/0' · briviere · opened 2 years ago · 3 comments
#9 · Fix type signature of read_problems function · Linyxus · closed 2 years ago · 1 comment
#8 · Reproduce raw GPT-Neo 125M and 1.3B on this human-eval dataset · BitcoinNLPer · closed 3 years ago · 1 comment
#7 · Indentation error in execution file · mrinal18 · closed 3 years ago · 1 comment
#6 · Error in the prompt of HumanEval/47 · Kim-mins · opened 3 years ago · 1 comment
#5 · Question about generate_one_completion · xupeng1910 · opened 3 years ago · 1 comment
#4 · Problems with installation instructions · jonpincus · opened 3 years ago · 4 comments
#3 · execution.py bug report · rainmaker712 · closed 3 years ago · 1 comment
#2 · Add license file · qimingyuan · closed 3 years ago · 0 comments
#1 · Will this be helpful for people reading the paper? · jalotra · opened 3 years ago · 2 comments
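Several of the issues above (#35, #46, #16) concern the pass@k metric itself. As a reference point, the unbiased estimator described in the paper, pass@k = 1 - C(n-c, k) / C(n, k), can be sketched in a few lines of Python. This is an illustrative reimplementation, not the repository's exact code; the function name `estimate_pass_at_k` is borrowed from the issue titles, but the signature here is an assumption.

```python
import math


def estimate_pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn without replacement from n generations is correct, given
    that c of the n are correct. Equals 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so every k-subset
        # must contain at least one correct sample.
        return 1.0
    # Numerically stable product form of 1 - C(n-c, k) / C(n, k).
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))


# Example: 1 correct out of 2 samples, k = 1 -> 0.5
print(estimate_pass_at_k(2, 1, 1))
```

The product form avoids computing large binomial coefficients directly, which keeps the estimate stable for large n. Issue #16's symptom (pass@k = 1.0 everywhere) is what this estimator returns whenever every sampled completion passes, i.e. c = n.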