bigcode-project / octopack

🐙 OctoPack: Instruction Tuning Code Large Language Models
https://arxiv.org/abs/2308.07124
MIT License
431 stars, 27 forks

656 total bugs or 984 total bugs? #11

Open ahong007007 opened 1 year ago

ahong007007 commented 1 year ago

When I run the code repair task, the actual number of problems is 656, but the paper says:

“We manually add a bug to each of the 164 HumanEval solutions across all 6 languages (984 total bugs).”

May I ask which one is correct? Or does the number of problems increase with the iterations?

image

SivilTaram commented 1 year ago

@ahong007007 From your log, you selected only the task humanevalfixtests-python, which should contain 164 buggy problems. Happy to say more if you can give more context for the 656 number.

Muennighoff commented 1 year ago

As @SivilTaram said, the Python split has 164 bugs. As there are 6 languages, there are 164 * 6 = 984 in total.

The 656 you see is the number of iterations. It is calculated as 164 * n_samples / batch_size, i.e. 164 * 4 = 656.
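
For anyone who wants to check the two numbers, here is a minimal sketch of the arithmetic. It is not code from the evaluation harness. The function names are hypothetical, and `n_samples=20` with `batch_size=5` are assumed values chosen only because their ratio is 4, as in the comment above. Rounding up when `n_samples` is not a multiple of `batch_size` is also an assumption.

```python
import math

N_PROBLEMS = 164   # HumanEval problems per language (from the thread)
N_LANGUAGES = 6    # languages covered by the benchmark (from the thread)


def total_bugs(n_problems: int = N_PROBLEMS, n_languages: int = N_LANGUAGES) -> int:
    """Total number of buggy problems across all language splits."""
    return n_problems * n_languages


def n_iterations(n_problems: int, n_samples: int, batch_size: int) -> int:
    """Generation iterations for one language split.

    Assumption: a partial final batch still costs one full iteration,
    so the per-problem batch count is rounded up.
    """
    batches_per_problem = math.ceil(n_samples / batch_size)
    return n_problems * batches_per_problem


if __name__ == "__main__":
    print(total_bugs())              # 164 * 6 = 984
    print(n_iterations(164, 20, 5))  # 164 * 4 = 656
```

So 984 counts problems over all six languages, while 656 counts generation steps for the Python split alone; the two numbers measure different things.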