teknium1 opened this issue 4 months ago
I ran the instruct humaneval benchmark and the eval results.json shows this:
"eos": "<|endoftext|>",
whereas the actual EOS should be `<|im_end|>`. I don't really see why this is there.
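For reference, a quick way to double-check what the tokenizer itself reports (just a sketch; the checkpoint name is a placeholder, swap in whatever model you evaluated):

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; use whatever model you ran the benchmark with.
tok = AutoTokenizer.from_pretrained("your-org/your-chatml-model")

print(tok.eos_token)           # a ChatML-tuned model usually reports "<|im_end|>"
print(tok.special_tokens_map)  # shows whether "<|endoftext|>" is still registered
```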
I recommend just running humanevalsynthesize from humanevalpack, which offers the same and more. Instructions for running are here: https://github.com/bigcode-project/octopack?tab=readme-ov-file#run and here: https://github.com/bigcode-project/bigcode-evaluation-harness/blob/main/docs/README.md#humanevalpack
You may just need to add your instruction format here: https://github.com/bigcode-project/bigcode-evaluation-harness/blob/0f3e95f0806e78a4f432056cdb1be93604a51d69/bigcode_eval/tasks/humanevalpack.py#L235
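Roughly, the kind of branch you'd add for a ChatML-style format looks like this (a standalone sketch, not the harness's actual function signature):

```python
def chatml_prompt(instruction: str, context: str = "") -> str:
    """Wrap an instruction (plus optional code context) in ChatML turns
    and leave the assistant turn open for generation."""
    user_turn = f"{instruction}\n{context}".strip()
    return (
        f"<|im_start|>user\n{user_turn}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
```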
Hey all, what is the n_samples for instruct humaneval about?
The docs say 200 as if it's a static setting that should be used, but I can't understand why.
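My only guess is that it feeds the unbiased pass@k estimator from the Codex/HumanEval paper, where you generate n samples per problem and estimate pass@1/10/100 from how many pass, something like this sketch:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n = samples generated per problem,
    c = samples that pass the unit tests, k = the k in pass@k."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# With n_samples=200 you can still estimate pass@100; with n=20 you can't.
print(pass_at_k(200, 40, 100))
```

But that still doesn't tell me why 200 specifically rather than any other n >= 100.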
Also, when structuring the turns format, does this look right for ChatML?
--instruction_tokens "<|im_start|>user\n","<|im_end|>\n","<|im_start|>assistant\n"
Without quoting each string it gave an error, so I assume this is how to use this arg?
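For reference, this is how I assume those three pieces get wrapped around each instruction (just a sketch of my mental model, not the harness's actual code):

```python
def wrap_instruction(instruction: str,
                     user_token: str = "<|im_start|>user\n",
                     end_token: str = "<|im_end|>\n",
                     assistant_token: str = "<|im_start|>assistant\n") -> str:
    """Assumed assembly order: user prefix + instruction + end token + assistant prefix."""
    return f"{user_token}{instruction}{end_token}{assistant_token}"

print(wrap_instruction("Write a Python function that reverses a string."))
```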