Closed tedvuminhhuy closed 9 months ago
I'm going to guess that this is an issue with the default maximum sequence length. We've hardcoded 512 new tokens, which should be adequate for a completion model.
But, if you're using a chattier instruct model, you probably want to increase it.
I think the bigcode-evaluation-harness lets you set the max new tokens from the CLI.
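For reference, a sketch of what that invocation might look like. This assumes the harness's `--max_length_generation` flag (which, as I understand it, bounds the total sequence length, prompt included); the model name and the value 2048 are just illustrative, so check `python main.py --help` in your checkout for the exact flags:

```shell
# Hypothetical example: raise the generation budget for a chattier
# instruct model when running MultiPL-E TypeScript through the harness.
python main.py \
  --model codellama/CodeLlama-13b-Instruct-hf \
  --tasks multiple-ts \
  --max_length_generation 2048 \
  --allow_code_execution
```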
Thank you, @arjunguha!
Do you happen to know what max_tokens value Code Llama uses when benchmarking? I'm trying to compare my model with Code Llama, but their paper doesn't say anything about that.
I used the Code Llama - Instruct 13B model to generate code for the MultiPL-E TypeScript benchmark.
In some test cases, e.g. HumanEval_109_move_one_ball, the generated code does not end correctly, which causes a syntax error. I don't know whether this comes from the Code Llama model or from the MultiPL-E tool.
Can you please double-check? Other test cases, such as HumanEval_112_reverse_delete, show the same error.
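If it helps narrow things down: when a completion is cut off by the token limit, the tail of the generated function is usually left with unbalanced braces, which is exactly the kind of syntax error described above. A rough heuristic check for that (this `looks_truncated` helper is just an illustration I wrote, not part of MultiPL-E, and it ignores braces inside string literals):

```python
def looks_truncated(code: str) -> bool:
    """Heuristic: a generation cut off at the token limit often leaves
    unbalanced (, [, or { behind. Returns True if brackets don't balance."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in code:
        if ch in "([{":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack[-1] != pairs[ch]:
                return True  # mismatched closer: already broken
            stack.pop()
    return bool(stack)  # leftover openers: likely cut off mid-block

# A well-formed TypeScript completion vs. one that stops mid-expression.
complete = "function moveOneBall(arr: number[]): boolean {\n  return true;\n}\n"
truncated = "function moveOneBall(arr: number[]): boolean {\n  if (arr.length"
print(looks_truncated(complete))   # False
print(looks_truncated(truncated))  # True
```

Running something like this over the failing generations could tell you whether the broken endings correlate with hitting the 512-new-token cap, which would point at the length setting rather than the model or the tool.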
cc @arjunguha