Open geajack opened 5 months ago
wow, thanks. yeah, we should make a decision on how to fix this. I'm going to guess that this affects other typed languages too.
have you seen HumanEval+ btw? Does that address this?
No, I haven't looked into Eval+
The original Python problem barely makes sense in a typed language such as Rust:
https://github.com/nuprl/MultiPL-E/blob/main/datasets/originals/HumanEval_92_any_int.py
It's not clear to me if this should be fixed by changing the problem, removing the problem from MultiPL-E, or just left as something that fails.
I would read the problem as "the number is an integer" rather than "the type of variable is integer", i.e., I'd expect
assert candidate(3.0,4,7)==True
That, however, would mean the problem doesn't match the original HumanEval, so maybe it's better to just drop it.
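To make the two readings concrete, here is a rough sketch in Python. It assumes the problem asks whether one argument equals the sum of the other two and all three are integers; the function names `any_int_by_type` and `any_int_by_value` are made up for illustration and are not part of HumanEval or MultiPL-E.

```python
def _one_is_sum_of_others(x, y, z):
    # True if any one argument equals the sum of the other two.
    return x + y == z or x + z == y or y + z == x


def any_int_by_type(x, y, z):
    # Reading 1 (hypothetical name): every argument must have the Python type int.
    # A float such as 3.0 is rejected even though its value is a whole number.
    if not all(isinstance(v, int) for v in (x, y, z)):
        return False
    return _one_is_sum_of_others(x, y, z)


def any_int_by_value(x, y, z):
    # Reading 2 (hypothetical name): every argument must be integer-valued.
    # Here 3.0 is accepted and 2.5 is rejected.
    if not all(float(v).is_integer() for v in (x, y, z)):
        return False
    return _one_is_sum_of_others(x, y, z)


print(any_int_by_type(3.0, 4, 7))   # False
print(any_int_by_value(3.0, 4, 7))  # True
```

The two only disagree on inputs like `3.0`, and that distinction is exactly what gets lost if a translation to a typed language has to give all three parameters the same numeric type.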
The Rust version of HumanEval/092 contains the following lines:
(I think this is row 67 of the Hugging Face dataset for MultiPL-E, but I haven't checked)
This obviously makes the tests unsatisfiable. It seems like this was a type-casting issue when translating from Python; the original tests read: