Closed ganler closed 7 months ago
This pull request updates bigcode-evaluation-harness to use MBPP+ v0.2.0, which further drops several ill-formed tasks (e.g., tasks whose original test lists are wrong). The number of tasks decreases from 399 to 378.
cc: @loubnabnl