microsoft / CodeT

MIT License

CodeGen-16b MBPP pass@1 results #14

Closed henryhungle closed 5 months ago

henryhungle commented 1 year ago

Hi,

Thanks for the great work and for releasing the data and code.

I tried to replicate the results using the released generated code and test cases for CodeGen-16b on the MBPP benchmark. My pass@10 and pass@100 are close to your reported results, but pass@1 for CodeGen-16b alone is well below the reported number (with CodeT, my result matches the reported number).

| Model | pass@1 |
| --- | --- |
| CodeGen-16b (reported) | 42.4% |
| CodeGen-16b (replicated) | 31.32% |
| CodeGen-16b + CodeT (reported) | 49.5% |
| CodeGen-16b + CodeT (replicated) | 49.58% |

Did you use a different generation setting for pass@1 (e.g. a different temperature) than for pass@10/100? Or is there a typo in the reported number?

Boyu-Mi commented 10 months ago

I ran into this problem too. However, the authors mention in Appendix A that they used greedy search for pass@1 rather than sampling at temperature 0.8.
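
To make the difference concrete, here is a minimal sketch of the two ways pass@1 can be computed. It assumes the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), for the sampled case; whether the repository's evaluation script uses exactly this formula is an assumption. The per-problem counts and greedy outcomes are hypothetical placeholders, not MBPP results.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if k > n:
        raise ValueError("k must not exceed n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical data: 4 problems, n = 100 samples each at temperature 0.8.
n = 100
correct_counts = [100, 40, 5, 0]  # placeholder counts of passing samples per problem

# Sampled pass@1: for k = 1 the estimator reduces to c / n, so this is the
# average fraction of passing samples per problem.
sampled_pass1 = sum(pass_at_k(n, c, 1) for c in correct_counts) / len(correct_counts)

# Greedy pass@1: one deterministic completion per problem, so the score is
# simply the fraction of problems whose single completion passes.
greedy_passed = [True, True, False, False]  # placeholder outcomes
greedy_pass1 = sum(greedy_passed) / len(greedy_passed)

print(f"sampled pass@1: {sampled_pass1:.4f}")
print(f"greedy  pass@1: {greedy_pass1:.4f}")
```

Because sampled pass@1 averages over every sample, including the low-probability ones that a high temperature produces, it can differ noticeably from the greedy score on the same model, which would be consistent with the gap reported above.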