Open markict123 opened 7 months ago
Hi, thanks a lot for raising this issue! Regarding your first question, you are right: only the response part of the mask should be included in the denominator. Regarding your second question, we chose the best result across the different temperature configurations separately for each metric, i.e., pass@1, pass@5, and pass@10.
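To make the per-metric selection concrete, here is a minimal sketch of picking the best temperature independently for each metric. The function name `best_per_metric`, the dictionary layout, and the numbers are all hypothetical placeholders for illustration; they are not taken from the paper or the repository.

```python
def best_per_metric(results):
    """Pick the best score for each metric across temperatures.

    `results` maps temperature -> {metric name -> score}.
    Returns metric name -> (best temperature, best score).
    """
    best = {}
    for temperature, scores in results.items():
        for metric, score in scores.items():
            # Keep the highest score seen so far for this metric.
            if metric not in best or score > best[metric][1]:
                best[metric] = (temperature, score)
    return best


# Placeholder numbers, made up purely for illustration.
example_results = {
    0.2: {"pass@1": 0.40, "pass@5": 0.50},
    0.8: {"pass@1": 0.35, "pass@5": 0.60},
}
selected = best_per_metric(example_results)
```

With this scheme, the temperature behind the reported pass@1 can differ from the one behind pass@5 or pass@10, which is the distinction the second question asks about.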
Thanks for the nice work! I have two questions. The first is about length normalization when calculating the conditional log probability. According to the paper and common practice, the denominator should be the length of the response. However, in the code at https://github.com/hkust-zhiyao/RTL-Coder/blob/3394cce416fb0d70f76d81f809be5d0c32de0c55/train/mle_scoring.py#L199 the denominator seems to include the padding part. Could you please check it?
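For reference, a minimal sketch of the normalization described above, where the sum of token log-probabilities is divided by the number of response tokens only. This is a plain-Python illustration under the assumption that a 0/1 mask marks response positions; the name `length_normalized_logprob` and the inputs are hypothetical and do not reproduce the repository's code.

```python
def length_normalized_logprob(token_logprobs, response_mask):
    """Average token log-probabilities over response tokens only.

    `response_mask` holds 1 for response tokens and 0 for prompt
    or padding positions (an assumed layout, for illustration).
    """
    if len(token_logprobs) != len(response_mask):
        raise ValueError("token_logprobs and response_mask must have equal length")
    # Sum log-probabilities only where the mask marks a response token.
    total = sum(lp for lp, m in zip(token_logprobs, response_mask) if m)
    # The denominator counts response tokens, not the padded sequence length.
    num_response_tokens = sum(1 for m in response_mask if m)
    if num_response_tokens == 0:
        raise ValueError("response_mask selects no tokens")
    return total / num_response_tokens


# One prompt token, two response tokens, two padding positions.
score = length_normalized_logprob(
    [-1.0, -2.0, -3.0, 0.0, 0.0],
    [0, 1, 1, 0, 0],
)
```

Dividing by the full padded length (5 here) instead of the response length (2) would shrink the magnitude of the score for any sequence padded to a longer batch length, so scores would depend on the padding rather than on the response alone.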
My second question is about the proper way to report experimental results. The paper says, Do you mean choosing the best result under each temperature, or choosing the best temperature according to pass@1 or something similar? Thank you for your reply.