salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

Reproducing translation results using the released finetuned checkpoint #69

Closed yiqingxyq closed 1 year ago

yiqingxyq commented 1 year ago

Hi! I evaluated your finetuned checkpoint on java-cs translation but could not get exactly the same results as reported in your paper. I got 83.89/64.7 (BLEU/EM), whereas the paper reports 84.03/65.9. I read that you generate the results with beam search without sampling, which should not introduce any randomness, so I'm wondering where the difference comes from.

This is my output:

[image: screenshot of the evaluation output]

I downloaded the checkpoint from here: (and I used translate_java_cs_codet5_base.bin)

[image: screenshot of the checkpoint download location]

Thank you!

yuewang-cuhk commented 1 year ago

Hi @Veronicium

Thanks for pointing this out! Sorry, we have been busy for quite a while and only just found time to reproduce this issue.

Yes, your reproduced results are correct; the released checkpoint was probably selected incorrectly. We've replaced it with a newly finetuned checkpoint that gives BLEU = 84.32 and EM = 65.9.
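
For anyone re-checking the second number locally, here is a minimal sketch of an exact-match (EM) computation. It assumes EM is the percentage of predictions that equal their reference after stripping surrounding whitespace. That is a common definition but may not match the repository's evaluation script in every detail. The function name `exact_match` and the toy strings are hypothetical and not taken from the CodeT5 code.

```python
from typing import Sequence


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the percentage of predictions identical to their reference.

    Assumption: strings are compared after stripping leading/trailing
    whitespace only; no other normalization is applied.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("need at least one example")
    # Count pairs that match exactly once surrounding whitespace is removed.
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Toy data for illustration only, not real model output.
    preds = ["int x = 1;", "return y;", "foo();"]
    golds = ["int x = 1;", "return z;", "foo(); "]
    print(f"EM = {exact_match(preds, golds):.2f}")
```

Because beam search without sampling is deterministic for a fixed checkpoint and fixed inputs, a gap in EM like the one reported above usually points to a different checkpoint or different preprocessing rather than to run-to-run noise.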