Closed Moshiii closed 4 years ago
Hi @Moshiii, thank you for your interest in code2seq!
The 23.04 BLEU score is for the much smaller C# dataset, not for Java. In Java we measured only F1; in C# we measured only BLEU.
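To make the difference between the two metrics concrete, here is a minimal sketch of a subtoken-level precision/recall/F1 computation for a single predicted method name. It is an illustration only, not the code2seq evaluation code: the function name `subtoken_prf` and the multiset-matching rule are assumptions, and the repository's own scoring may differ in details such as how counts are accumulated over the whole test set.

```python
from collections import Counter


def subtoken_prf(predicted, reference):
    """Precision, recall and F1 between two lists of subtokens.

    Hypothetical helper for illustration; matches subtokens as multisets,
    ignoring their order.
    """
    pred_counts = Counter(predicted)
    ref_counts = Counter(reference)
    # Subtokens that appear in both, counted with multiplicity.
    true_pos = sum((pred_counts & ref_counts).values())
    false_pos = sum(pred_counts.values()) - true_pos
    false_neg = sum(ref_counts.values()) - true_pos

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Predicting "getName" for a method actually called "getFileName":
print(subtoken_prf(["get", "name"], ["get", "file", "name"]))
```

Unlike BLEU, this score ignores word order and n-gram overlap, so F1 numbers and BLEU numbers are not directly comparable, even on the same dataset.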
I hope this helps! Uri
Hi Urialon,
Thanks for replying! This solves the mystery!
Just to share my BLEU results on java-small and java-large:
The first, BLEU = 35.35, is from the java-small dataset; the second, BLEU = 30.38, is from the java-large dataset.
Hi, first of all, thanks for making your amazing work easy to reproduce.
I am reproducing the model and found that the BLEU score on the java-large test set is 30.38, way better than the 23.04 claimed in the paper. How do I reproduce the 23.04? Am I doing something wrong here?
I used common.compute_bleu and configured the Perl script: but I get a better score:
Any hints on this, please?
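For anyone comparing numbers the same way, below is a rough sketch of running a `multi-bleu.perl`-style script from Python and reading the score back. It is not the repository's `common.compute_bleu`; the names `run_multi_bleu` and `parse_bleu`, and all paths passed to them, are hypothetical. It assumes the script takes the reference file as an argument, reads predictions from stdin, and prints a line starting with `BLEU = <score>`.

```python
import re
import subprocess

# Matches the leading score in a line such as "BLEU = 30.38, ...".
BLEU_LINE = re.compile(r"BLEU = ([0-9]+(?:\.[0-9]+)?)")


def parse_bleu(output):
    """Return the BLEU score found in the script output, or None."""
    match = BLEU_LINE.search(output)
    return float(match.group(1)) if match else None


def run_multi_bleu(script_path, ref_path, pred_path):
    """Pipe the predictions file into the Perl script and parse the score.

    All three paths are supplied by the caller (assumed, not from the repo).
    """
    with open(pred_path) as pred_file:
        result = subprocess.run(
            ["perl", script_path, ref_path],
            stdin=pred_file,
            capture_output=True,
            text=True,
            check=True,
        )
    return parse_bleu(result.stdout)
```

When a score looks too good, it is worth checking that the reference file, the tokenization, and the dataset match the ones behind the number being compared against.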