Closed YHX-X closed 2 years ago
The reference file and the prediction file should be in the same format. As shown here, both are plain text, with each line containing one example (in this case, one method).
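For anyone unsure how to get files into that shape, here is a minimal sketch, not the repository's own tooling. It assumes you already have the reference and predicted methods as Python strings. The helper names (`to_single_line`, `write_examples`), the file names, and the sample methods are all made up for illustration.

```python
import os
import tempfile
from typing import List


def to_single_line(code: str) -> str:
    """Collapse every run of whitespace (including newlines) into one space."""
    return " ".join(code.split())


def write_examples(path: str, examples: List[str]) -> None:
    """Write one flattened example per line to a plain text file."""
    with open(path, "w", encoding="utf-8") as f:
        for code in examples:
            f.write(to_single_line(code) + "\n")


# Hypothetical data: line i of each file must describe the same example.
references = ["public int add(int a, int b) {\n    return a + b;\n}"]
predictions = ["public int add(int x, int y) {\n    return x + y;\n}"]

# Hypothetical output location and file names.
out_dir = tempfile.mkdtemp()
ref_path = os.path.join(out_dir, "reference.txt")
hyp_path = os.path.join(out_dir, "candidate.txt")

write_examples(ref_path, references)
write_examples(hyp_path, predictions)
```

One caveat with this approach: collapsing whitespace also changes whitespace inside string literals, and a `//` line comment would swallow the code that follows it once the newline is gone, so strip such comments first if your data contains them.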
Thank you for your reply. I have another question: I calculated the CodeBLEU of pairs of semantically similar code from BigCloneBench, but the CodeBLEU results were poor (e.g., 24.1% and 14.7%). The params I set were 0.1, 0.1, 0.4, 0.4. The syntax_match and dataflow_match scores were below 50%. Could you please tell me why I get such poor results?
I think the likely reason is this: CodeBLEU measures how similar model predictions are to the ground truth at the n-gram level, the syntax level, and the dataflow level. It is not meant to compare two different implementations of the same functionality. The latter is the clone detection task, which is what BigCloneBench targets. CodeBLEU cannot solve that problem, since understanding the semantics of code requires powerful neural networks.
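To make the arithmetic concrete, here is a small sketch of how the four components combine, following the weighted sum in the CodeBLEU paper (n-gram match, weighted n-gram match, syntax match, dataflow match). The function name `combine_codebleu` and the component values are hypothetical, chosen only to resemble two clones that share little surface text.

```python
def combine_codebleu(ngram, weighted_ngram, syntax, dataflow,
                     weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four CodeBLEU components, each in [0, 1]."""
    alpha, beta, gamma, delta = weights
    return (alpha * ngram + beta * weighted_ngram
            + gamma * syntax + delta * dataflow)


# Hypothetical component scores, not measured values.
score = combine_codebleu(
    ngram=0.05,
    weighted_ngram=0.06,
    syntax=0.35,
    dataflow=0.25,
    weights=(0.1, 0.1, 0.4, 0.4),  # the weights quoted in the question
)
print(f"CodeBLEU = {score:.3f}")
```

With these made-up inputs the total comes out around 0.25. Every component is a match against the other snippet's tokens, AST subtrees, or dataflow edges, so two implementations that are structured differently score low on all four, whatever weights are used.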
Hi, I want to calculate CodeBLEU. Could you please tell me how to generate the reference.txt and candidate.txt files? Thanks a lot.