I notice that the code search task for CodeBERT has many test files for each programming language (PL), and I am wondering how you obtained the reported results. For a given PL, do you evaluate the fine-tuned model on each test file separately and then average the MRR values across files to get the score for that PL?
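To make the question concrete, here is a minimal sketch of the two aggregation options I can think of. It is not taken from the CodeBERT scripts: the file names and ranks are made up, and `mrr` is a hypothetical helper that assumes each query has exactly one correct snippet with a 1-based rank.

```python
from statistics import mean


def mrr(ranks):
    """Mean reciprocal rank, given the 1-based rank of the correct snippet per query."""
    return mean(1.0 / r for r in ranks)


# Hypothetical ranks of the correct snippet, one list per test file of a single PL.
ranks_per_file = {
    "test_0": [1, 2, 4],
    "test_1": [1, 1, 2, 5],
}

# Option A: compute MRR on each test file, then average the per-file values.
per_file_mrr = {name: mrr(ranks) for name, ranks in ranks_per_file.items()}
macro_mrr = mean(per_file_mrr.values())

# Option B: pool the queries of all test files and compute a single MRR.
pooled_mrr = mrr([r for ranks in ranks_per_file.values() for r in ranks])

print(per_file_mrr, macro_mrr, pooled_mrr)
```

The two options only agree when every test file contains the same number of queries, so it would help to know which one was used for the reported numbers.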