Open GCVulnerability opened 3 months ago
Thanks for the question, we will release that soon!
Could you please give an approximate release time? Otherwise, we will consider implementing our own evaluation code. But this may lead to differences in our results.
Thanks for your work.
Also, may I ask if you calculate recall value based on the final generated patch and the ground truth patch? (without considering the code retrieved during the intermediate process before generating the patch)
> Could you please give an approximate release time?
Our hope is sometime this week or early next week
> calculate recall value
Not totally sure what you mean, can you please elaborate a bit more?
By recall value (as used in the SWE-bench paper), I meant "% Correct Location" in your paper. But after reading your paper carefully, I now think the two concepts are different.
In the SWE-bench paper, recall measures the performance of the RAG retrieval step. I am confused about the meaning of "% Correct Location": does it encourage more code changes (in order to cover the ground-truth patch)?
Right, so in our paper "% Correct Location" measures the percentage of the time the patch edits the same location as the ground-truth developer patch. We count a patch as editing the correct location if it edits a superset of all the ground-truth locations. For example, at function granularity, if a patch edits func1 and func2 but the ground-truth patch edits only func1, we still count it as correct. See Section 3 of the paper for more detail.
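For anyone who needs a stopgap before the official evaluation code is released, the superset rule described above can be sketched roughly as follows. This is only an illustration of the rule as stated in this thread, not the Agentless implementation: the function names, the dict-of-sets input format, and the choice to treat an empty ground-truth set as incorrect are all assumptions, and results may differ from the paper's numbers.

```python
from typing import Dict, Iterable, Set


def is_correct_location(predicted: Iterable[str], ground_truth: Iterable[str]) -> bool:
    """Hypothetical helper: True if the predicted edit locations cover
    every ground-truth location (i.e. predicted is a superset)."""
    pred, gt = set(predicted), set(ground_truth)
    # Assumption: an instance with no ground-truth locations is not counted as correct.
    return bool(gt) and gt.issubset(pred)


def percent_correct_location(
    predictions: Dict[str, Set[str]],
    ground_truths: Dict[str, Set[str]],
) -> float:
    """Hypothetical helper: percentage of instances whose patch edits a
    superset of the ground-truth locations. Keys are instance IDs; values
    are location names at one granularity (file, function, or line)."""
    if not ground_truths:
        return 0.0
    hits = sum(
        is_correct_location(predictions.get(instance_id, set()), gt)
        for instance_id, gt in ground_truths.items()
    )
    return 100.0 * hits / len(ground_truths)


if __name__ == "__main__":
    # Function granularity: editing func1 and func2 still counts when
    # the ground truth edits only func1.
    gts = {"a": {"func1"}, "b": {"func3"}}
    preds = {"a": {"func1", "func2"}, "b": {"func4"}}
    print(percent_correct_location(preds, gts))
```

Note that, unlike retrieval recall, this check is all-or-nothing per instance: partially covering the ground-truth locations scores the same as missing them entirely.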
Thanks for your explanation! I have got it.
Any updates on the evaluation of fault location accuracy?
Hi, Agentless is amazing work. I noticed that "% Correct Location" is mentioned in the paper, and I'm really interested in fault location on SWE-bench. Could you please provide the ground-truth locations for SWE-bench Lite and the evaluation code?