NL2Code / CodeR

Question about manager plans and related-issue-retrieval action #2

Closed yuntongzhang closed 5 months ago

yuntongzhang commented 5 months ago

Congratulations on the release and thank you for citing the AutoCodeRover paper!

The paper was a great read. After going through it, I have a couple of clarification questions:

  1. The Plan D part above Figure 3 mentions that "Plan D takes a test-driven approach with a ground truth test for issues (such as 'fail-to-pass' and 'pass-to-pass' tests in SWE-bench)." Does this mean the developer-written tests for the issue (i.e. the test_patch field in SWE-bench instances) were provided to CodeR?

  2. Section 2 mentions that "Action 18 retrieves the top-1 similar issue and its corresponding patch by description." (action 18 is related issue retrieval from Table 1). This is an interesting approach! I'm curious how you defined "similarity" between issues - was this using a RAG-based approach on the issue descriptions? Besides, how did you construct the corpus of issues to retrieve from?

Thank you very much in advance for your time and assistance!

NL2Code commented 5 months ago

Thank you for the questions!

  1. For the CodeR experiments, we did not use Plan B or Plan D. Note that Plan D is suited to real production deployments, where users could provide ground-truth test cases for the issues they raise. In our paper, we mentioned Plan B and Plan D only to illustrate further possibilities. We will clarify this in the next arXiv version.

  2. For RAG, we use the title plus description to retrieve similar issues by embedding similarity. We created and maintain an issue database (all issues crawled) for the 12 repositories involved in SWE-bench Lite. When returning the top-1 result, we filter out issues whose PR timestamps are later than the current issue. Unfortunately, we found that the retrieved issues did not help much in our results. In the future, it would be interesting to see how similar issues could help more. The major contributions of CodeR come from the multi-agent design, the task graph, and fault localization.
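For readers who want a concrete picture of the retrieval step described in point 2, here is a minimal sketch, not CodeR's actual code. The thread does not say which embedding model is used or exactly which PR timestamp is compared, so everything here is an assumption: `embed` is a toy bag-of-words stand-in for a real embedding model, and the `Issue` fields (`pr_timestamp`, `patch`, and so on) are hypothetical names. Issues with no linked PR are skipped, on the assumption that the action needs a patch to return.

```python
import math
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Issue:
    # Hypothetical record layout; the real database schema is not described in the thread.
    title: str
    description: str
    created_at: datetime
    patch: str = ""
    pr_timestamp: Optional[datetime] = None


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase token counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top1(query: Issue, corpus: List[Issue]) -> Optional[Issue]:
    # Embed title + description, as described in the reply above.
    q_vec = embed(query.title + " " + query.description)
    # Drop issues whose PR is later than the query issue (avoids leaking future fixes),
    # and (assumption) issues that have no PR at all.
    candidates = [
        c for c in corpus
        if c.pr_timestamp is not None and c.pr_timestamp <= query.created_at
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda c: cosine(q_vec, embed(c.title + " " + c.description)),
    )
```

The timestamp filter runs before ranking, so a near-duplicate issue that was fixed after the query issue was opened can never be returned, however similar it is.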

Thanks again!

yuntongzhang commented 5 months ago

Thank you so much for your detailed clarification! It would be very interesting to see how past issues can help in the future.