ASSERT-KTH / ITER

ITER: Iterative Neural Repair for Multi-Location Patches, ICSE 2024, http://arxiv.org/pdf/2304.12015

Bug list and selection criteria for the benchmark #2

Closed seongjoonh closed 2 months ago

seongjoonh commented 3 months ago

Hello, I'm a researcher in the APR field. Thank you for sharing this nice replication package! It has been really helpful for my research.

I am opening this issue to request the exact list of bugs and the selection criteria used in the experiment. It appears that the experiment (Table 2) was conducted on a subset of the 835 bugs from Defects4J 2.0 (e.g., the Gson and Collections projects were excluded, and only some bugs from the Closure project were selected as benchmarks). Could you please provide the list of the 476 bugs used as benchmarks and explain the selection criteria?

SophieHYe commented 2 months ago

Dear @seongjoonh,

Thanks a lot for your interest. 10 of the 17 Defects4J projects are considered for evaluation. The main reason is that Gzoltar fault localization cannot be executed on all projects due to version and dependency issues. For the Closure project, Gzoltar cannot successfully produce the ranking list for all bugs, so some bugs are not evaluated. You may find the scripts run_gzoltar_fl_Closure.sh for bug ids 1-90 and run_gzoltar_fl_Closure_Bug90+.sh for bug ids over 90.
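For readers reproducing this, the split by bug id can be sketched as a small dispatch helper. This is a sketch, not part of the replication package: the two script names come from the repository, but the argument convention and the `BUG_ID` variable are assumptions.

```shell
#!/bin/sh
# Pick the Gzoltar FL script for a given Closure bug id (hypothetical wrapper).
# Bugs 1-90 use run_gzoltar_fl_Closure.sh; bugs above 90 use the Bug90+ variant.
BUG_ID=${1:-95}

if [ "$BUG_ID" -le 90 ]; then
  SCRIPT=run_gzoltar_fl_Closure.sh
else
  SCRIPT=run_gzoltar_fl_Closure_Bug90+.sh
fi

echo "$SCRIPT"
# One would then run e.g.: bash "$SCRIPT" "$BUG_ID" (invocation style is an assumption)
```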

seongjoonh commented 2 months ago

@SophieHYe I appreciate your explanation. I wasn't aware of this issue with Closure. Could you please point me to the 10 projects you mentioned? Or, would you be willing to share the FL data from your experiment? This would greatly facilitate a fair comparison between techniques.

I'm closing the issue and thanks a lot again!