Publisher
ASE
Link to The Paper
https://dl.acm.org/doi/10.1145/2970276.2970359
Name of The Authors
Ming Wen, Rongxin Wu, Shing-Chi Cheung
Year of Publication
2016
Summary
Locus leverages software changes to enable more accurate and fine-grained bug localization. Specifically, it extracts the change logs and change hunks from commit information and uses them in place of segments of source code files. When a commit modifies a file at one or more places, each modified region is called a hunk (or delta).
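To make the notion of a hunk concrete, the following minimal sketch splits a unified diff into its hunks using Python's standard `difflib` module. It is an illustration only, not the tooling used by Locus: the helper name `split_into_hunks` and the sample file contents are hypothetical, and zero context lines are assumed so that each separate change becomes its own hunk.

```python
import difflib


def split_into_hunks(old_lines, new_lines, context=0):
    """Return the hunks of a unified diff, each as a list of diff lines.

    Hypothetical helper for illustration; Locus reads hunks from the
    version-control history rather than recomputing diffs like this.
    """
    hunks = []
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="", n=context)
    for line in diff:
        if line.startswith("@@"):
            # Every "@@ ... @@" header opens a new hunk.
            hunks.append([line])
        elif hunks:
            # Added, removed, or context line belonging to the current hunk.
            hunks[-1].append(line)
        # Lines seen before the first "@@" are the ---/+++ file headers.
    return hunks


# One file, modified at two separate places: two hunks.
old = ["a = 1", "b = 2", "c = 3", "d = 4", "e = 5"]
new = ["a = 1", "b = 20", "c = 3", "d = 4", "e = 50"]
hunks = split_into_hunks(old, new)
for hunk in hunks:
    print("\n".join(hunk))
    print()
```

With the default of zero context lines, the two edits stay in separate hunks; a larger `context` value can merge nearby changes into a single hunk.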
Contributions of The Paper
The paper performs bug localization using software commit logs and change histories. The authors hypothesize that change logs contain substantial information compared to source files, and they verify this with a Mann-Whitney U test, which shows that the cosine similarities obtained from change logs are higher than those obtained from source files.
As input, Locus takes a bug report and the commits made before the bug was reported.
Locus collects the code entities that appear in the changed code and, based on them, builds two types of corpus: NL (natural language) and CE (code entity).
The NL corpus is created from the summary and description of bug reports. The CE corpus is created from hunks and bug reports. Entity names are then extracted from the corpus using heuristics.
Similarity scores are calculated from the NL and CE corpora.
A boosting score is then calculated based on the bug-fixing commits (as in BRTracer). Finally, the NL, CE, and boosting scores are combined into a single score that is used to rank the buggy files. Three parameters (alpha, beta 1, and beta 2) are involved in this experiment.
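The scoring steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's implementation. The cosine similarity uses raw term counts. The camelCase regular expression stands in for the entity-extraction heuristics. The boosting score is passed in as a precomputed number. The weights `w_nl`, `w_ce`, and `w_boost` are hypothetical placeholders for a weighted sum and are not claimed to correspond to alpha, beta 1, and beta 2.

```python
import math
import re
from collections import Counter


def nl_tokens(text):
    """Lowercased word tokens: a stand-in for the natural-language (NL) view."""
    return re.findall(r"[a-z]+", text.lower())


def ce_tokens(text):
    """camelCase-style identifiers: an assumed heuristic for code entities (CE)."""
    return re.findall(r"\b[A-Za-z]+[a-z][A-Z][A-Za-z]*\b", text)


def cosine(a_tokens, b_tokens):
    """Cosine similarity between two bags of tokens (raw term counts)."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(bug_report, candidates, w_nl=1.0, w_ce=1.0, w_boost=0.5):
    """Rank candidates by an assumed weighted sum of NL, CE, and boosting scores.

    candidates: list of (candidate_id, text, boost) tuples, where boost is a
    precomputed boosting score (its derivation is outside this sketch).
    """
    scored = []
    for candidate_id, text, boost in candidates:
        nl = cosine(nl_tokens(bug_report), nl_tokens(text))
        ce = cosine(ce_tokens(bug_report), ce_tokens(text))
        scored.append((w_nl * nl + w_ce * ce + w_boost * boost, candidate_id))
    return [candidate_id for _, candidate_id in sorted(scored, reverse=True)]


report = "NullPointerException when parseConfig reads an empty file"
candidates = [
    ("h1", "fix parseConfig to handle empty file input", 0.2),
    ("h2", "update README wording", 0.9),
]
ranking = rank_candidates(report, candidates)
print(ranking)
```

In this toy input, the first candidate shares both words and the `parseConfig` identifier with the bug report, so it ranks first even though the second candidate has the larger boosting score.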