Paper Review: Version History, Similar Report, and Structure: Putting Them Together for Improved Bug Localization

Publisher

ICPC

Link to The Paper

https://dl.acm.org/doi/abs/10.1145/2597008.2597148

Name of The Authors

Shaowei Wang, David Lo

Year of Publication

2014

Summary

AmaLgam integrates version history, similar bug reports, and source code structure for bug localization. The paper suggests that the change history of source code files provides vital information for locating bugs.

Contributions of The Paper

Key contributions:

Bug report: Each report provides a bug ID (which can be used as a reference number to identify the commits in the version control system that fix it), the date the report was submitted, a summary, and a description.

Preprocessing: Punctuation removal, tokenization, identifier splitting based on camel case, stop word removal, and lastly stemming. Source code is converted into an Abstract Syntax Tree (AST) before the identifiers are extracted.

Version history: Uses Google's bug prediction formula, which takes the effect of change bursts into consideration. In general, source files that were modified recently or frequently are more suspicious with respect to newly reported bugs (bug prediction technique). Only recent version control history is used to compute the probability, instead of the complete history (threshold k = 15 days here).

Score combination: Assigns weights that govern the contribution of the probability of a file being buggy (computed by the bug prediction technique) and the similarity score of a bug report to a file (computed by integrating BugLocator and BLUiR).

Future work: Integrating other bug prediction techniques, trying different ways to combine the three scores, and using PCA to analyze which component contributes most to the final score.
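The camel case splitting, the recency-weighted history score, and the weighted combination noted above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The logistic form used here is the one commonly cited for Google's bug prediction score and is an assumption, since these notes do not record the formula. The function names, the `weight` value, and the example file names and scores are all hypothetical.

```python
import math
import re


def split_identifier(identifier):
    """Split a camelCase identifier into lowercase tokens (illustrative regex)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def history_score(commit_ages_days, k=15):
    """Score a file from the ages (in days) of the commits that touched it.

    Only commits within the last k days count. Each one contributes a
    logistic term that is largest for the most recent commits (assumed form).
    """
    score = 0.0
    for age in commit_ages_days:
        if 0 <= age <= k:
            recency = 1.0 - age / k  # 1.0 = committed today, 0.0 = k days ago
            score += 1.0 / (1.0 + math.exp(-12.0 * recency + 12.0))
    return score


def combine_scores(similarity, history, weight=0.3):
    """Weighted sum of a similarity score and a history score.

    The weight is a hypothetical value; the notes only say that weights
    govern each component's contribution.
    """
    return (1.0 - weight) * similarity + weight * history


# Hypothetical example: rank two files for a newly submitted bug report.
candidates = {
    "Parser.java": {"similarity": 0.6, "commit_ages": [1, 3]},
    "Lexer.java": {"similarity": 0.7, "commit_ages": [40]},
}
ranking = sorted(
    candidates,
    key=lambda f: combine_scores(
        candidates[f]["similarity"],
        history_score(candidates[f]["commit_ages"]),
    ),
    reverse=True,
)
```

In this sketch a file touched several times within the last k days gains history score quickly, which is one way to reflect the intuition that recently or frequently modified files are more suspicious.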

Comments

Dataset: 3000 bug reports
