RAISEDAL / RAISEReadingList

This repository contains a reading list of Software Engineering papers and articles!

Paper Review: On the Relationship between Bug Reports and Queries for Text Retrieval-based Bug Localization #67

Open usmimukherjee opened 1 year ago

Publisher

EMSE

Link to The Paper

https://link.springer.com/article/10.1007/s10664-020-09823-w

Name of The Authors

Chris Mills, Esteban Parra, Jevgenija Pantiuchina, Gabriele Bavota, Sonia Haiduc

Year of Publication

2020

Summary

Text Retrieval (TR) is one of the most common approaches to bug localization: the system's source code is indexed into a search space, which is then queried for code relevant to a bug report. Many studies have shown, however, that TR-based approaches are often evaluated without sufficient controls for biases that artificially inflate the results: misclassified bugs, tangled commits, and localization hints. In this paper, the authors argue that contemporary evaluations of TR also contain a negative bias that outweighs the positive ones: TR approaches expect a natural-language query, yet most evaluations simply use the full text of the bug report as the query. The study shows that high-performing queries can be extracted from the bug report text, making TR effective. The authors further analyze the provenance of the terms in these high-performing queries, with the aim of supporting automatic query extraction from bug reports.
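To make the "full bug report versus trimmed query" point concrete, here is a minimal sketch of TR-based localization. It is an illustration only, not the paper's setup: the three-file corpus, the bug report text, and the use of plain TF-IDF with cosine similarity are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and keep alphabetic tokens."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Return TF-IDF vectors per document plus the IDF table."""
    counts = {name: Counter(tokenize(body)) for name, body in documents.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    idf = {t: math.log(len(counts) / df) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in c.items()} for name, c in counts.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, vectors, idf):
    """Return document names ordered from most to least similar to the query."""
    q_counts = Counter(tokenize(query))
    q_vec = {t: tf * idf[t] for t, tf in q_counts.items() if t in idf}
    return sorted(vectors, key=lambda n: cosine(q_vec, vectors[n]), reverse=True)


# Hypothetical corpus: assume the bug lives in InvoicePrinter.java.
corpus = {
    "LoginController.java": "class LoginController { void handleLogin(User user) { session.start(user); } }",
    "CartService.java": "class CartService { void addItem(Item item) { cart.add(item); total.update(); } }",
    "InvoicePrinter.java": "class InvoicePrinter { void printInvoice(Order order) { page.render(order.total); } }",
}
bug_report = (
    "After login the user adds an item to the cart "
    "but the printed invoice shows the wrong total"
)

vectors, idf = build_index(corpus)
full_ranking = rank(bug_report, vectors, idf)          # whole report as query
trimmed_ranking = rank("invoice total", vectors, idf)  # two words from the report
print("full report:", full_ranking)
print("trimmed query:", trimmed_ranking)
```

In this toy corpus the full report is dominated by terms that describe the steps leading up to the failure (login, cart, item), so the file assumed to be buggy is not ranked first, whereas the two-word query taken from the same report puts it at the top.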

Contributions of The Paper

The paper presents an empirical study that provides evidence of the true potential of TR approaches and of the significant impact that query optimization has on their effectiveness. The authors optimize the query using only words present in the bug report, even when localization hints such as program entity names, test cases, and stack traces are absent. They find the most effective query that can be extracted from the bug report and evaluate performance on that query rather than on the default query composed of the entire bug title and description. To obtain this optimal query, they devised a genetic algorithm that searches the bug report vocabulary using a priori knowledge of the ground truth. Their key contributions are:
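The idea of searching a bug report's vocabulary for a better query, given the ground truth, can be sketched with a small genetic algorithm. Everything below is an assumption made for illustration and not the authors' configuration: the bit-mask encoding, the Jaccard-overlap retrieval stand-in, the reciprocal-rank fitness, the GA parameters, and the toy data.

```python
import random


def rank_of_target(query_terms, documents, target):
    """Rank (1 = best) of `target`; ties are counted against the target."""
    def score(doc_terms):
        union = query_terms | doc_terms
        return len(query_terms & doc_terms) / len(union) if union else 0.0

    target_score = score(documents[target])
    better_or_equal = sum(
        1 for name, terms in documents.items()
        if name != target and score(terms) >= target_score
    )
    return 1 + better_or_equal


def evolve_query(vocabulary, documents, target,
                 generations=30, pop_size=20, mutation_rate=0.1, seed=0):
    """Search subsets of `vocabulary` for a query that ranks `target` highly."""
    rng = random.Random(seed)
    vocab = sorted(vocabulary)

    def decode(mask):
        return {term for term, keep in zip(vocab, mask) if keep}

    def fitness(mask):
        # Reciprocal rank of the known buggy file (the a priori ground truth).
        return 1.0 / rank_of_target(decode(mask), documents, target)

    # Seed with the default query (every term kept) plus random subsets.
    population = [[1] * len(vocab)]
    population += [[rng.randint(0, 1) for _ in vocab] for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = population[:2]  # elitism: the best queries always survive
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, len(vocab))  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen

    best = max(population, key=fitness)
    return decode(best), fitness(best)


# Hypothetical data: each file is reduced to a set of terms.
documents = {
    "LoginController": {"login", "user", "session", "start", "handle"},
    "CartService": {"cart", "item", "add", "total", "update"},
    "InvoicePrinter": {"invoice", "print", "order", "page", "render", "total"},
}
bug_vocab = {"login", "user", "item", "cart", "invoice", "total", "wrong", "shows"}

full_fit = 1.0 / rank_of_target(bug_vocab, documents, "InvoicePrinter")
best_terms, best_fit = evolve_query(bug_vocab, documents, "InvoicePrinter")
print("full report fitness:", full_fit)
print("evolved query:", sorted(best_terms), "fitness:", best_fit)
```

Because the full-vocabulary query is part of the initial population and the two fittest individuals are carried over each generation, the evolved query can never score worse than the default query in this sketch. Note that the fitness needs the known buggy file, so a search like this measures what is achievable rather than serving as a practical localization tool.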

Their Key Findings:

Comments

No response