Paper Review: On the Relationship between Bug Reports and Queries for Text Retrieval-based Bug Localization

Publisher

EMSE

Link to The Paper

https://link.springer.com/article/10.1007/s10664-020-09823-w

Name of The Authors

Chris Mills, Esteban Parra, Jevgenija Pantiuchina, Gabriele Bavota, Sonia Haiduc

Year of Publication

2020

Summary

Text retrieval (TR) is one of the most common approaches to bug localization. The system's code is indexed into a search space, which is then queried for code relevant to a bug report. Many studies have shown, however, that evaluations of TR-based approaches lack sufficient controls for biases that artificially inflate the results: misclassified bugs, tangled commits, and localization hints. In this paper, the authors argue that contemporary evaluations of TR also include a negative bias that outweighs the positive bias: although TR approaches expect a natural-language query, most evaluations simply use the full text of the bug report as the query. The study shows that high-performing queries can be extracted from the bug report text, which makes TR effective. The authors further analyze the provenance of the terms in these high-performing queries, which can help with automatic query extraction from bug reports.
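
The indexing-and-querying setup described above can be sketched in a few lines. This is only an illustration, not the paper's setup: the file names, file contents, bug report text, and the simple TF-IDF scoring are all assumptions made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_files(corpus, query):
    """Rank file names by a simple TF-IDF score against the query (best first)."""
    docs = {name: Counter(tokenize(body)) for name, body in corpus.items()}
    n_docs = len(docs)
    # Document frequency: in how many files each term occurs.
    df = Counter(term for counts in docs.values() for term in counts)
    query_terms = set(tokenize(query))
    scores = {}
    for name, counts in docs.items():
        scores[name] = sum(
            counts[term] * math.log(1 + n_docs / df[term])
            for term in query_terms
            if term in counts
        )
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical three-file "code base"; SessionCache.java is the buggy file.
corpus = {
    "LoginController.java": "login user password authenticate session",
    "ReportExporter.java": "export report pdf file write",
    "SessionCache.java": "session cache expire timeout evict",
}

# Hypothetical bug report used verbatim as the query (the default in most evaluations).
full_report = (
    "When I export a report the session file is lost "
    "after timeout and the session expires"
)

# A shorter query built only from words that appear in the report.
reduced_query = "session timeout"

full_ranking = rank_files(corpus, full_report)
reduced_ranking = rank_files(corpus, reduced_query)
```

In this toy setting the full report ranks `ReportExporter.java` first because of the incidental words "export", "report", and "file", while the reduced query ranks the buggy `SessionCache.java` first. That is the kind of negative bias the authors attribute to full-text queries.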

Contributions of The Paper

In this paper, the authors present an empirical study that provides evidence of the true potential of TR approaches and of the significant impact that query optimization has on their effectiveness. They optimize the query using only the words present in the bug report, even when localization hints such as program entity names, test cases, and stack traces are absent. They find the most effective query that can be extracted from the bug report and evaluate performance on that query rather than on the default query composed of the entire bug title and description. To do so, they devise a genetic algorithm that derives a near-optimal query from the bug report vocabulary, given a priori knowledge of the ground truth. Their key contributions are:

They investigated a single fitness function, along with average precision and recall, to drive their genetic algorithm toward near-optimal queries; the fitness function represents the query's ability to find all the relevant documents.
They investigated how much a particular search engine impacts the results by using two different versions of the Lucene search engine. They employed different ranking functions to measure the impact on the results and on the possibility of formulating high-quality queries.
They studied how near-optimal queries are derived from a bug report and provide insights for TR query reformulation techniques.
They increased the generalizability of the results by expanding the set of queries in their dataset.
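
The idea of searching the bug report vocabulary for a near-optimal query, with the ground truth known in advance, can be sketched as a small genetic algorithm. This is a hedged illustration only: the review does not give the paper's GA configuration, so the operators, population size, mutation rate, toy overlap ranker, and toy data below are all assumptions. Average precision is used as the fitness, and the population is seeded with the full-text query (also an assumption) so the result is never worse than that baseline.

```python
import random


def average_precision(ranking, relevant):
    """Mean of precision@k over the positions k where a relevant file appears."""
    hits, total = 0, 0.0
    for position, name in enumerate(ranking, start=1):
        if name in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def rank_files(corpus, query_terms):
    """Toy ranker (assumption): score = number of query terms found in the file."""
    scores = {
        name: sum(term in body.split() for term in query_terms)
        for name, body in corpus.items()
    }
    # Ties are broken alphabetically by file name.
    return sorted(scores, key=lambda name: (-scores[name], name))


def evolve_query(terms, corpus, relevant, generations=30, pop_size=20,
                 mutation_rate=0.1, seed=0):
    """Evolve a keep/drop mask over `terms` that maximizes average precision."""
    rng = random.Random(seed)

    def fitness(mask):
        query = [t for t, keep in zip(terms, mask) if keep]
        if not query:
            return 0.0
        return average_precision(rank_files(corpus, query), relevant)

    # First individual keeps every term (the full-text baseline); the rest are random.
    population = [[True] * len(terms)]
    population += [[rng.random() < 0.5 for _ in terms] for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitism: best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(terms))  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with probability `mutation_rate`.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children

    best = max(population, key=fitness)
    return [t for t, keep in zip(terms, best) if keep], fitness(best)


# Hypothetical data: vocabulary of one bug report and a three-file code base.
corpus = {
    "LoginController.java": "login user password authenticate session",
    "ReportExporter.java": "export report pdf file write",
    "SessionCache.java": "session cache expire timeout evict",
}
terms = ["export", "report", "session", "file", "lost", "timeout", "expire", "the"]
relevant = {"SessionCache.java"}  # ground truth, known a priori

baseline = average_precision(rank_files(corpus, terms), relevant)
best_query, best_score = evolve_query(terms, corpus, relevant)
```

Because the best half of each generation is carried over unchanged, `best_score` can never drop below `baseline`, the average precision of the full-text query. A real study would replace the toy ranker with an actual search engine such as Lucene.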

Their Key Findings:

Evaluations should focus more on formulating queries than on merely using the bug report title, the description, or a concatenation of the two.
Bug localization performs poorly when the full text of a bug report is used as the query, especially when localization hints are not present.
A near-optimal query can be obtained from a bug report even if localization hints are removed. The provenance of the terms shows that, while humans might intuitively retain terms from the observed behavior (OB), expected behavior (EB), and steps to reproduce (S2R), the optimization performed by the genetic algorithm makes drastically different choices.

