Implements six existing IR-based bug localization (IRBL) techniques on industrial project data (Huawei projects: 161,967 source code files and 24,437 bug reports).
Contributions of The Paper
Key findings:
Datasets from industrial projects have three characteristics that are rare in open-source projects: Software Product Lines (SPL), mixtures of multiple natural languages, and poor-quality bug reports.
IRBL works better on small-scale projects than on large-scale ones. BugLocator, BLUiR, and AmalGam all rely on lexical similarity between bug reports and source code, which is the major weakness given the multiple-perspective and multilingualism problems in industrial projects. One mitigation is collaborative filtering, which scores a source code file by its relevance to historical bug reports that fixed it (the similar-bug-fix information used in BugLocator).
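The collaborative filtering idea above (BugLocator's similar-bug-fix score) can be sketched as follows; the whitespace tokenization, the cosine measure, and the sample `history` data are illustrative assumptions, not the paper's or BugLocator's exact implementation:

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def simi_score(new_report: str, history: list[tuple[str, list[str]]]) -> dict[str, float]:
    """Score source files via past bug reports that fixed them.

    `history` is a list of (past_report_text, fixed_files) pairs.
    Each file fixed by a past report similar to the new report gets
    that similarity, divided by the number of files the report touched.
    """
    q = Counter(new_report.lower().split())
    scores: dict[str, float] = {}
    for text, files in history:
        sim = cosine(q, Counter(text.lower().split()))
        if sim == 0.0 or not files:
            continue
        for f in files:
            scores[f] = scores.get(f, 0.0) + sim / len(files)
    return scores

# Toy fix history (illustrative file names).
history = [
    ("crash when saving large file", ["src/io/save.c"]),
    ("ui freezes on save dialog", ["src/ui/dialog.c", "src/io/save.c"]),
]
ranked = sorted(simi_score("app crash saving file", history).items(),
                key=lambda kv: -kv[1])
```

Here the lexically similar past report promotes `src/io/save.c` even though the new report never names a file, which is exactly why this feature helps with poorly written industrial reports.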
Software Product Line (SPL) - a set of products sharing a common, managed set of features and developed within a common project. An SPL project usually contains many similar software features, which interferes with the lexical similarity used by IRBL. Similar bug reports may also refer to different features, which likewise reduces the effectiveness of collaborative filtering. In the experimental results, the performance difference between SPL and non-SPL projects is visible but slight.
Mixtures of multiple natural languages - lexical similarity does not cope well with this issue, and translation does not help when both bug reports and source code contain non-English words. Query reformulation (e.g., Blizzard) might be a solution, but those experiments were conducted only in monolingual settings.
Quality of bug reports - two factors here. The first is multiple perspectives: bug reports are issued by different types of submitters (end users, beta testers, etc.); such submitters may lack knowledge of the source code and describe the bug only in terms of its defective behavior, without providing any source-code-related information. The second is noise in bug reports: reports that have little or no textual connection to their corresponding modified source files. In the experimental results, performance with keyword-based heuristic noise removal versus unfiltered noisy bug reports is mixed; the hard part is removing noise from bug reports reliably.
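A keyword-based noise heuristic of the kind mentioned above might look like this minimal sketch; the `CODE_HINTS` pattern and the choice of "code-like tokens" are assumptions for illustration, not the heuristic actually used in the paper:

```python
import re

# Hypothetical heuristic: a report is treated as "clean" if it mentions
# code-like artifacts (exception names, file names, function calls, scope
# operators); everything else is flagged as potentially noisy.
CODE_HINTS = re.compile(
    r"(Exception|Traceback|NullPointer|\w+\.(java|c|cpp|py)\b|\w+\(\)|::)"
)

def looks_noisy(report: str) -> bool:
    """Flag reports with no source-code-related tokens as noisy."""
    return CODE_HINTS.search(report) is None

assert looks_noisy("The screen looks slightly off on my phone")
assert not looks_noisy("NullPointerException in Parser.java at parse()")
```

The mixed experimental results make sense with a rule like this: a keyword list catches only surface patterns, which motivates the learned noise detectors proposed later in the paper.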
Potential improvements to IRBL -
Leveraging program analysis techniques (as BLUiR does) but using richer program information, such as control-flow graphs and data-flow information
Utilizing cross-language IR-based techniques
Improving the effectiveness of the collaborative filtering feature (similar bug fix information) - applying a threshold on similarity or on the number of bug reports, and computing similarity between bug reports with word embeddings (e.g., BERT) to capture semantic relationships
Detecting noise intelligently - using text classification models (supervised) and clustering models (unsupervised) to distinguish clean bug reports from noisy ones
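The thresholded, embedding-based report similarity suggested in the improvements above can be sketched as follows, assuming report embeddings (e.g., from BERT) are computed elsewhere and passed in as plain vectors; the threshold and top-k values are illustrative:

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similar_reports(query_vec: list[float],
                    history_vecs: list[list[float]],
                    threshold: float = 0.7,
                    top_k: int = 5) -> list[tuple[int, float]]:
    """Keep only past reports whose embedding similarity to the query
    clears the threshold, then return the top_k most similar as
    (history_index, similarity) pairs."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(history_vecs)]
    kept = [(i, s) for i, s in scored if s >= threshold]
    kept.sort(key=lambda t: -t[1])
    return kept[:top_k]

# Toy 2-d "embeddings": the first two past reports are semantically close
# to the query; the third is unrelated and filtered out by the threshold.
kept = similar_reports([1.0, 0.0], [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
```

Only the reports that survive this filter would then feed the collaborative filtering score, which is the thresholding idea described above.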
Publisher
ESE
Link to The Paper
https://link.springer.com/article/10.1007/s10664-021-10082-6
Name of The Authors
Wei Li, Qingan Li, Yunlong Ming, Weijiao Dai, Shi Ying & Mengting Yuan
Year of Publication
2022
Comments
No response