RAISEDAL / RAISEReadingList

This repository contains a reading list of Software Engineering papers and articles!

Paper Review: An Empirical Study on Noisy Label Learning for Program Understanding #85

Open mehilshah opened 1 month ago

Publisher

ICSE

Link to The Paper

https://dl.acm.org/doi/abs/10.1145/3597503.3639217

Name of The Authors

Wang, Wenhan, Yanzhou Li, Anran Li, Jian Zhang, Wei Ma, and Yang Liu

Year of Publication

2024

Summary

This paper conducts a comprehensive empirical study on how noisy labels impact deep learning models for program understanding tasks, and evaluates the effectiveness of various noisy label learning (NLL) approaches in improving model robustness and detecting mislabeled samples.

The study covers three different program understanding tasks:

  1. Program classification (classifying programs into categories)
  2. Vulnerability detection (classifying code as vulnerable or not)
  3. Code summarization (generating a natural-language summary of code)

For the program classification task, the authors inject two types of synthetic label noise (random and flip) into a clean dataset and study the impact on model performance both with and without NLL approaches.
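The two synthetic noise types can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "random" noise replaces a label with a different class chosen uniformly at random, and that "flip" noise replaces a label with one fixed target class per source class. The function names, the `flip_map` parameter, and the `noise_rate` handling are hypothetical.

```python
import random


def inject_random_noise(labels, num_classes, noise_rate, seed=0):
    """Replace a fraction of labels with a different, uniformly chosen class.

    Assumption: this is one common definition of random label noise;
    the paper's exact procedure may differ.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_corrupt = int(round(noise_rate * len(noisy)))
    for i in rng.sample(range(len(noisy)), n_corrupt):
        # Candidate classes exclude the current label, so every
        # selected sample really ends up mislabeled.
        candidates = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(candidates)
    return noisy


def inject_flip_noise(labels, flip_map, noise_rate, seed=0):
    """Replace a fraction of labels with a fixed target class per source class.

    Assumption: flip_map (hypothetical) maps each class to the single
    class it is confused with, e.g. {0: 1, 1: 2, ...}.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_corrupt = int(round(noise_rate * len(noisy)))
    for i in rng.sample(range(len(noisy)), n_corrupt):
        noisy[i] = flip_map[noisy[i]]
    return noisy
```

Training the same model on the clean labels and on the output of each function, with and without an NLL approach, gives the kind of comparison the study describes.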

For vulnerability detection and code summarization, they evaluate NLL approaches on datasets that contain real-world label noise. The study covers both small neural networks trained from scratch and large pre-trained transformer models frequently used in software engineering.

Key Findings:

Contributions of The Paper

Comments

Very important for our work.