RAISEDAL / RAISEReadingList

This repository contains a reading list of Software Engineering papers and articles!

Paper Review: Confident learning: Estimating uncertainty in dataset labels #83

Open mehilshah opened 1 month ago

mehilshah commented 1 month ago

Publisher

JAIR (Journal of Artificial Intelligence Research)

Link to The Paper

https://www.jair.org/index.php/jair/article/view/12125

Name of The Authors

Curtis Northcutt, Lu Jiang, and Isaac Chuang

Year of Publication

2021

Summary

The paper proposes confident learning (CL), a framework for estimating uncertainty in noisy labels and finding label errors in datasets. CL focuses on characterizing and cleaning label noise rather than modifying the model architecture. It rests on three principles: pruning noisy examples, counting examples with per-class probabilistic thresholds to estimate noise, and ranking examples to train with confidence. With these, CL directly estimates the joint distribution between noisy (observed) and true (latent) labels.
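
To make the counting principle concrete, here is a minimal sketch of how a "confident joint" could be tallied from out-of-sample predicted probabilities. It is an illustration under stated assumptions, not the authors' implementation: the function name `confident_joint`, the toy data, and the use of plain lists are all hypothetical; every class is assumed to have at least one labeled example; and the calibration step that turns the counts into an estimate of the joint distribution is omitted.

```python
def confident_joint(labels, pred_probs):
    """Count examples by (given label, confidently predicted label).

    labels: list of observed (possibly noisy) integer labels.
    pred_probs: list of per-class probability lists, one per example.
    Assumes every class appears at least once in `labels`.
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: the mean predicted probability of class j
    # over the examples whose given label is j.
    thresholds = []
    for j in range(n_classes):
        own = [p[j] for y, p in zip(labels, pred_probs) if y == j]
        thresholds.append(sum(own) / len(own))

    joint = [[0] * n_classes for _ in range(n_classes)]
    suspects = []  # indices whose confident label disagrees with the given one
    for idx, (given, probs) in enumerate(zip(labels, pred_probs)):
        # Classes whose probability clears that class's own threshold.
        confident = [j for j in range(n_classes) if probs[j] >= thresholds[j]]
        if not confident:
            continue  # not confidently any class: left out of the count
        likely = max(confident, key=lambda j: probs[j])
        joint[given][likely] += 1
        if likely != given:
            suspects.append(idx)
    return joint, thresholds, suspects


# Toy data (hypothetical): example 2 is labeled 0 but looks like class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.10, 0.90],
    [0.30, 0.70],
    [0.05, 0.95],
]
joint, thresholds, suspects = confident_joint(labels, pred_probs)
print(joint)     # off-diagonal entries count likely label errors
print(suspects)  # candidate examples to inspect or prune
```

On this toy data the off-diagonal cell for (given 0, confident 1) holds a single example, index 2, which is the candidate a pruning step would remove; example 4 clears no threshold and is not counted at all.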

Contributions of The Paper

Comments

It's an interesting technique, though it is already used for one of the baselines. It could be extended with architecture-specific, task-specific, or representation-specific refinements for fault localization.