Closed · sx-liu closed this 5 months ago
For `flip_label`: The function is overall OK. Please implement it under `dattri/datasets/utils.py`. Please make the specification of `label` clearer (e.g., what the shape of the tensor should be). Please change `label_range` to `label_space` and state the format of this parameter more clearly.
For `evaluate_auc`: Please rename it to `noise_detection_auc`. Please change `IF_scores` to `score`; there are many data attribution methods other than IF, and `score` should be the self-attribution score. Please revise the issue again and @ me for review before you implement anything.
@TheaperDeng Please review the updated issue. Thanks!
What exactly does `noise_detection_auc` return? I suggest it might be better to make it a tuple:
"""
Return (Tuple[float, Tuple[float, ...]]):
A tuple with 2 items. The first is the AUROC (or generally speaking, the AUC),
the second is a Tuple with `fpr, tpr, thresholds` just like
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html.
"""
Actually, when you implement this function, do refer to the source code of sklearn (we would rather not depend on sklearn): https://github.com/scikit-learn/scikit-learn/blob/f07e0138b/sklearn/metrics/_ranking.py#L1016-L1154
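For reference, a minimal sketch of such a sklearn-free computation. The helper names and the binary `ground_truth` convention are assumptions, and unlike sklearn this version does not collapse tied scores:

```python
from typing import Tuple

import torch


def _roc_curve(
    score: torch.Tensor,         # 1-D self-attribution scores
    ground_truth: torch.Tensor,  # 1-D binary indicator, 1 = mislabeled
) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    """Compute (fpr, tpr, thresholds) without depending on sklearn."""
    # Sort by score, descending: each prefix corresponds to classifying
    # everything above a threshold as "noisy".
    order = torch.argsort(score, descending=True)
    sorted_score = score[order]
    sorted_truth = ground_truth[order].float()

    # Cumulative true/false positive counts at each threshold.
    tps = torch.cumsum(sorted_truth, dim=0)
    fps = torch.arange(1, len(sorted_score) + 1) - tps

    tpr = tps / tps[-1]  # assumes at least one positive sample
    fpr = fps / fps[-1]  # assumes at least one negative sample
    return fpr, tpr, sorted_score


def _auc(fpr: torch.Tensor, tpr: torch.Tensor) -> float:
    # Trapezoidal rule over the ROC curve.
    return torch.trapz(tpr, fpr).item()
```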
Others LGTM
Minor suggestion: `noise_detection_auc` -> `mislabel_detection_auc`?
Background
The benchmark needs a way to evaluate the effectiveness of data attribution methods. One way to do so is to calculate the AUC (area under the ROC curve). The AUC measures the probability that the score of a randomly selected mislabeled sample is greater than that of a randomly selected clean sample. To estimate the AUC, we first introduce noise into the dataset manually by flipping labels.
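To make the probabilistic definition concrete, here is a toy pairwise estimate on synthetic scores (not part of the proposed API):

```python
import torch

# Toy check of the definition above: AUC is the probability that a
# randomly drawn mislabeled score exceeds a randomly drawn clean score.
torch.manual_seed(0)
clean = torch.randn(1000)       # scores of clean samples
noisy = torch.randn(500) + 1.0  # mislabeled samples score higher on average

# Compare every (noisy, clean) pair and take the fraction of "wins".
auc = (noisy.unsqueeze(1) > clean.unsqueeze(0)).float().mean().item()
print(f"pairwise AUC estimate: {auc:.3f}")  # around 0.76 for this setup
```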
API Design
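The signatures from the original issue are not preserved in this thread; the sketch below is a hypothetical reading consistent with the review comments above (the `p` parameter and the exact return convention are assumptions):

```python
from typing import Tuple

import torch


def flip_label(
    label: torch.Tensor,
    label_space: torch.Tensor,
    p: float = 0.1,
) -> Tuple[torch.Tensor, torch.Tensor]:
    """Randomly flip a fraction `p` of labels to a different value in `label_space`.

    Args:
        label: 1-D tensor of shape (num_samples,) with the ground-truth labels.
        label_space: 1-D tensor of all valid label values, e.g. torch.arange(10).
        p: fraction of samples whose label gets flipped.

    Returns:
        (flipped_label, noise_index): the noisy labels and the indices of
        the flipped samples.
    """
    num_samples = label.shape[0]
    noise_index = torch.randperm(num_samples)[: int(p * num_samples)]

    flipped_label = label.clone()
    for i in noise_index:
        # Draw uniformly among the labels different from the current one.
        candidates = label_space[label_space != label[i]]
        flipped_label[i] = candidates[torch.randint(len(candidates), (1,))]
    return flipped_label, noise_index
```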
Demonstration
With PyTorch datasets, one can randomly flip the labels like this:
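(The original snippet is not preserved; this is a sketch assuming the `flip_label` signature above and MNIST as the example dataset.)

```python
import torch
from torchvision import datasets

# `flip_label` lives under dattri/datasets/utils.py per the review above;
# the exact import path is an assumption.
from dattri.datasets.utils import flip_label

train_set = datasets.MNIST(root="data", train=True, download=True)
label = train_set.targets         # shape (60000,)
label_space = torch.arange(10)    # MNIST has 10 classes

# Flip 10% of labels; `noise_index` records which samples were corrupted.
flipped_label, noise_index = flip_label(label, label_space, p=0.1)
train_set.targets = flipped_label
```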
With the calculated attribution scores and the noise index, one can evaluate the AUC like this:
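(Again a sketch, continuing the snippet above: the `dattri.metrics` import path is hypothetical, and random scores stand in for a real attribution method.)

```python
import torch

from dattri.metrics import noise_detection_auc  # hypothetical import path

# `score`: per-sample self-attribution score from any attribution method
# (IF, TracIn, ...); random numbers stand in for a real method here.
score = torch.randn(len(flipped_label))

ground_truth = torch.zeros(len(flipped_label), dtype=torch.bool)
ground_truth[noise_index] = True

auc, (fpr, tpr, thresholds) = noise_detection_auc(score, ground_truth)
print(f"mislabel-detection AUC: {auc:.3f}")
```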
TODO