ryl0427 / Code-for-OT-Filter

Code for OT-Filter: An Optimal Transport Filter for Learning with Noisy Labels

Hi, it seems that the function for generating asymmetric label noise in CIFAR-100 is not correct? #1

Open LanXiaoPang613 opened 1 year ago

LanXiaoPang613 commented 1 year ago

Hi, it seems that the function for generating asymmetric label noise in CIFAR-100 is not correct? For example, when a label exceeds 19, it cannot be found in the coarse_labels array. [screenshot] I think the correct generating function is as follows:

import numpy as np

def asymmetrical_cifar100(coarse_targets, label, noisy_rate):
    num_class = 100
    # Fine-label -> superclass map (index = fine label, value = superclass 0..19).
    # Not used below, because per-sample superclasses arrive via coarse_targets.
    coarse_labels = np.array([4, 1, 14, 8, 0, 6, 7, 7, 18, 3, 3, 14, 9, 18, 7, 11, 3, 9, 7, 11, 6, 11, 5, 10, 7, 6, 13, 15, 3, 15, 0, 11, 1, 10, 12, 14, 16, 9, 11, 5, 5, 19, 8, 8, 15, 13, 14, 17, 18, 10, 16, 4, 17, 4, 2, 0, 17, 4, 18, 17, 10, 3, 2, 12, 12, 16, 12, 1, 9, 19, 2, 10, 0, 1, 16, 12, 9, 13, 15, 13, 16, 19, 2, 4, 6, 19, 5, 5, 8, 19, 18, 1, 2, 15, 6, 0, 17, 8, 14, 13])

    targets = np.array(label)
    coarse_targets = np.asarray(coarse_targets)
    noisy_label = np.zeros(len(targets), dtype=int)

    # Start from "keep the clean label with probability 1 - noisy_rate".
    confusion_matrix_in = np.identity(num_class) * (1 - noisy_rate)

    idxes = np.random.permutation(len(targets))
    num_subclasses = num_class // 20  # 5 fine classes per superclass

    for i in range(20):
        # Fine classes in superclass i; each one flips to the next, cyclically.
        subclass_targets = np.unique(targets[coarse_targets == i])
        clean = subclass_targets
        noisy = np.concatenate([clean[1:], clean[:1]])
        for j in range(num_subclasses):
            confusion_matrix_in[clean[j], noisy[j]] = noisy_rate

    for t in range(len(idxes)):
        # Sample a (possibly flipped) label from the row of the clean label.
        current_label = targets[idxes[t]]
        conf_vec = confusion_matrix_in[current_label, :]
        label_sym = np.random.choice(np.arange(0, num_class), p=conf_vec.transpose())
        noisy_label[idxes[t]] = label_sym

    # Fraction of labels that were actually flipped (handy as a sanity check).
    real_noise_rate = np.mean(noisy_label != targets)
    return noisy_label

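
As a sanity check on the idea above, here is a minimal sketch that builds the same kind of transition matrix on a toy problem and verifies two properties: every row sums to 1, and label flips never leave the superclass. The helper name build_transition_matrix and the six-class toy_map are hypothetical and are not part of the repository; only the rotate-within-superclass rule is taken from the function above.

```python
import numpy as np

def build_transition_matrix(fine_to_coarse, noisy_rate):
    """Row-stochastic matrix: each fine class flips to the next fine class
    in its own superclass with probability noisy_rate."""
    fine_to_coarse = np.asarray(fine_to_coarse)
    num_class = len(fine_to_coarse)
    matrix = np.identity(num_class) * (1 - noisy_rate)
    for coarse in np.unique(fine_to_coarse):
        members = np.where(fine_to_coarse == coarse)[0]  # fine classes in this superclass
        flipped = np.roll(members, -1)                   # same as concatenate([m[1:], m[:1]])
        matrix[members, flipped] += noisy_rate
    return matrix

# Hypothetical toy mapping: 6 fine classes, 2 superclasses (not CIFAR-100).
toy_map = np.array([0, 1, 0, 1, 0, 1])
T = build_transition_matrix(toy_map, noisy_rate=0.4)

# Every row is a valid probability distribution.
rows_ok = np.allclose(T.sum(axis=1), 1.0)

# All probability mass stays inside the superclass of the clean label.
src, dst = np.nonzero(T)
within_superclass = bool(np.all(toy_map[src] == toy_map[dst]))
```

For CIFAR-100 the same check could be run with the 100-entry coarse_labels array in place of toy_map.
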
ryl0427 commented 1 year ago

Thank you for raising this issue. I'm sorry, I uploaded an incorrect version by mistake. The correct line may be: coarse_labels = coarse_labels[label[i]]
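
One possible reading of this fix, which the thread does not spell out in full, is that coarse_labels acts as a lookup table from fine label (0..99) to superclass (0..19), so a fine label above 19 is a valid index into it. The sketch below shows that lookup, vectorised over several labels at once. The variable sample_fine_labels is a hypothetical example input; the array itself is the one quoted earlier in this issue.

```python
import numpy as np

# Fine-label -> superclass table quoted in this issue (index = fine label).
coarse_labels = np.array([
    4, 1, 14, 8, 0, 6, 7, 7, 18, 3, 3, 14, 9, 18, 7, 11, 3, 9, 7, 11,
    6, 11, 5, 10, 7, 6, 13, 15, 3, 15, 0, 11, 1, 10, 12, 14, 16, 9, 11, 5,
    5, 19, 8, 8, 15, 13, 14, 17, 18, 10, 16, 4, 17, 4, 2, 0, 17, 4, 18, 17,
    10, 3, 2, 12, 12, 16, 12, 1, 9, 19, 2, 10, 0, 1, 16, 12, 9, 13, 15, 13,
    16, 19, 2, 4, 6, 19, 5, 5, 8, 19, 18, 1, 2, 15, 6, 0, 17, 8, 14, 13,
])

# Hypothetical fine labels, including values above 19.
sample_fine_labels = np.array([0, 19, 20, 99])

# Indexing by fine label yields the superclass of each sample.
coarse_targets = coarse_labels[sample_fine_labels]

# Each of the 20 superclasses should own exactly 5 fine classes.
classes_per_superclass = np.bincount(coarse_labels)
```

An array built this way over the whole training set could then be passed as the coarse_targets argument of the function proposed above.
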

LanXiaoPang613 commented 1 year ago

Thanks for your reply, but I have run into another problem: when I reproduce the 40% asymmetric label noise setting on CIFAR-10, the test accuracy is only about 90%, versus the 95.1% reported in the paper. My hyperparameter settings are shown below.
[two screenshots of the hyperparameter settings] Sorry to keep bothering you, and thank you! I look forward to your reply.

ryl0427 commented 1 year ago

You can try lambda_u = 0
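
The thread does not say what lambda_u controls. Assuming it is the weight on an unlabeled-data loss term, as in MixMatch-style semi-supervised objectives, setting it to 0 would simply drop that term from the total loss. The sketch below illustrates only that arithmetic; the function name total_loss and all loss values are hypothetical and are not taken from the repository.

```python
def total_loss(supervised_loss, unsupervised_loss, lambda_u):
    """Assumed objective: supervised term plus lambda_u times the unlabeled term."""
    return supervised_loss + lambda_u * unsupervised_loss

# Hypothetical loss values and weight, for illustration only.
with_term = total_loss(0.5, 0.25, lambda_u=25)
without_term = total_loss(0.5, 0.25, lambda_u=0)  # unlabeled term contributes nothing
```
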

LanXiaoPang613 commented 1 year ago

Thank you, I got it.