Could you share the Code on mini-WebVision dataset?

ShikunLi commented 3 years ago

Hi, thanks for your interesting work! As reported in the paper, MOIT performs much better than other SOTA methods on the mini-WebVision dataset. Could you share the code for the mini-WebVision dataset? That would help me a lot!

DiegoOrtego commented 3 years ago

Hi @ShikunLi,

Many thanks for your interest in our work. We don't have a cleaned-up version of the mini-WebVision code. However, it should be straightforward to reproduce our results by adapting the dataset loader from (for example) the DivideMix repo, following the hyperparameters specified in the paper for this dataset (Table 1), and using the following data augmentation:

transform_train = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomResizedCrop(224, scale=(0.2, 1)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean, std),
])

We'll try to get some time to clean the code, but I'm not sure if it will be soon.

ShikunLi commented 3 years ago

Thanks for your reply. I will try to achieve the results by following your suggestions.

ShikunLi commented 3 years ago

Hi @DiegoOrtego, I have another question, about the ELR results in your paper. I equipped ELR with mixup as in ELR+ (see the code below), but it only achieves 65.60% test accuracy on CIFAR-100 with 40% asymmetric label noise (71.25% is reported in the paper). Could you share the hyperparameters you used for ELR in your experiments?

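import torch
import torch.nn as nn
import torch.nn.functional as F


class elr_mixup_loss(nn.Module):
    def __init__(self, num_examp, num_classes=100, elr_lambda=7, beta=0.9):
        super(elr_mixup_loss, self).__init__()
        # Running average of softmax predictions, one row per training example.
        self.pred_hist = torch.zeros(num_examp, num_classes).cuda()
        # Mixed target for the current batch; set by update_hist().
        self.q = 0
        self.beta = beta
        self.num_classes = num_classes
        self.elr_lambda = elr_lambda

    def forward(self, output, y_labeled):
        # Clamp probabilities so the log in the regularizer stays finite.
        y_pred = F.softmax(output, dim=1)
        y_pred = torch.clamp(y_pred, 1e-4, 1.0 - 1e-4)

        # Cross-entropy against the (soft, mixed) labels.
        ce_loss = torch.mean(-torch.sum(y_labeled * F.log_softmax(output, dim=1), dim=-1))
        # ELR regularizer: log(1 - <q, p>), averaged over the batch.
        reg = (1 - (self.q * y_pred).sum(dim=1)).log().mean()
        final_loss = ce_loss + self.elr_lambda * reg

        return final_loss

    def update_hist(self, epoch, out, index=None, mix_index=..., mixup_l=1):
        y_pred_ = F.softmax(out, dim=1)
        # Exponential moving average of the predictions for the examples in this batch.
        self.pred_hist[index] = self.beta * self.pred_hist[index] + (1 - self.beta) * y_pred_ / y_pred_.sum(dim=1, keepdim=True)
        # Mix the stored targets with the same coefficient and permutation used by mixup.
        self.q = mixup_l * self.pred_hist[index] + (1 - mixup_l) * self.pred_hist[index][mix_index]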
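
Read as equations, the class above appears to compute the following. This is a summary derived only from the posted code, not from the paper; the symbols $z_i$ (logits), $y_i$ (mixed label), $t_i$ (row of `pred_hist`), $\pi$ (the `mix_index` permutation within the batch), $\ell$ (`mixup_l`), and $B$ (batch size) are notation introduced here for illustration.

```latex
\begin{align}
  p_i &= \operatorname{clamp}\bigl(\operatorname{softmax}(z_i),\ 10^{-4},\ 1 - 10^{-4}\bigr) \\
  t_i &\leftarrow \beta\, t_i + (1 - \beta)\, \operatorname{softmax}(z_i) \\
  q_i &= \ell\, t_i + (1 - \ell)\, t_{\pi(i)} \\
  \mathcal{L} &= \frac{1}{B} \sum_{i=1}^{B} \Bigl( -\sum_{c} y_{ic} \log \operatorname{softmax}(z_i)_c \Bigr)
      + \lambda\, \frac{1}{B} \sum_{i=1}^{B} \log\bigl(1 - \langle q_i, p_i \rangle\bigr)
\end{align}
```

Here $\lambda$ is `elr_lambda` and $\beta$ is `beta`; the extra normalization of the softmax output in `update_hist` is omitted because a softmax row already sums to one.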

DiegoOrtego commented 3 years ago

Hi @ShikunLi,

Many thanks for your interest in our work! Much appreciated! Apologies for the delay in getting back to you; I've been quite busy over the last few weeks. Checking the CIFAR-100 scripts for ELR, this is what I see:

Epochs: 250
LR: 0.02, decreased at epoch 200 by dividing by 10
Batch size: 128
Weight decay: 5e-4 (SGD with momentum 0.9)
coef_step: 40000
elr_lambda: 7 (3 for CIFAR-10)
elr_beta: 0.9 (0.7 for CIFAR-10)

Best, Diego.
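
The CIFAR-100 settings listed above can be collected into a small configuration object, which makes it easier to compare them against one's own run. The sketch below is only an illustration: the names `ElrCifar100Config` and `lr_at_epoch` are hypothetical and not taken from the authors' scripts, and it assumes epochs are counted from 0 and that the decay applies from epoch 200 onward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElrCifar100Config:
    """Hypothetical container for the ELR CIFAR-100 settings quoted in the thread."""
    epochs: int = 250
    base_lr: float = 0.02
    lr_decay_epoch: int = 200      # assumption: 0-based epoch index
    lr_decay_factor: float = 0.1   # "dividing by 10"
    batch_size: int = 128
    weight_decay: float = 5e-4
    momentum: float = 0.9          # SGD momentum
    coef_step: int = 40000
    elr_lambda: float = 7.0        # 3 for CIFAR-10
    elr_beta: float = 0.9          # 0.7 for CIFAR-10


def lr_at_epoch(cfg: ElrCifar100Config, epoch: int) -> float:
    """Single-step decay: base LR before the decay epoch, scaled LR from it onward."""
    if epoch >= cfg.lr_decay_epoch:
        return cfg.base_lr * cfg.lr_decay_factor
    return cfg.base_lr


cfg = ElrCifar100Config()
for epoch in (0, 199, 200, 249):
    print(epoch, lr_at_epoch(cfg, epoch))
```

In a PyTorch training loop, the same schedule would typically be expressed with `torch.optim.SGD` and `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[200], gamma=0.1)`.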

DiegoOrtego / LabelNoiseMOIT
