Mephisto405 / Learning-Loss-for-Active-Learning

Reproducing experimental results of LL4AL [Yoo et al. 2019 CVPR]

Why not re-instantiate the network model in the active learning cycles? I am wondering whether the way it is written makes the model aware of the test set in advance #5

Closed Jiawen-huang closed 4 years ago

Jiawen-huang commented 4 years ago

Dear programmer, why not re-instantiate the network model in the active learning cycles? A lot of the active learning code I have seen re-instantiates the network model in each active learning cycle. For example:

    # Active learning cycles
    for cycle in range(CYCLES):
        # 2. Model (target model + loss prediction module)
        resnet18    = resnet.ResNet18(num_classes=10).cuda()
        loss_module = lossnet.LossNet().cuda()
        models      = {'backbone': resnet18, 'module': loss_module}
        torch.backends.cudnn.benchmark = True

        # Criterion, optimizer and scheduler (re)initialization
        criterion      = nn.CrossEntropyLoss(reduction='none')  # cross-entropy loss
        optim_backbone = optim.SGD(models['backbone'].parameters(), lr=LR,
                                   momentum=MOMENTUM, weight_decay=WDECAY)
        optim_module   = optim.SGD(models['module'].parameters(), lr=LR, ...

In your code, you write it like this. I am wondering whether the way you wrote it makes the model aware of the test set in advance: https://github.com/Mephisto405/Learning-Loss-for-Active-Learning/blob/6c38f6fe8becf0739689ad0962c0379c0c9c2131/main.py#L226

Mephisto405 commented 4 years ago

Hello,

Thanks for your comment. However, I do not quite understand the point. Could you give me more details about the 're-instantiation' you mentioned? That is, what do you mean by 're-instantiate the network model'?

Thanks

lingjunz commented 4 years ago

hi, I have a similar question.

In the source code, I find that you instantiate the model structure only once. In each active learning cycle, the starting point for retraining is the model trained in the previous cycle.

Why not load the model inside the active learning cycle, i.e. re-instantiate the model before each cycle begins?

Thanks in advance!

        torch.backends.cudnn.benchmark = True
        # Model ===========> put these three lines into the active learning cycles
        resnet18    = resnet.ResNet18(num_classes=10).cuda()
        loss_module = lossnet.LossNet().cuda()
        models      = {'backbone': resnet18, 'module': loss_module}

        # Active learning cycles
        for cycle in range(CYCLES):
            # Loss, criterion and scheduler (re)initialization
            criterion      = nn.CrossEntropyLoss(reduction='none')
            optim_backbone = optim.SGD(models['backbone'].parameters(), lr=LR, 
                                    momentum=MOMENTUM, weight_decay=WDECAY)
            optim_module   = optim.SGD(models['module'].parameters(), lr=LR, 
                                    momentum=MOMENTUM, weight_decay=WDECAY)

Another question: When you compare LLAL with other methods, e.g. entropy, core-set, do you still use the model structure with the loss prediction module and train models with LossPredLoss?

Mephisto405 commented 4 years ago

I am wondering whether the way you wrote it makes the model aware of the test set in advance?

-> I think this is not the case. I don't feed the labels of the "unlabeled_set" into the model in advance. The model only sees the labels after it predicts the loss of each (sampled) unlabeled data point. In the "get_uncertainty" method, I also turn off training mode (model.eval()). -> Also, in a real-world scenario, feeding unlabeled data into a model (to predict loss) is feasible.
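
To make this concrete, below is a minimal sketch of such a loss-prediction-based get_uncertainty routine (not necessarily the exact code in this repository). It assumes the backbone returns (scores, features) and that the loss module maps those features to one scalar per sample; note that only the inputs of the unlabeled pool are read, the labels are never used.

    import torch

    def get_uncertainty(models, unlabeled_loader):
        # Evaluation mode: no dropout / batch-norm updates while scoring the pool.
        models['backbone'].eval()
        models['module'].eval()
        uncertainty = torch.tensor([]).cuda()

        with torch.no_grad():
            for inputs, _ in unlabeled_loader:   # labels are intentionally ignored
                inputs = inputs.cuda()
                _, features = models['backbone'](inputs)   # assumed: returns (scores, features)
                pred_loss = models['module'](features)      # assumed: shape (B, 1)
                uncertainty = torch.cat((uncertainty, pred_loss.view(-1)), 0)

        return uncertainty.cpu()  # higher predicted loss -> queried for labeling first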

Why not re-instantiate the network model? A lot of the active learning code I have seen re-instantiates the network model in each active learning cycle.

-> Hmm, in fact, I'm not so familiar with the active learning literature. So if you find that re-initialization is more effective in terms of test accuracy, I think that is great. I simply could not find anything in the paper saying that the model has to be re-initialized, so I do not re-initialize it. We don't need to discard the prior knowledge learned in the previous learning cycle, do we? -> I think there is no problem even if we don't re-initialize the model in each cycle. The problem setting is that 'we can gather an unlabeled dataset, but we cannot obtain its labels.' So feeding unlabeled data into the model does not cause a problem as long as we don't attach labels to it.
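
If you want to check empirically whether re-initialization helps, here is a minimal sketch of how the two strategies could be compared (the flag REINIT_PER_CYCLE is hypothetical and not part of this repository; the other names follow the snippets above):

    REINIT_PER_CYCLE = True  # hypothetical switch: cold start vs. warm start

    resnet18    = resnet.ResNet18(num_classes=10).cuda()
    loss_module = lossnet.LossNet().cuda()
    models      = {'backbone': resnet18, 'module': loss_module}

    for cycle in range(CYCLES):
        if REINIT_PER_CYCLE:
            # Cold start: discard the weights learned in previous cycles.
            models['backbone'] = resnet.ResNet18(num_classes=10).cuda()
            models['module']   = lossnet.LossNet().cuda()
        # Warm start (current behaviour): keep training the model from the previous cycle.
        # In either case the optimizers are re-created, so the LR schedule restarts at LR.
        criterion      = nn.CrossEntropyLoss(reduction='none')
        optim_backbone = optim.SGD(models['backbone'].parameters(), lr=LR,
                                   momentum=MOMENTUM, weight_decay=WDECAY)
        optim_module   = optim.SGD(models['module'].parameters(), lr=LR,
                                   momentum=MOMENTUM, weight_decay=WDECAY)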

Another question: When you compare LLAL

Sorry, I'm not the author of this paper, and I cannot answer this question. I did not compare the LL4AL method with entropy, core-set, etc. in this repository.

Thanks for all the comments; if you have any further comments or thoughts, they are always welcome. Please point it out if I'm wrong (I'm just a newbie master's student :)). I want to share ideas with many people.