easezyc / deep-transfer-learning

A collection of implementations of deep domain adaptation algorithms
MIT License

Questions about RevGrad #21

Closed YangMengHsuan closed 3 years ago

YangMengHsuan commented 3 years ago

Hi @easezyc, thanks for your great implementations. While trying RevGrad in pytorch1.0, I ran into some questions. Could you help me?

  1. The original paper says the optimizer uses momentum=0.9. However, at line 62 the optimizer is re-created every iteration, which means the momentum is reset every time. https://github.com/easezyc/deep-transfer-learning/blob/cc97b7d248b7e7d9b187a3bae99eb560c458f89c/UDA/pytorch1.0/RevGrad/RevGrad.py#L62

  2. optimizer_critic.step() never seems to be called. https://github.com/easezyc/deep-transfer-learning/blob/cc97b7d248b7e7d9b187a3bae99eb560c458f89c/UDA/pytorch1.0/RevGrad/RevGrad.py#L63

  3. I tried to fix these issues, but I still cannot reproduce the reported results. My modifications are below.

    
    # defined optimizer_fea before the training loop
    optimizer_fea = torch.optim.SGD([
        {'params': model.sharedNet.parameters(), 'base_lr': lr/10},
        {'params': model.cls_fn.parameters(), 'base_lr': lr},
        {'params': model.domain_fn.parameters(), 'base_lr': lr}
        ], lr=lr, momentum=momentum, weight_decay=l2_decay)

    # . . .

    for i in range(1, iteration+1):
        # update learning rate during training
        for param_group in optimizer_fea.param_groups:
            param_group['lr'] = param_group['base_lr'] / math.pow((1 + 10 * (i - 1) / iteration), 0.75)

Did I miss anything? Thanks for your help!
easezyc commented 3 years ago
  1. This is a problem in my code. The optimizer should be defined before the training loop. I actually tried moving the optimizer to the right position, but I found the performance to be similar (a rough sketch of this fix, together with 2, is at the end of this comment).
  2. The pytorch1.0 RevGrad was written by an intern and has some problems. I recommend referring to DANN in https://github.com/jindongwang/transferlearning.
  3. Refer to 2.
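
A minimal sketch of how the two fixes could look together, reusing the names from the snippet above (model.sharedNet / cls_fn / domain_fn, lr, momentum, l2_decay, and iteration come from that snippet; cls_loss and domain_loss are placeholders). This is only an illustration under those assumptions, not the exact code in the repo:

    import torch

    # Build both optimizers once, before the training loop, so the SGD
    # momentum buffers are not reset at every iteration (question 1).
    optimizer_fea = torch.optim.SGD([
        {'params': model.sharedNet.parameters(), 'lr': lr / 10},
        {'params': model.cls_fn.parameters(), 'lr': lr},
        ], lr=lr, momentum=momentum, weight_decay=l2_decay)
    optimizer_critic = torch.optim.SGD(model.domain_fn.parameters(),
                                       lr=lr, momentum=momentum, weight_decay=l2_decay)

    for i in range(1, iteration + 1):
        # ... forward pass producing cls_loss (label loss) and domain_loss (critic loss)

        optimizer_fea.zero_grad()
        optimizer_critic.zero_grad()
        (cls_loss + domain_loss).backward()
        optimizer_fea.step()
        optimizer_critic.step()  # question 2: the critic optimizer must actually be stepped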
YangMengHsuan commented 3 years ago

Thanks for your reply and suggestions!

Currently I'm trying to reproduce the RevGrad results reported in this paper, but I have been struggling with it for a few days.

  1. Did you try on pytorch0.3?
  2. Thanks for the reference; I will look into it.
easezyc commented 3 years ago

I have tried to reproduce it, too. However, I failed to get the same results as the paper mentioned. I think you can ask the students of professor Mingsheng Long for the code (http://ise.thss.tsinghua.edu.cn/~mlong/).

YangMengHsuan commented 3 years ago

okay, thanks for your help! If I can reproduce the results, I will update here.

YangMengHsuan commented 3 years ago

Hi @easezyc

After I contacted the MADA author, he suggested that I follow this repo. Now I can reproduce the reported results!

The main differences are as follows (a rough sketch is below the list):

  1. the learning rate scheduler
  2. the lambda schedule
  3. a bottleneck after the average pooling layer
  4. concatenating the source and target inputs into a single forward pass
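
In case it helps others, here is a rough sketch of those four points. The lambda and learning-rate schedules are the ones from the original RevGrad/DANN paper; the 256-dim bottleneck size, the model(x, lambd) signature, and the hyper-parameter names are my own assumptions, not the exact code from that repo:

    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # 3. inside the model, a bottleneck after the ResNet average pooling layer, e.g.
    #    self.bottleneck = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(), nn.Dropout(0.5))

    def train_step(model, optimizer, source_x, source_y, target_x, i, iteration, lr0=0.01):
        p = float(i) / iteration  # training progress in [0, 1]

        # 1. learning rate schedule from the paper: lr_p = lr0 / (1 + 10 p)^0.75
        for param_group in optimizer.param_groups:
            param_group['lr'] = param_group.get('base_lr', lr0) / math.pow(1 + 10 * p, 0.75)

        # 2. lambda schedule for the gradient reversal layer: 2 / (1 + e^(-10 p)) - 1
        lambd = 2. / (1. + math.exp(-10 * p)) - 1.

        # 4. concatenate source and target inputs and run a single forward pass
        #    (assumes the model applies the gradient reversal layer with weight lambd)
        x = torch.cat([source_x, target_x], dim=0)
        class_out, domain_out = model(x, lambd)

        n_src = source_x.size(0)
        domain_labels = torch.cat([torch.zeros(n_src, dtype=torch.long),
                                   torch.ones(target_x.size(0), dtype=torch.long)])
        loss = F.cross_entropy(class_out[:n_src], source_y) \
             + F.cross_entropy(domain_out, domain_labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()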
easezyc commented 3 years ago

Thanks a lot.