xiangjjj / implicit_alignment

Code for ICML2020 "Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation"

Excuse me, I'm very interested in your article and I have some problems about your code. #3

Closed: big-chuan-bro closed this issue 3 years ago

big-chuan-bro commented 3 years ago
  1. About the OfficeHome dataset: it has 65 classes, and the method sample_N_classes_as_a_task in the class TaskSampler takes a parameter n_way, i.e., it selects n different classes to form a task. In your setting, n_way is 50. I want to know how you guarantee that the 50 classes selected for the source domain and the target domain are the same, since from the code, two TaskSamplers are created for the two different kinds of batch sampler.
  2. As I understand it, the source domain uses N_Way_K_Shot_BatchSampler to create its dataloader, and the target domain uses SelfTrainingBaseSampler (SelfTrainingVannilaSampler in your setting) to create its dataloader. Each iteration creates two kinds of dataloader. Is there only one batch per dataloader? I'm not sure whether my understanding is correct.
  3. I cannot run your code successfully. Please tell me more details about how to run the code. I am looking forward to your reply. Thank you very much.
xiangjjj commented 3 years ago

Thank you for your interest!

1.1 Why do we sample 50 from the 65 classes?

The number of classes we can sample is constrained by the batch size, which is ultimately constrained by the GPU memory. You could sample 65 classes with more GPU memory.

1.2 How to ensure the 50 sampled classes are the same between source and target?

We use a TaskSampler that samples a task (i.e., the 50 classes) each time, then samples examples from each of these classes for the source and the target domain. We use a singleton to ensure the task sampler is shared by the two dataloaders (source and target), and we use the sampling frequency to control when to sample a new task.
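Schematically, the shared sampler works like this (a minimal sketch, not the repo's exact code: `sample_N_classes_as_a_task` is the repo's method name, while `resample_freq` and the other details are assumptions for illustration):

```python
import random

class TaskSampler:
    """Singleton: the source and target batch samplers share one instance,
    so both always draw examples from the same set of sampled classes."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.current_task = None
            cls._instance.num_calls = 0
        return cls._instance

    def sample_N_classes_as_a_task(self, num_classes=65, n_way=50, resample_freq=2):
        # Draw a new task only every `resample_freq` calls; with freq=2, the
        # source sampler's call draws a new task and the target's call reuses it.
        if self.num_calls % resample_freq == 0:
            self.current_task = random.sample(range(num_classes), n_way)
        self.num_calls += 1
        return self.current_task

# The singleton guarantee in one line: every "new" sampler is the same object.
assert TaskSampler() is TaskSampler()
```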

2.1 "for source domain use N_Way_K_Shot_BatchSampler to create dataloader and for target domain use SelfTrainingBaseSampler to create dataloader in your setting use SelfTrainingVannilaSampler."

Yes. Note that the two dataloaders share the same task sampler due to the singleton. See here and here.

2.2 "Each iteration creates two kinds of dataloader."

No, the dataloader is created only once for the whole training process, instead of at each iteration.

  1. "I cannot run your code successfully. Please tell me more details about how to run the code."

Did you follow the instructions to install the code first? This repo is a package and it needs to be installed before running. Apart from that, could you share the error message so that I could figure out why it does not run in your setup?

Thanks!

big-chuan-bro commented 3 years ago

Thank you very much for your reply; it is encouraging to me. When I ran your code, I first ran ./install.sh and successfully installed ai.domain-adaptation. Then I used make to run make train-OfficeHome-R2P-implicit and hit this error:

RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

I searched the Internet for ways to address the error. One suggestion is that since torch 1.3 the forward() method must be static, so one could fall back to an earlier version such as torch 1.2, but your instructions recommend torch >= 1.4. I want to know how to address this error.

Second, I still have some questions about the code. First, in N_Way_K_Shot_BatchSampler and SelfTrainingBaseSampler, len(self) is max_iter in both cases, i.e., the maximum iteration number. Does that mean the number of batches equals max_iter, or that the length of the dataloader equals max_iter? Second, does explicit alignment mean using kNN or a prototype network to obtain the pseudo-labels, whereas your model is first pre-trained on the source data and the updated model is then used to predict the pseudo-labels, with no prototype network or kNN?

I want to try DANN + implicit_alignment. If the code for DANN + implicit_alignment has been saved somewhere, could you tell me how to find it?

Thank you very much for your patience and I am looking forward to your reply. Best wishes!

xiangjjj commented 3 years ago

Good catch. The non-static forward approach is deprecated and I will update the code shortly. Will reply in more detail later. Thanks!

xiangjjj commented 3 years ago

One suggestion is that since torch 1.3 the forward() method must be static, so one could fall back to an earlier version such as torch 1.2, but your instructions recommend torch >= 1.4. I want to know how to address this error.

I just fixed it in my latest commit.
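For reference, since PyTorch 1.3 a custom `torch.autograd.Function` must define static `forward`/`backward` methods and be invoked via `.apply`. Using the gradient reversal layer as the example, the new-style pattern looks roughly like this (a minimal sketch, not the exact code in the commit):

```python
import torch

class GradientReversal(torch.autograd.Function):
    """New-style autograd function: forward/backward are static methods
    that communicate through a context object instead of instance state."""

    @staticmethod
    def forward(ctx, x, coeff=1.0):
        ctx.coeff = coeff    # stash the coefficient for the backward pass
        return x.view_as(x)  # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient; None matches the coeff argument.
        return -ctx.coeff * grad_output, None

# Invoke via .apply, not by instantiating the class as in the legacy API.
features = torch.randn(8, 16, requires_grad=True)
reversed_features = GradientReversal.apply(features, 0.5)
```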

Does that mean the number of batches equals max_iter, or that the length of the dataloader equals max_iter?

Yes, the number of batches for the training data equals max_iter. Apart from that, I also have a onepass_loader for evaluation, where each example is loaded exactly once.
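In other words, the sampler's `__len__` defines the epoch length, so iterating the training dataloader yields exactly `max_iter` batches. Roughly (a sketch with hypothetical names, reusing the shared task sampler sketched above):

```python
import random
from torch.utils.data import Sampler

class FixedLengthBatchSampler(Sampler):
    """A batch sampler of length max_iter: the dataloader built on it
    yields exactly max_iter batches per pass."""

    def __init__(self, labels, task_sampler, max_iter, k_shot):
        self.labels = labels              # class label of each dataset index
        self.task_sampler = task_sampler  # the shared TaskSampler singleton
        self.max_iter = max_iter
        self.k_shot = k_shot

    def __iter__(self):
        for _ in range(self.max_iter):
            batch = []
            for cls in self.task_sampler.sample_N_classes_as_a_task():
                pool = [i for i, y in enumerate(self.labels) if y == cls]
                batch += random.sample(pool, min(self.k_shot, len(pool)))
            yield batch

    def __len__(self):
        # len(DataLoader(dataset, batch_sampler=self)) == max_iter
        return self.max_iter
```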

"does explicit alignment mean using kNN or a prototype network to obtain the pseudo-labels, whereas your model is first pre-trained on the source data and the updated model is then used to predict the pseudo-labels, with no prototype network or kNN?"

In my approach, the prototype network is not used. Note, however, that for my explicit approach I didn't pre-train the model on the source data; it is trained together with the whole model.

"If the code for DANN + implicit_alignment has been saved somewhere, could you tell me how to find it?"

The code for DANN + implicit_alignment is in a private repo, not here. MDD works much better than DANN. To implement DANN, you can reuse my gradient reversal layer and change the adversarial loss function.
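For what it's worth, a DANN-style adversarial loss on top of a gradient reversal layer looks roughly like the sketch below. It reuses the `GradientReversal` function sketched earlier in this thread; the discriminator architecture and the feature dimension are assumptions, not values from either repo.

```python
import torch
import torch.nn as nn

# A small domain discriminator mapping features to a source-vs-target logit.
discriminator = nn.Sequential(
    nn.Linear(256, 1024), nn.ReLU(),
    nn.Linear(1024, 1),
)
bce = nn.BCEWithLogitsLoss()

def dann_adversarial_loss(source_features, target_features, coeff=1.0):
    # Gradients flowing back into the feature extractor are reversed, so
    # minimizing this loss trains features that *confuse* the discriminator.
    feats = torch.cat([source_features, target_features], dim=0)
    feats = GradientReversal.apply(feats, coeff)
    logits = discriminator(feats).squeeze(1)
    domain_labels = torch.cat([
        torch.ones(source_features.size(0)),   # source domain = 1
        torch.zeros(target_features.size(0)),  # target domain = 0
    ])
    return bce(logits, domain_labels)
```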

big-chuan-bro commented 3 years ago

I am very happy to receive your reply again; your answers helped me a lot, and I have successfully run your code. I still have some questions about it, and I hope you can guide me again. Thanks a lot.

From the train() method in training.py:

```python
for i_th_iter in range(args.train_steps):

    if args.self_train is True and i_th_iter % args.yhat_update_freq == 0:
        data_loader_manager.update_self_training_labels(model_instance)

    source_loader, target_loader = data_loader_manager.get_train_source_target_loader()
    _, inputs_source, labels_source = next(iter(source_loader))
    _, inputs_target, labels_target = next(iter(target_loader))
```

The code above means the pseudo-labels are updated every yhat_update_freq iterations, but I want to know: will the dataloaders for the source and target be created again? And is the length of both the source and target dataloaders equal to args.train_steps? Then, since the source and target dataloaders are never used up, will they be re-created because the pseudo-labels have been updated? I don't know whether my understanding is right or wrong.

Thank you very much and I am looking forward to your reply. Sorry to bother you again. Best wishes!

xiangjjj commented 3 years ago

"will the dataloaders for the source and target be created again?"

No, the source_loader and target_loader are created once and only once.

"And is the length of both the source and target dataloaders equal to args.train_steps?"

Yes, the lengths of the source and target dataloaders both equal args.train_steps.

"Then, since the source and target dataloaders are never used up, will they be re-created because the pseudo-labels have been updated?"

No. The dataloader and the sampler are largely independent of each other. When a dataloader is created, the sampler is passed in as a parameter (see here and here). This means there is no need to re-create the dataloader: when the pseudo-labels are updated, the sampler is updated accordingly, and the sampler controls the behaviour of the dataloader. In other words, although the dataloader is created only once, the batches it produces are decided on the fly based on the pseudo-labels stored in the sampler. We only need to inform the sampler about changes in the pseudo-labels for the dataloader to adapt to them.
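A minimal illustration of this create-once, update-in-place pattern (hypothetical names; the actual samplers in the repo are more involved):

```python
import random
from torch.utils.data import Sampler

class SelfTrainingSampler(Sampler):
    """A batch sampler holding mutable pseudo-labels: each batch is drawn
    using whatever labels are stored at the moment it is requested."""

    def __init__(self, num_examples, max_iter, batch_size):
        self.pseudo_labels = [None] * num_examples  # None = no label yet
        self.max_iter = max_iter
        self.batch_size = batch_size

    def update_pseudo_labels(self, new_labels):
        # Mutate in place; the dataloader built on this sampler is untouched.
        self.pseudo_labels = new_labels

    def __iter__(self):
        for _ in range(self.max_iter):
            # For example, prefer examples that currently have a pseudo-label.
            pool = [i for i, y in enumerate(self.pseudo_labels) if y is not None]
            pool = pool or list(range(len(self.pseudo_labels)))
            yield random.sample(pool, min(self.batch_size, len(pool)))

    def __len__(self):
        return self.max_iter

# Created once:
#   loader = DataLoader(dataset, batch_sampler=SelfTrainingSampler(...))
# Later, sampler.update_pseudo_labels(new_labels) changes which batches the
# *same* loader yields from then on; no new DataLoader is ever created.
```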

I recommend you go through these tutorials: https://pytorch.org/tutorials/beginner/data_loading_tutorial.html, https://pytorch.org/docs/stable/data.html, https://github.com/pytorch/tutorials/issues/78, https://discuss.pytorch.org/t/how-to-use-my-own-sampler-when-i-already-use-distributedsampler/62143/8.

big-chuan-bro commented 3 years ago

Thank you very much for your previous guidance and for your patience. Last week I used your sampler and implemented DANN + Implicit. In Section 4.4.4 of your article, you run experiments on SVHN -> MNIST and MNIST -> SVHN.

From digits.py I see that the MNIST and SVHN datasets both have two states: mild_unbalance and extreme_unbalance.

1) The source domain is balanced while the target domain is imbalanced. I want to know the setting: does it mean the source is in mild_unbalance and the target in extreme_unbalance? Are the source and target in the RS-UT condition? And does the column under "Source" give the per-class accuracy on the source dataset itself?

2) The source domain is imbalanced while the target domain is balanced. Does the second situation mean the source is in extreme_unbalance and the target in mild_unbalance? Are they in the RS-UT condition?

3) Mismatched prior, where both domains are imbalanced. Does the third situation mean the source and target are both in extreme_unbalance? Are they in the RS-UT condition?

Thank you so much and I am looking forward to your reply.

xiangjjj commented 3 years ago

Sorry for the confusion; the code was missing the balanced part. I had commented out the slicing section for the balanced case and hadn't defined it in the code.

From digits.py I see that the MNIST and SVHN datasets both have two states: mild_unbalance and extreme_unbalance.

I just updated the code with the new balanced option: it has three states now.

Are the source and target in the RS-UT condition?

RS-UT is implemented by reversed(), note the difference between this and the reversed one.
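To make that concrete: under RS-UT, the source and target share one long-tailed class-frequency profile, with one side reversed, so the most frequent classes of one domain are the rarest of the other. A sketch of subsampling digits data by such profiles (the counts below are illustrative, not the repo's actual numbers):

```python
import numpy as np

def subsample_by_profile(labels, per_class_counts):
    """Keep at most per_class_counts[c] examples of each class c."""
    labels = np.asarray(labels)
    keep = []
    for c, n in enumerate(per_class_counts):
        idx = np.flatnonzero(labels == c)
        keep.extend(idx[:n])
    return np.array(keep)

# An illustrative long-tailed profile over the 10 digit classes.
unbalanced = [1000, 800, 640, 512, 410, 328, 262, 210, 168, 134]

source_profile = unbalanced                  # e.g., mild or extreme unbalance
target_profile = list(reversed(unbalanced))  # RS-UT: reversed() on one side
# The balanced state would simply keep the same count for every class.
```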

I hope this makes more sense now.

big-chuan-bro commented 3 years ago

Thank you very much for your reply again. I am so grateful for your patience!

From your reply yesterday, I understand that the experimental setting in Section 4.4.4 involves two domains (source and target), and each domain has three states (balanced, mild_unbalance, and extreme_unbalance). I would like to know more details about the settings in 4.4.4.

For the first setting, where the source is balanced: does that mean the source uses all the digits data? For example, for SVHN -> MNIST in the first two columns of Table 6, does the source in the 'balanced' state use all the data, while the target in 'mild_unbalance' corresponds to the first column and the target in 'extreme_unbalance' corresponds to the second column? And are the source and target in the 'RS->UT' condition for the first two columns of Table 6?

I don't know whether my understanding is right or wrong; I hope you can explain the setting for the first two columns of Table 6, and then I can understand the other settings better.

For the third setting, in Table 8, I want to ask about the first two columns. My understanding is that the first column of Table 8 means the source is in 'mild_unbalance' and the target in 'extreme_unbalance', while the second column means the source is in 'extreme_unbalance' and the target in 'mild_unbalance'. In these two situations, are the source and target both in the 'RS->UT' condition?

I am looking forward to your reply. Thank you so much! Best wishes!

xiangjjj commented 3 years ago

For the first setting, where the source is balanced: does that mean the source uses all the digits data?

Yes.

For example, for SVHN -> MNIST in the first two columns of Table 6, does the source in the 'balanced' state use all the data, while the target in 'mild_unbalance' corresponds to the first column and the target in 'extreme_unbalance' corresponds to the second column?

Yes.

And are the source and target in the 'RS->UT' condition for the first two columns of Table 6?

Here the source is not RS but balanced, while the target is UT.

My understanding is that the first column of Table 8 means the source is in 'mild_unbalance' and the target in 'extreme_unbalance', while the second column means the source is in 'extreme_unbalance' and the target in 'mild_unbalance'.

Note that mild and extreme do not correspond to the domain names above them (SVHN and MNIST). The first column means both domains are mild-unbalanced, while the second column means both domains are extreme-unbalanced. But you could try mild -> extreme if you want to.

In these two situations, are the source and target both in the 'RS->UT' condition?

Yes. ‘RS->UT’ is implemented by reversed() as I mentioned before.

Please feel free to let me know if you have any further questions.

big-chuan-bro commented 3 years ago

Happy New Year! Thanks again for your guidance and help!

Now I have implemented DANN + Implicit following your code, but I cannot achieve the reported result in the SVHN -> MNIST setting. When I use the model trained only on the source data, the result is not as good as the result in the article. It may be that my model structure and hyper-parameters are not set well.

I want to know whether I can get the code for your DANN and DANN + implicit. Please give me some guidance again, or tell me how to find the private repo or how to contact its author.

I am looking forward to your reply. Thank you so much! Happy New Year! Best wishes!

xiangjjj commented 3 years ago

Hi, I will try to upload the digits code into a separate repo in the next few days. Happy new year, too!

xiangjjj commented 3 years ago

Hi, I have uploaded the digits code with DANN+I.A. into this repo: https://github.com/xiangdal/implicit_alignment_digits. Running that repo with the makefile should yield the reported average accuracy per class, i.e., balanced_accuracy (SVHN->MNIST, RS-UT, mild_unbalance_source, mild_unbalance_target).

Note, DANN is implemented based on modifications of the MDD code (with some comments). Please let me know if that is what you were looking for.

(BTW, I tried to integrate digits&DANN into this repo, but too many changes were made and it is very hard to integrate without introducing potential bugs.)

big-chuan-bro commented 3 years ago

Thank you again for your kindness! Your help is very encouraging to me! I have successfully run your new digits code! I will study your code carefully and try some experiments under different degrees of imbalance using the three digits datasets SVHN, MNIST, and USPS!

Thanks again for your efforts and help!

Best wishes to you! Thank you!