PyTorch demo code for the papers Multiple Source Domain Adaptation with Adversarial Learning and Adversarial Multiple Source Domain Adaptation by Han Zhao, Shanghang Zhang, Guanhang Wu, João Costeira, José Moura and Geoff Gordon.
MDAN is a method for domain adaptation with multiple sources. Specifically, during training, a set of $k$ domains, represented by $k$ labeled source datasets, together with one unlabeled target dataset, are used to train the model jointly. A schematic representation of the overall model during the training phase is shown in the following figure:
Essentially, MDAN contains three components: a feature extractor shared across all domains, a hypothesis (task classifier) that predicts labels from the extracted features, and a set of $k$ domain classifiers, one for each source domain.
At a high level, in each iteration the feature extractor maps source and target examples into a shared representation; the hypothesis is trained to minimize the classification loss on the labeled source data; and each of the $k$ domain classifiers is trained to distinguish its source domain from the target, while the feature extractor is simultaneously updated (through gradient reversal) to confuse the domain classifiers.
Since we have $k$ domain classifiers, we need to define an overall reward for the whole set. To this end we develop two variants of MDAN: the Hard-Max variant, which optimizes against the worst (maximum) of the $k$ domain classification losses, and the Soft-Max variant, which replaces the maximum with a smooth log-sum-exp approximation.
A more detailed description of these two variants can be found in Section 4 of the paper Adversarial Multiple Source Domain Adaptation.
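The two ways of aggregating the $k$ per-domain adversarial losses can be sketched as follows. This is an illustrative helper, not code from the repo; the function name, the `variant` strings (matching the `-o` flag below), and the `gamma` temperature are assumptions for the sketch:

```python
import torch

def combine_domain_losses(domain_losses, variant="dynamic", gamma=1.0):
    """Combine k per-domain adversarial losses into a single objective.

    domain_losses: 1-D tensor of length k (one loss per source domain).
    variant: "maxmin" (Hard-Max) takes the worst-case domain loss;
             "dynamic" (Soft-Max) uses a smooth log-sum-exp upper bound.
    gamma: temperature for the Soft-Max variant (illustrative parameter).
    """
    if variant == "maxmin":
        # Hard-Max: optimize against the single hardest source domain.
        return domain_losses.max()
    # Soft-Max: smooth approximation of the max that weights every
    # domain, with harder domains receiving exponentially larger weight.
    return torch.logsumexp(gamma * domain_losses, dim=0) / gamma
```

Since log-sum-exp upper-bounds the maximum, the Soft-Max objective is a smooth surrogate for the Hard-Max one, which makes the resulting minimax problem easier to optimize with gradients.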
It is notoriously hard to optimize nonconvex minimax problems. Our goal is to converge to a saddle point. In this code repo we use the double gradient descent method, i.e., the primal-dual gradient method, to optimize the objective function. Intuitively, this means that we apply simultaneous gradient updates to all the components of the model. By contrast, a block coordinate method would fix either the set of $k$ domain classifiers or the feature extractor and the hypothesis, optimize the other until convergence, and then iterate from there.
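The simultaneous-update idea above can be sketched as a single optimizer step over all parameters at once. The tiny linear modules, the layer sizes, and the optimizer choice here are placeholders for illustration, not the actual MDAN architecture:

```python
import torch

# Hypothetical minimal modules standing in for the real MDAN components.
feature_extractor = torch.nn.Linear(10, 8)
hypothesis = torch.nn.Linear(8, 2)                              # task classifier
domain_classifiers = [torch.nn.Linear(8, 2) for _ in range(3)]  # k = 3

# One optimizer over ALL parameters: a single step updates the feature
# extractor, the hypothesis, and all k domain classifiers simultaneously,
# rather than alternating between blocks until convergence.
params = (list(feature_extractor.parameters())
          + list(hypothesis.parameters())
          + [p for d in domain_classifiers for p in d.parameters()])
optimizer = torch.optim.Adadelta(params, lr=1.0)

x = torch.randn(4, 10)
y = torch.randint(0, 2, (4,))

# One simultaneous (primal-dual) update: a single backward pass computes
# gradients for every component; the gradient reversal layer (below) is
# what flips the sign of the adversarial gradient reaching the extractor.
features = feature_extractor(x)
task_loss = torch.nn.functional.cross_entropy(hypothesis(features), y)
# (the k domain adversarial losses would be added to task_loss here)
optimizer.zero_grad()
task_loss.backward()
optimizer.step()
```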
Specifically, we use the well-known gradient reversal layer to implement this method. A PyTorch code snippet is shown below:
```python
class GradientReversalLayer(torch.autograd.Function):
    """
    Gradient reversal layer for the convenience of domain adaptation neural networks.
    The forward pass is the identity function; the backward pass negates the gradient.
    """
    @staticmethod
    def forward(ctx, inputs):
        return inputs

    @staticmethod
    def backward(ctx, grad_output):
        grad_input = -grad_output.clone()
        return grad_input
```
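A quick self-contained check of the reversal behavior (the layer definition is repeated here, in the static-method form required by PyTorch >= 1.0, so the snippet runs on its own):

```python
import torch

class GradientReversalLayer(torch.autograd.Function):
    # Identity on the forward pass; sign-flipped gradient on the backward pass.
    @staticmethod
    def forward(ctx, inputs):
        return inputs

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

x = torch.ones(3, requires_grad=True)
y = GradientReversalLayer.apply(x)  # autograd.Functions are invoked via .apply
y.sum().backward()
print(x.grad)  # the gradient of sum() is all ones, reversed to all -1
```

Placed between the feature extractor and a domain classifier, this layer lets the classifier minimize its loss while the extractor receives the negated gradient, implementing the minimax updates in a single backward pass.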
Python 3.6.6
PyTorch >= 1.0.0
NumPy
SciPy
This section explains how to reproduce the Amazon sentiment analysis experiment in the paper.
Run
python main_amazon.py -o [maxmin|dynamic]
Here maxmin corresponds to the Hard-Max variant and dynamic corresponds to the Soft-Max variant.
Several practical suggestions on training these models:

--mu: the hyperparameter corresponding to the coefficient of the domain adversarial loss. It is dataset dependent and should be chosen appropriately for each dataset.

If you use this code in your research and find it helpful, please cite our paper Multiple Source Domain Adaptation with Adversarial Learning or Adversarial Multiple Source Domain Adaptation:
@inproceedings{zhao2018multiple,
  title={Multiple source domain adaptation with adversarial learning},
  author={Zhao, Han and Zhang, Shanghang and Wu, Guanhang and Moura, Jos{\'e} MF and Costeira, Joao P and Gordon, Geoffrey J},
  booktitle={International Conference on Learning Representations, Workshop Track},
  year={2018}
}
or
@inproceedings{zhao2018adversarial,
  title={Adversarial multiple source domain adaptation},
  author={Zhao, Han and Zhang, Shanghang and Wu, Guanhang and Moura, Jos{\'e} MF and Costeira, Joao P and Gordon, Geoffrey J},
  booktitle={Advances in Neural Information Processing Systems},
  pages={8568--8579},
  year={2018}
}
Please email han.zhao@cs.cmu.edu should you have any questions, comments or suggestions.