fungtion/DANN_py3

Python 3 / PyTorch implementation of DANN
MIT License

The domain classifier loss is not decreasing. #12

Open · zhongpeixiang opened this issue 1 year ago

zhongpeixiang commented 1 year ago
[image: training curves; the domain classifier loss is nearly flat throughout]

As shown in the image above, the domain classifier loss is almost constant throughout the training process. I use a ViT as the feature extractor, a linear layer as the label classifier, and a two-layer MLP as the domain classifier.
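For reference, the setup looks roughly like this (a minimal sketch, not my actual code: the torchvision backbone, layer sizes, and class count are placeholders, and the reversal layer mirrors the standard DANN one):

```python
import torch
import torch.nn as nn
from torch.autograd import Function
from torchvision.models import vit_b_16  # placeholder backbone


class GradientReversal(Function):
    """Identity on the forward pass; scales the gradient by -alpha on the backward pass."""

    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.alpha, None


class DANN(nn.Module):
    def __init__(self, feat_dim=768, num_classes=10, hidden=256):
        super().__init__()
        self.features = vit_b_16(weights=None)
        self.features.heads = nn.Identity()  # expose the 768-d CLS embedding as features
        self.label_classifier = nn.Linear(feat_dim, num_classes)
        self.domain_classifier = nn.Sequential(  # two-layer MLP, as described above
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, x, alpha=1.0):
        f = self.features(x)
        class_logits = self.label_classifier(f)
        # Gradient reversal: the domain loss trains the domain classifier,
        # but pushes the feature extractor toward domain-invariant features.
        domain_logits = self.domain_classifier(GradientReversal.apply(f, alpha))
        return class_logits, domain_logits
```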

What are the possible causes, and what does a typical domain classifier loss curve look like?

Thanks

cs-mshah commented 1 year ago

I am facing this issue too. The DANN loss converges quite fast. Here are the plots from running this repository without any changes to the code: wandb-DANN. In another experiment I tried a ResNet-18 backbone with a domain classifier built like the classifier head of the ResNet; there too the loss almost instantly stabilised at around 0.7. Is there a good repository where we can clearly understand how DANN works, with a clear, practical working example of stable training?
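One thing I would double-check when porting to a custom model is the gradient-reversal coefficient schedule. The DANN paper ramps the coefficient from 0 to 1 over training so the domain classifier learns on meaningful features before the adversarial signal reaches full strength; starting at full strength can pin the domain loss near its equilibrium value from step one. A sketch of the schedule (function and variable names are mine):

```python
import numpy as np


def grl_alpha(step, total_steps, gamma=10.0):
    """GRL coefficient schedule from the DANN paper:
    2 / (1 + exp(-gamma * p)) - 1, where p is training progress.
    Ramps smoothly from 0 to 1 as training proceeds."""
    p = step / total_steps  # training progress in [0, 1]
    return 2.0 / (1.0 + np.exp(-gamma * p)) - 1.0
```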

taotaowang97479 commented 4 months ago

> I am facing this issue too. The DANN loss converges quite fast. Here are the plots from running this repository without any changes to the code: wandb-DANN. In another experiment I tried a ResNet-18 backbone with a domain classifier built like the classifier head of the ResNet; there too the loss almost instantly stabilised at around 0.7. Is there a good repository where we can clearly understand how DANN works, with a clear, practical working example of stable training?

I had the same problem. With the code and dataset provided by the author, training was fine: as training progressed, the domain classifier loss stabilised at 0.65-0.67 (training loss shown in the figure: train_loss). However, with my own network and the same training procedure, the domain classifier loss sits at 0.69 from the very beginning, which seems to indicate that the domain classifier is not learning: it classifies source and target samples at random, with 50% probability each. Can anyone figure out how to fix this?
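Note that 0.69 is ln 2 ≈ 0.693, the cross-entropy of a constant 50/50 binary prediction, so a domain loss pinned there does mean chance-level output. A quick sanity check is to log the domain classifier's accuracy alongside its loss (a sketch; the tensor names are placeholders):

```python
import math

import torch

print(math.log(2))  # 0.6931... -- the domain loss of a coin-flip classifier


@torch.no_grad()
def domain_accuracy(domain_logits, domain_labels):
    """Fraction of source/target samples the domain classifier gets right.
    ~0.5 means its predictions are at chance on the current features."""
    preds = domain_logits.argmax(dim=1)
    return (preds == domain_labels).float().mean().item()
```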

cs-mshah commented 4 months ago

I have since shifted to a more robust codebase: https://github.com/thuml/Transfer-Learning-Library
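For anyone landing here, the DANN pieces there look roughly like this (a hedged sketch; the import paths and signatures follow the tllib version I used, so check their DANN example if the API has moved):

```python
import torch
from tllib.alignment.dann import DomainAdversarialLoss
from tllib.modules.domain_discriminator import DomainDiscriminator

feat_dim = 512  # e.g. ResNet-18 features; adjust to your backbone
domain_discri = DomainDiscriminator(in_feature=feat_dim, hidden_size=1024)
dann_loss = DomainAdversarialLoss(domain_discri)

f_s = torch.randn(32, feat_dim)  # source-domain features from the backbone
f_t = torch.randn(32, feat_dim)  # target-domain features from the backbone
transfer_loss = dann_loss(f_s, f_t)  # gradient reversal + discriminator loss in one call
```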

taotaowang97479 commented 4 months ago

> I have since shifted to a more robust codebase: https://github.com/thuml/Transfer-Learning-Library

So is it the code itself? I went to this codebase and looked at their DANN code, and I didn't see a big difference in how it is written.