tim-learn / SHOT

Code released for our ICML 2020 paper "Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation"
MIT License

No ReLU between final two FC layers -- object datasets #14

Closed cianeastwood closed 3 years ago

cianeastwood commented 3 years ago

Hello!

Thanks for the great codebase -- I've found it very useful, and a great resource for trying to reproduce the results in your interesting paper!

We're trying to reproduce some of the results and noticed that you stack two FC/linear layers without a nonlinearity in between them. I believe this only happens for the object datasets, between the bottleneck and classifier layers. Is there a reason for this? It seems quite unusual since, without a nonlinearity in between, the two layers can be collapsed into a single equivalent layer.
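For concreteness, here is a minimal sketch of what I mean by the two layers collapsing (this is not code from this repo; the 2048/256/65 sizes are purely illustrative):

```python
import torch
import torch.nn as nn

# Illustrative sizes only: 2048-d backbone features, 256-d bottleneck, 65 classes.
bottleneck = nn.Linear(2048, 256)
classifier = nn.Linear(256, 65)

# With no nonlinearity in between, the stacked pair is a single linear map:
# W = W2 @ W1 and b = W2 @ b1 + b2.
collapsed = nn.Linear(2048, 65)
with torch.no_grad():
    collapsed.weight.copy_(classifier.weight @ bottleneck.weight)
    collapsed.bias.copy_(classifier.weight @ bottleneck.bias + classifier.bias)

x = torch.randn(4, 2048)
print(torch.allclose(collapsed(x), classifier(bottleneck(x)), atol=1e-5))  # True
```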

Thanks for your help! Cian

tim-learn commented 3 years ago

@cianeastwood Thanks for your interest!

Yeah, you are right: two linear layers can be collapsed into a single layer for classification. But for DA, since the output of the backbone network is quite high-dimensional (2,048-d), we introduce a bottleneck layer (256-d) for better feature alignment in a low-dimensional feature space, the same as DANN and CDAN (https://github.com/thuml/CDAN/blob/master/pytorch/network.py).
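Roughly, the head looks like this (just a sketch with illustrative module names and a ResNet-50 backbone assumed; see the repo's network.py for the actual implementation):

```python
import torch.nn as nn
from torchvision import models

class DAHead(nn.Module):
    """Rough sketch: backbone features -> low-dim bottleneck -> linear classifier."""
    def __init__(self, num_classes, bottleneck_dim=256):
        super().__init__()
        resnet = models.resnet50()  # load ImageNet weights in practice
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])  # drop fc; pooled 2048-d output
        self.bottleneck = nn.Linear(2048, bottleneck_dim)             # no ReLU after this layer
        self.classifier = nn.Linear(bottleneck_dim, num_classes)

    def forward(self, x):
        f = self.backbone(x).flatten(1)   # (B, 2048) backbone features
        z = self.bottleneck(f)            # (B, 256) features used for alignment
        return z, self.classifier(z)      # bottleneck features and class logits
```

The alignment then operates on the 256-d bottleneck features rather than the 2,048-d backbone output, which is why the extra linear layer is kept.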

Best

cianeastwood commented 3 years ago

Yeah, I understand that you want to do feature alignment in a lower-dimensional space -- I just don't get why there is no ReLU in between! But OK, I guess it's common in DA. Thanks for your response!