alinlab / L2T-ww

Learning What and Where to Transfer (ICML 2019)
MIT License

Weight initialization in WeightNetwork and LossWeightNetwork #7

Closed DRJ2016 closed 3 years ago

DRJ2016 commented 4 years ago

https://github.com/alinlab/L2T-ww/blob/adf8ca39ed7ead246995ae60c5a5ea0fd4f46d77/train_l2t_ww.py#L80 Why should the weight of linear layers in WeightNetwork be initialized as 0 here?

When I tried to run `python train_l2t_ww.py --dataset cub200 --datasplit cub200 --dataroot /data/CUB_200_2011`, the accuracy didn't improve with epochs at all; it stayed very low throughout training.

hankook commented 3 years ago

The initialization yields uniform weights over channels at the beginning of training, since we apply a softmax function after the linear layer. https://github.com/alinlab/L2T-ww/blob/adf8ca39ed7ead246995ae60c5a5ea0fd4f46d77/train_l2t_ww.py#L89
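To illustrate the point above: a linear layer with all-zero weights and bias produces zero logits for any input, so the softmax over those logits is uniform. A minimal sketch (using NumPy instead of PyTorch for brevity; the shapes here are arbitrary, not the ones used in the repository):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array
    e = np.exp(x - np.max(x))
    return e / e.sum()

n_channels, n_features = 4, 8

# Zero-initialized linear layer: W = 0, b = 0,
# so the logits are zero regardless of the input.
W = np.zeros((n_channels, n_features))
b = np.zeros(n_channels)

x = np.random.randn(n_features)   # arbitrary feature input
logits = W @ x + b                # all zeros
weights = softmax(logits)         # uniform: 1/n_channels each

print(weights)  # [0.25 0.25 0.25 0.25]
```

In other words, the zero initialization is not arbitrary: it guarantees that at the start of training every channel contributes equally, and the meta-networks then learn to deviate from uniform weighting.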

I also ran the code, and I obtained a result similar to the one reported in our paper. In this experiment, I used a recent PyTorch version (pytorch=1.6.0 and cudatoolkit=10.1).

[2020-10-17 02:02:19,862] [main] [Epoch 199] [val 61.6000] [test 65.8958] [best 65.6369]

So I'm not sure why your experiment fails. Could you check the data path and PyTorch version again?

DRJ2016 commented 3 years ago

Thank you for your reply.