Closed DRJ2016 closed 3 years ago
The initialization provides uniform weights of channels at the beginning of training since we use softmax function after the linear layer. https://github.com/alinlab/L2T-ww/blob/adf8ca39ed7ead246995ae60c5a5ea0fd4f46d77/train_l2t_ww.py#L89
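To illustrate the point above: when a linear layer's weight and bias are both zero, its output logits are all zero for any input, so the softmax that follows produces uniform channel weights. A minimal sketch in plain Python (the layer size `n_channels = 4` is just an illustrative assumption):

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of logits
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# A zero-initialized linear layer outputs 0 for every logit,
# regardless of its input.
n_channels = 4  # hypothetical number of channels
logits = [0.0] * n_channels

weights = softmax(logits)
print(weights)  # each entry is 1 / n_channels = 0.25
```

So at the start of training every channel receives the same weight, and the weights only become non-uniform as the linear layer is updated.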
I also tried to run the code, and I obtained a result similar to the one reported in our paper. In this experiment, I used a recent PyTorch version (pytorch=1.6.0 and cudatoolkit=10.1).
[2020-10-17 02:02:19,862] [main] [Epoch 199] [val 61.6000] [test 65.8958] [best 65.6369]
So, I'm not sure why your experiment fails. Could you check the data path and PyTorch version again?
Thank you for your reply.
https://github.com/alinlab/L2T-ww/blob/adf8ca39ed7ead246995ae60c5a5ea0fd4f46d77/train_l2t_ww.py#L80 Why should the weight of the linear layer in WeightNetwork be initialized to 0 here?
When I tried to run
python train_l2t_ww.py --dataset cub200 --datasplit cub200 --dataroot /data/CUB_200_2011
, the accuracy didn't improve with the epochs at all; it stayed very low throughout training.