PatrickHua / SimSiam

A PyTorch implementation of the paper 'Exploring Simple Siamese Representation Learning'
MIT License

Strange errors when running cifar_experiment.sh #12

Closed liu00222 closed 3 years ago

liu00222 commented 3 years ago

The OS is Ubuntu 18.04. I am using a conda environment, set up as indicated, with all the dependencies in requirements.txt installed.

The script runs fine in debug mode. However, when I ran:

sh configs/cifar_experiment.sh

a strange error occurred during evaluation:

Training: 100%|██████████| 800/800 [6:17:14<00:00, 28.29s/it, epoch=799, loss_avg=-.878]

Evaluating:   0%|          | 0/30 [00:00<?, ?it/s]Model saved to outputs/cifar10_experiment/simsiam-cifar10-epoch800.pth
Files already downloaded and verified

Evaluating:   0%|          | 0/30 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 116, in <module>
    main(args=get_args())
  File "main.py", line 113, in main
    linear_eval(args, backbone)
  File "/home/yl764/SimSiam/SimSiam/linear_eval.py", line 109, in main
    feature = model(images.to(args.device))
  File "/home/yl764/miniconda3/envs/simsiam/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yl764/miniconda3/envs/simsiam/lib/python3.8/site-packages/torchvision/models/resnet.py", line 220, in forward
    return self._forward_impl(x)
  File "/home/yl764/miniconda3/envs/simsiam/lib/python3.8/site-packages/torchvision/models/resnet.py", line 203, in _forward_impl
    x = self.conv1(x)
  File "/home/yl764/miniconda3/envs/simsiam/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yl764/miniconda3/envs/simsiam/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/home/yl764/miniconda3/envs/simsiam/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Thanks!

PatrickHua commented 3 years ago

I merged someone's pull request for DDP and something unexpected happened... Will fix it ASAP!

PatrickHua commented 3 years ago

I think now the problem is gone!
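
For anyone hitting the same RuntimeError: the message says the input tensor is on the GPU (torch.cuda.FloatTensor) while the convolution weights are still on the CPU (torch.FloatTensor). The usual remedy is to move the model to the same device as the input before the forward pass. Below is a minimal sketch of that idea, not necessarily the fix that was applied in this repository. The helper same_device_forward and the toy_backbone model are hypothetical names made up for illustration, and the toy model stands in for the real ResNet backbone.

```python
import torch
import torch.nn as nn


def same_device_forward(model: nn.Module, images: torch.Tensor,
                        device: torch.device) -> torch.Tensor:
    """Hypothetical helper: put weights and input on `device`, then run a forward pass."""
    model = model.to(device)    # moves parameters and buffers of the module
    images = images.to(device)  # returns a tensor on `device`
    model.eval()
    with torch.no_grad():
        return model(images)


# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-in for the backbone (an assumption, not the repository's ResNet).
toy_backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

images = torch.randn(4, 3, 32, 32)  # a CIFAR-sized dummy batch
features = same_device_forward(toy_backbone, images, device)
print(features.shape, features.device)
```

If the model is left on the CPU while only the images are moved with images.to("cuda"), the convolution raises the type-mismatch error shown in the traceback above.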