lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
MIT License

ViT and BYOL runtime error during training #69

Closed by suarezjessie 3 years ago

suarezjessie commented 3 years ago

I copied and pasted the code from the README for self-supervised training, and it fails with this error:

Traceback (most recent call last):
  File "train_scratch_byol.py", line 18, in <module>
    hidden_layer = 'to_latent'
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/site-packages/byol_pytorch/byol_pytorch.py", line 210, in __init__
    self.forward(torch.randn(2, 3, image_size, image_size, device=device))
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/site-packages/byol_pytorch/byol_pytorch.py", line 240, in forward
    target_encoder = self._get_target_encoder() if self.use_momentum else self.online_encoder
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/site-packages/byol_pytorch/byol_pytorch.py", line 27, in wrapper
    instance = fn(self, *args, **kwargs)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/site-packages/byol_pytorch/byol_pytorch.py", line 214, in _get_target_encoder
    target_encoder = copy.deepcopy(self.online_encoder)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/copy.py", line 281, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/copy.py", line 241, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/copy.py", line 241, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/copy.py", line 161, in deepcopy
    y = copier(memo)
  File "/usr/local/Caskroom/miniconda/base/envs/thesis_proposal/lib/python3.7/site-packages/torch/tensor.py", line 47, in __deepcopy__
    raise RuntimeError("Only Tensors created explicitly by the user "
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment
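
The final RuntimeError can be reproduced in isolation. The following is a minimal sketch, assuming only that PyTorch is installed. It does not show which attribute inside byol-pytorch held the offending tensor (the thread does not say); it only illustrates that `copy.deepcopy` refuses tensors that are the result of an autograd-tracked operation, and that a detached tensor can be copied. The helper name `try_deepcopy` is hypothetical.

```python
import copy

import torch

# A leaf tensor: created directly by the user, so deepcopy is supported.
leaf = torch.randn(2, 3, requires_grad=True)
leaf_copy = copy.deepcopy(leaf)

# A non-leaf tensor: the output of an operation that autograd is tracking.
non_leaf = leaf * 2


def try_deepcopy(tensor):
    """Hypothetical helper: return the copy, or the error message if refused."""
    try:
        return copy.deepcopy(tensor)
    except RuntimeError as err:
        return str(err)


# For the non-leaf tensor this is the error message, not a tensor.
result = try_deepcopy(non_leaf)
print(result)

# Detaching yields a new leaf tensor with no graph history; it can be copied.
detached_copy = copy.deepcopy(non_leaf.detach())
```

In the traceback above, the same check fires while `copy.deepcopy(self.online_encoder)` walks the module's state dictionaries, which suggests that some tensor stored on the encoder was not a graph leaf at that point.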

lucidrains commented 3 years ago

@suarezjessie I may have introduced a bug in the latest commit to BYOL. Do you want to give version 0.5.4 of byol-pytorch a try?

suarezjessie commented 3 years ago

@lucidrains I used version 0.5.4 and it works like a charm! Thanks!
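
For anyone retrying this, it may help to confirm which byol-pytorch release is active in the environment before running the training script. The sketch below assumes Python 3.8 or newer, because `importlib.metadata` is not in the standard library on the Python 3.7 shown in the traceback. The helper name `installed_version` is hypothetical. Pinning the release mentioned above would normally be done with `pip install byol-pytorch==0.5.4`.

```python
from importlib import metadata  # standard library from Python 3.8 onward


def installed_version(package):
    """Hypothetical helper: return the installed version string, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("byol-pytorch")
print("byol-pytorch:", found if found is not None else "not installed")
```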