Hi @FloCF,
It might be coming from the fact that we did not release a new version of VISSL containing the Barlow Twins commit f63a0cee9e3b0c3ed826356210415b6db0be833c, so pip install vissl will not install the vissl.optimizers.lars module.
However, I am surprised that we would be pickling some actual modules and not just weights.
Could you please try, on your side, loading the checkpoint with an installation from source (https://github.com/facebookresearch/vissl/blob/master/INSTALL.md#Install-from-source-in-PIP-environment)?
On our side, we need to verify why we are pickling modules in the checkpoints, as this is not a good pattern.
CC: @prigoyal
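As a side note on the pickling point above, here is a minimal, self-contained illustration (not VISSL code; the my_filter name is invented for the example) of why a pickled function reference ties a checkpoint to the module that defined it:
# pickle stores a function as "module.qualified_name", so whoever loads the file
# must be able to import that module again.
import pickle

def my_filter(p):
    return p.ndim == 1

blob = pickle.dumps({"exclude": my_filter})
print(pickle.loads(blob))  # works here, because __main__.my_filter is importable
# If the same dict were pickled from vissl.optimizers.lars and then loaded in an
# environment without that module, unpickling would raise ModuleNotFoundError,
# which is the error reported in this issue.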
Hi @FloCF, thank you so much for reaching out about this.
It indeed seems like this model checkpoint somehow requires vissl.optimizers.lars, which shouldn't be the case. I can repro the issue as well. Assigning to @jingli9111 to help look into this :)
Hi @QuentinDuval,
I tried the installation from source in Colab with the following code:
!pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 -f https://download.pytorch.org/whl/torch_stable.html
!pip install -f https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/py37_cu101_pyt171/download.html apex
# clone vissl repository
!git clone --recursive https://github.com/facebookresearch/vissl.git
# install vissl dependencies
!pip install --progress-bar off -r vissl/requirements.txt
!pip install opencv-python
# update classy vision install to current master
!pip uninstall -y classy_vision
!pip install classy-vision@https://github.com/facebookresearch/ClassyVision/tarball/master
# install vissl dev mode (e stands for editable)
!cd vissl && pip install -e ".[dev]"
# Download model
!wget "https://dl.fbaipublicfiles.com/vissl/model_zoo/barlow_twins/barlow_twins_32gpus_4node_imagenet1k_1000ep_resnet50.torch"
import vissl
import apex
import torch
# Load Barlow Twins weights
barlow_twins = torch.load('barlow_twins_32gpus_4node_imagenet1k_1000ep_resnet50.torch')
This time I got the following error:
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _load(zip_file, map_location, pickle_module, pickle_file, **pickle_load_args)
851 unpickler = pickle_module.Unpickler(data_file, **pickle_load_args)
852 unpickler.persistent_load = persistent_load
--> 853 result = unpickler.load()
854
855 torch._utils._validate_loaded_sparse_tensors()
ModuleNotFoundError: No module named 'vissl.optimizers'
I guess this is just a minor issue with the way you saved the Barlow Twins checkpoint. Nevertheless, vissl is great and I am amazed by all the incredible work in SSL coming from FAIR!
Looks like the bug comes from the fact that the Barlow Twins checkpoint contains the optimizer state, and barlow_twins['classy_state_dict']['optimizer']['optim']['param_groups'] has an attribute 'exclude': <function _LARS._exclude_bias_and_norm at 0x7f98dd4b4f80>.
This is an optional function for LARS to exclude biases and batch-norm parameters. This part of the code simply followed https://github.com/facebookresearch/barlowtwins/blob/e6f34a01c0cde6f05da6f431ef8a577b42e94e71/main.py#L228
The fix should be to write this attribute as a boolean instead of storing a function.
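Until that fix is released, a possible user-side workaround (only a sketch, not the official fix: the stub _LARS class below is an invented stand-in whose sole purpose is to let pickle resolve the reference, and the p.ndim == 1 behaviour is an assumption) is to register a dummy vissl.optimizers.lars module, load the checkpoint, drop the optimizer state, and re-save a file that no longer depends on that module:
import sys
import types
import torch

# The stub is only meant for an environment where vissl.optimizers.lars is
# missing (e.g. the current pip release); it lets the unpickler resolve
# vissl.optimizers.lars._LARS._exclude_bias_and_norm.
class _LARS:
    @staticmethod
    def _exclude_bias_and_norm(p):
        return p.ndim == 1  # assumed behaviour: skip 1-D params (biases, norms)

stub = types.ModuleType("vissl.optimizers.lars")
stub._LARS = _LARS
sys.modules.setdefault("vissl.optimizers", types.ModuleType("vissl.optimizers"))
sys.modules["vissl.optimizers.lars"] = stub

ckpt = torch.load(
    "barlow_twins_32gpus_4node_imagenet1k_1000ep_resnet50.torch",
    map_location="cpu",
)
# The 'optimizer' entry is what carries the pickled function reference;
# dropping it leaves the model weights in classy_state_dict untouched.
ckpt["classy_state_dict"].pop("optimizer", None)
torch.save(ckpt, "barlow_twins_weights_only.torch")
The re-saved checkpoint should then load without vissl.optimizers.lars being importable, but treat this as untested scaffolding rather than a supported path.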
This should be closed by: https://github.com/facebookresearch/vissl/commit/43f230cd05a700426e21b7b79cb018d97198f370
Congrats and many thanks for this awesome repo!
Trying to load the Barlow Twins weights yields the following No module named 'vissl.optimizers.lars' error: