facebookresearch / vissl

VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
https://vissl.ai
MIT License

Please read & provide the following #525

Open DavidTorpey opened 2 years ago

DavidTorpey commented 2 years ago

Instructions To Reproduce the 🐛 Bug:

  1. what changes you made (git diff) or what code you wrote
     file: main.py

    from omegaconf import OmegaConf
    from vissl.utils.hydra_config import AttrDict
    from vissl.utils.hydra_config import compose_hydra_configuration, convert_to_attrdict
    from vissl.models import build_model
    from classy_vision.generic.util import load_checkpoint
    from vissl.utils.checkpoint import init_model_from_consolidated_weights
    import torch
    import torch.nn as nn

    cfg = [
        'config=pretrain/pirl/models/resnet50_mlphead.yaml',
        'config.MODEL.WEIGHTS_INIT.PARAMS_FILE=./model_final_checkpoint_phase199.torch',
        'config.MODEL.FEATURE_EVAL_SETTINGS.EVAL_MODE_ON=True', # Turn on model evaluation mode.
        'config.MODEL.FEATURE_EVAL_SETTINGS.FREEZE_TRUNK_ONLY=False', # Freeze trunk.
        'config.MODEL.FEATURE_EVAL_SETTINGS.EXTRACT_TRUNK_FEATURES_ONLY=True', # Extract the trunk features, as opposed to the HEAD.
        'config.MODEL.FEATURE_EVAL_SETTINGS.SHOULD_FLATTEN_FEATS=True', # Do not flatten features.
    ]

    cfg = compose_hydra_configuration(cfg)
    _, cfg = convert_to_attrdict(cfg)

    model = build_model(cfg.MODEL, cfg.OPTIMIZER)

    weights = load_checkpoint(checkpoint_path=cfg.MODEL.WEIGHTS_INIT.PARAMS_FILE)

    init_model_from_consolidated_weights(
        config=cfg,
        model=model,
        state_dict=weights,
        state_dict_key_name="classy_state_dict",
        skip_layers=[],
    )

    m = model.trunk
    p = m(torch.randn((2, 3, 224, 224)))
    print(p.shape)

  2. what exact command you run:

    wget https://dl.fbaipublicfiles.com/vissl/model_zoo/pirl_jigsaw_4node_200ep_pirl_jigsaw_4node_resnet_22_07_20.ffd17b75/model_final_checkpoint_phase199.torch
    python main.py

  3. what you observed (including __full logs__):

    Traceback (most recent call last):
      File "pirl_resnet50.py", line 46, in <module>
        p=m(torch.randn((2, 3, 224, 224)))
      File "/home/david/ssl-analysis/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/mnt/data/home/david/vissl/vissl/models/trunks/resnext.py", line 184, in forward
        out = get_trunk_forward_outputs(
      File "/mnt/data/home/david/vissl/vissl/models/model_helpers.py", line 446, in get_trunk_forward_outputs
        out_feat_keys = [feature_mapping[f] for f in out_feat_keys]
    TypeError: 'NoneType' object is not iterable

4. please simplify the steps as much as possible so they do not require additional resources to
   run, such as a private dataset.
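
The last frame of the traceback in step 3 iterates over `out_feat_keys`, which is `None` when the trunk is called directly with only an input tensor. A minimal sketch of the same failure pattern follows. It does not use VISSL: `map_feature_keys` and the contents of `mapping` are hypothetical stand-ins that only mirror the shape of the failing line.

```python
from typing import Dict, List, Optional


def map_feature_keys(
    out_feat_keys: Optional[List[str]], feature_mapping: Dict[str, str]
) -> List[str]:
    # Same shape as the failing line: translate each requested key via the mapping.
    return [feature_mapping[f] for f in out_feat_keys]


# Hypothetical mapping, for illustration only.
mapping = {"res5": "layer4"}

# With an explicit list of keys the comprehension works.
print(map_feature_keys(["res5"], mapping))

# With None (no keys requested) it raises the TypeError seen in the traceback.
try:
    map_feature_keys(None, mapping)
except TypeError as err:
    message = str(err)
    print(message)
```

This suggests the trunk expects the caller to supply the list of feature names to extract, which is presumably what the full model does when it is called instead of `model.trunk`.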

## Expected behavior:

If there is no obvious error in "what you observed" provided above,
please tell us the expected behavior.

## Environment:

Provide your environment information using the following command:

wget -nc -q https://github.com/facebookresearch/vissl/raw/main/vissl/utils/collect_env.py && python collect_env.py


    sys.platform         linux
    Python               3.8.5 (default, Jul 28 2020, 12:59:40) [GCC 9.3.0]
    numpy                1.19.5
    Pillow               8.2.0
    vissl                0.1.6 @/mnt/data/home/david/vissl/vissl
    GPU available        True
    GPU 0,1              Quadro RTX 8000
    CUDA_HOME            /usr
    torchvision          0.9.1+cu102 @/home/david/ssl-analysis/venv/lib/python3.8/site-packages/torchvision
    hydra                1.0.7 @/home/david/ssl-analysis/venv/lib/python3.8/site-packages/hydra
    classy_vision        0.7.0.dev @/home/david/ssl-analysis/venv/lib/python3.8/site-packages/classy_vision
    tensorboard          2.4.1
    apex                 unknown
    cv2                  4.5.2
    PyTorch              1.8.1+cu102 @/home/david/ssl-analysis/venv/lib/python3.8/site-packages/torch
    PyTorch debug build  False


PyTorch built with:

CPU info:


    Architecture                      x86_64
    CPU op-mode(s)                    32-bit, 64-bit
    Byte Order                        Little Endian
    Address sizes                     43 bits physical, 48 bits virtual
    CPU(s)                            128
    On-line CPU(s) list               0-127
    Thread(s) per core                2
    Core(s) per socket                64
    Socket(s)                         1
    NUMA node(s)                      1
    Vendor ID                         AuthenticAMD
    CPU family                        23
    Model                             49
    Model name                        AMD Ryzen Threadripper 3990X 64-Core Processor
    Stepping                          0
    Frequency boost                   enabled
    CPU MHz                           3112.329
    CPU max MHz                       2900.0000
    CPU min MHz                       2200.0000
    BogoMIPS                          5800.14
    Virtualization                    AMD-V
    L1d cache                         2 MiB
    L1i cache                         2 MiB
    L2 cache                          32 MiB
    L3 cache                          256 MiB
    NUMA node0 CPU(s)                 0-127
    Vulnerability Itlb multihit       Not affected
    Vulnerability L1tf                Not affected
    Vulnerability Mds                 Not affected
    Vulnerability Meltdown            Not affected
    Vulnerability Spec store bypass   Mitigation; Speculative Store Bypass disabled via prctl and seccomp
    Vulnerability Spectre v1          Mitigation; usercopy/swapgs barriers and __user pointer sanitization
    Vulnerability Spectre v2          Mitigation; Full AMD retpoline, IBPB conditional, STIBP conditional, RSB filling
    Vulnerability Srbds               Not affected
    Vulnerability Tsx async abort     Not affected



## When to expect Triage

VISSL devs and contributors aim to triage issues ASAP; however, as a general guideline, we ask users to expect triaging in 1-2 weeks.

iseessel commented 2 years ago

The issues are most likely with the following options:

  'config.MODEL.FEATURE_EVAL_SETTINGS.EVAL_MODE_ON=True', # Turn on model evaluation mode.
  'config.MODEL.FEATURE_EVAL_SETTINGS.FREEZE_TRUNK_ONLY=False', # Freeze trunk.
  'config.MODEL.FEATURE_EVAL_SETTINGS.EXTRACT_TRUNK_FEATURES_ONLY=True', # Extract the trunk features, as opposed to the HEAD.
  'config.MODEL.FEATURE_EVAL_SETTINGS.SHOULD_FLATTEN_FEATS=True', # Do not flatten features.

Can you look at the following tutorial and let me know if this helps? What exactly are you trying to accomplish? https://github.com/facebookresearch/vissl/blob/main/tutorials/Feature_Extraction_V0_1_6.ipynb

@DavidTorpey

DavidTorpey commented 2 years ago

Hi @iseessel Thanks for the help.

I'm essentially trying to extract a pre-trained backbone from a vissl model, add my own classification layer, and fine-tune it on my own dataset using a vanilla PyTorch training loop (i.e. not using vissl for this fine-tuning).

How can I achieve this?
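
One way the goal described above could look in plain PyTorch is sketched below. This is an illustration under assumptions, not a confirmed VISSL recipe: `make_stand_in_backbone` is a hypothetical placeholder for the loaded trunk, `FineTuneClassifier` and `train_one_epoch` are names invented here, and the random tensors stand in for a real dataset. The wrapper also assumes the backbone may return either a single tensor or a list of feature tensors, and takes the last entry in the latter case.

```python
import torch
import torch.nn as nn


class FineTuneClassifier(nn.Module):
    """Wraps a feature-extracting backbone with a new linear classification head."""

    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)
        # Assumption: some trunks return a list of feature tensors; use the last one.
        if isinstance(feats, (list, tuple)):
            feats = feats[-1]
        # Flatten (N, C, 1, 1) pooled features to (N, C) before the linear head.
        return self.head(torch.flatten(feats, start_dim=1))


def make_stand_in_backbone() -> nn.Module:
    # Hypothetical placeholder, NOT the VISSL ResNet-50 trunk.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
    )


def train_one_epoch(model, batches, optimizer, loss_fn) -> float:
    """Runs one pass over the batches and returns the mean training loss."""
    model.train()
    total = 0.0
    for images, labels in batches:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
        total += loss.item()
    return total / len(batches)


torch.manual_seed(0)
model = FineTuneClassifier(make_stand_in_backbone(), feat_dim=8, num_classes=5)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Random stand-in data: two batches of four 32x32 RGB images with labels in [0, 5).
batches = [(torch.randn(4, 3, 32, 32), torch.randint(0, 5, (4,))) for _ in range(2)]
avg_loss = train_one_epoch(model, batches, optimizer, nn.CrossEntropyLoss())

model.eval()
with torch.no_grad():
    logits = model(torch.randn(2, 3, 32, 32))
print(logits.shape)
```

With a real pre-trained trunk, `feat_dim` would need to match the trunk's output width, and the optimizer could be given only `model.head.parameters()` if the backbone should stay frozen.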