ViCCo-Group / thingsvision

Python package for extracting representations from state-of-the-art computer vision models
https://vicco-group.github.io/thingsvision/
MIT License

Wrong layers (names?) when extracting all conv layers from AlexNet. #145

Closed: spaladin closed this issue 1 year ago

spaladin commented 1 year ago

Hi!

I am using the Colab notebook for PyTorch. I tried AlexNet by changing some parameters in the notebook example (VGG-16 with batch norm pretrained on ImageNet). I wanted to extract activations from all the convolutional layers, so I used the following code modification:

import torch
import torch.nn as nn

from thingsvision import get_extractor

pretrained = True
model_path = None
batch_size = 32
apply_center_crop = True
flatten_activations = True
class_names = None
file_names = None
file_format = "txt"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model_name = 'alexnet'
source = 'torchvision'

extractor = get_extractor(
    model_name=model_name,
    pretrained=pretrained,
    model_path=model_path,
    device=device,
    source=source
)
layer = nn.Conv2d
features_conv_layers = extract_all_layers(
    model_name=model_name,
    extractor=extractor,
    image_path=full_image_path,
    out_path=full_output_path,
    batch_size=batch_size,
    flatten_activations=flatten_activations,
    apply_center_crop=apply_center_crop,
    layer=layer,
    file_format=file_format,
    class_names=class_names,
    file_names=file_names,
)

The number of extracted layers (5) is correct; however, their names are wrong. They also don't seem to match the activations I get when I extract the layers one by one (see the single-layer sketch after the output below).

import pprint
pprint.pprint(features_conv_layers)
> {'layer_03': array([[ 0.12578183, -1.6219227 , -1.70176   , ..., -0.09369432,
>          0.08984605,  0.53728145],
>        [-0.7437832 ,  0.4840992 , -1.0749123 , ..., -0.5906651 ,
>         -0.0103681 ,  0.85010654],
>        [-7.1971655 , -7.940407  , -6.6123657 , ..., -0.03713347,
>          0.17685306,  0.20328899],
>        ...,
>        [-0.11900061,  0.45805246, -0.24691838, ...,  0.43640044,
>          0.60965   ,  0.15608627],
>        [-3.3795178 , -0.8164063 ,  1.241787  , ..., -0.15670502,
>          0.37275872, -1.1117904 ],
>        [-1.8591139 ,  0.42655402, -1.2627366 , ...,  0.13208538,
>          0.7406758 , -0.08220328]], dtype=float32),
>  'layer_06': array([[  0.9620508 ,   0.30162707,  -0.1934116 , ...,   1.858568  ,
>           1.1076739 ,   2.6782532 ],
>        [ -2.981957  ,  -2.7508926 ,   2.3697217 , ...,  -3.5808048 ,
>          -3.423941  ,  -0.64591765],
>        [  4.2801642 ,  -0.83435506,   2.985959  , ...,   6.7334747 ,
>           6.2881317 ,   5.9027047 ],
>        ...,
>        [  1.0700345 ,   3.1900558 ,  10.337312  , ..., -15.291744  ,
>         -13.26463   ,  -8.463119  ],
>        [  4.113041  ,  -0.5327738 ,  -7.635279  , ..., -13.555077  ,
>         -11.066146  ,  -6.3253107 ],
>        [  7.7316403 ,   4.5360703 ,   2.8130543 , ...,  -1.4696633 ,
>          -1.934596  ,  -2.111609  ]], dtype=float32),
>  'layer_09': array([[ -2.7296655 ,  -3.8743975 ,  -4.42631   , ...,  -1.769564  ,
>          -9.306692  ,  -3.7036123 ],
>        [ -0.04922827,   8.507316  ,   8.73079   , ...,  -1.6609691 ,
>          -2.1132905 ,   1.4741383 ],
>        [  1.260469  ,   2.5003388 ,   6.413993  , ..., -13.279748  ,
>         -11.688034  ,  -0.75490063],
>        ...,
>        [ -5.667928  ,  -4.167725  ,  -3.8533154 , ...,  -5.4969683 ,
>          -5.595151  ,  -4.3243084 ],
>        [-13.667816  , -12.076218  , -13.503252  , ...,  -9.637596  ,
>         -10.82891   ,  -7.7699695 ],
>        [ -3.5512943 ,  -6.506268  ,  -8.565321  , ...,  -8.342162  ,
>          -5.3187623 ,  -3.7092364 ]], dtype=float32),
>  'layer_11': array([[ -1.8959136 ,  -0.49840185,   1.8974366 , ...,  -0.53857136,
>          -2.0079157 ,  -0.99538684],
>        [ -6.2139473 ,  -6.6683593 ,   0.59373176, ...,  -2.49711   ,
>          -0.7762851 ,   5.3251452 ],
>        [  0.98107946,   2.1276023 ,   0.21201959, ...,  -5.2047086 ,
>          -2.7088084 ,  -3.1409965 ],
>        ...,
>        [  0.8351224 ,  -2.8610208 ,  -0.6101372 , ...,  -7.19401   ,
>          -6.187467  ,  -3.8016884 ],
>        [ -0.98556584,   3.609405  ,   5.1768856 , ..., -12.374811  ,
>         -10.80249   ,  -9.2378645 ],
>        [ -1.0655197 ,  -2.2648356 ,   0.3327253 , ...,  -1.8608872 ,
>          -1.464742  ,   2.7857661 ]], dtype=float32),
>  'layer_13': array([[-5.751322 , -7.13288  , -7.124266 , ..., -5.815998 , -5.2186804,
>         -1.9163358],
>        [-3.6160367, -5.2578096, -4.3458047, ..., -6.0499434, -5.955589 ,
>         -3.924894 ],
>        [-1.8869516, -8.714518 , -6.626626 , ...,  0.0618896, -6.0121365,
>         -7.5943756],
>        ...,
>        [-1.597339 , -2.036482 , -2.903429 , ..., -8.830565 , -8.896645 ,
>         -8.055616 ],
>        [-2.4018128, -5.483816 , -5.364626 , ..., -7.1737504, -6.7403235,
>         -4.4299064],
>        [-6.647602 , -5.406938 , -3.681465 , ..., -3.7315614, -4.8392906,
>         -4.518252 ]], dtype=float32)}
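For reference, the layer-by-layer comparison mentioned above looks roughly like this (a sketch using the notebook's extract_features helper; treating 'features.0' as the name of AlexNet's first conv layer is an assumption based on torchvision's module naming):

features_first_conv = extract_features(
    extractor=extractor,
    module_name='features.0',  # assumed name of the first Conv2d in torchvision's AlexNet
    image_path=full_image_path,
    out_path=full_output_path,
    batch_size=batch_size,
    flatten_activations=flatten_activations,
    apply_center_crop=apply_center_crop,
    class_names=class_names,
    file_names=file_names,
)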
LukasMut commented 1 year ago

Hi there! This happens because the keys of the features_per_layer dictionary, which stores the activations for each module in the extract_all_layers convenience function (see below), are custom names rather than the original module names. You can replace features_per_layer[f'layer_{l:02d}'] with features_per_layer[f'{module_name}'] to use the original module names. See my comments in the function below.

from typing import Any, Dict, List, Optional, Union

import numpy as np
import torch
import torch.nn as nn


def extract_all_layers(
    model_name: str,
    extractor: Any,
    image_path: str,
    out_path: str,
    batch_size: int,
    flatten_activations: bool,
    apply_center_crop: bool,
    layer: Any = nn.Linear,
    file_format: str = "npy",
    class_names: Optional[List[str]] = None,
    file_names: Optional[List[str]] = None,
) -> Dict[str, Union[np.ndarray, torch.Tensor]]:
    """Extract features for all selected layers and save them to disk."""
    features_per_layer = {}
    for l, (module_name, module) in enumerate(extractor.model.named_modules(), start=1):
        if isinstance(module, layer):
            # extract features for layer "module_name"
            features = extract_features(
                extractor=extractor,
                module_name=module_name,
                image_path=image_path,
                out_path=out_path,
                batch_size=batch_size,
                flatten_activations=flatten_activations,
                apply_center_crop=apply_center_crop,
                class_names=class_names,
                file_names=file_names,
            )
            # NOTE: the key below is a custom name; to use the original module
            # names instead, write features_per_layer[f'{module_name}'] = features
            # (or pick your own scheme, e.g. f'conv_{l:02d}' or f'fc_{l:02d}')
            features_per_layer[f'layer_{l:02d}'] = features
            # save the features to disk
            save_features(features, out_path=f'{out_path}/features_{model_name}_{module_name}', file_format=file_format)
    return features_per_layer
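For context, the custom keys come from enumerating all named modules (not just the conv layers), so for torchvision's AlexNet they map back to the original module names as follows (a minimal sketch run against the extractor's underlying model):

for l, (module_name, module) in enumerate(extractor.model.named_modules(), start=1):
    if isinstance(module, nn.Conv2d):
        print(f'layer_{l:02d} -> {module_name}')

# for torchvision's AlexNet this should print:
# layer_03 -> features.0
# layer_06 -> features.3
# layer_09 -> features.6
# layer_11 -> features.8
# layer_13 -> features.10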
spaladin commented 1 year ago

Hi Lukas! Thanks for the quick response!

I guess I am missing something, but after replacing the line as suggested, I get only one key named 'module_name', like this:

> {'module_name': array([[-5.751324 , -7.1328826, -7.124266 , ..., -5.815997 , -5.2186813,
>         -1.916337 ],
>        [-3.6160486, -5.257819 , -4.3458114, ..., -6.049942 , -5.9555883,
>         -3.9248927],
>        [-1.8869545, -8.714518 , -6.6266246, ...,  0.0618932, -6.012134 ,
>         -7.594375 ],
>        ...,
>        [-1.5973396, -2.0364807, -2.9034288, ..., -8.830564 , -8.896644 ,
>         -8.055617 ],
>        [-2.4018128, -5.4838195, -5.3646264, ..., -7.173752 , -6.7403235,
>         -4.4299088],
>        [-6.647603 , -5.406933 , -3.6814663, ..., -3.7315552, -4.8392887,
>         -4.5182486]], dtype=float32)}
LukasMut commented 1 year ago

I am sorry. It should read features_per_layer[f'{module_name}'] rather than features_per_layer[f'module_name']. This was just a typo in my previous comment (I've corrected it).
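For clarity, the corrected assignment inside extract_all_layers is simply the following (the expected AlexNet keys in the comment are based on torchvision's module naming):

# use the original module name as the dictionary key
features_per_layer[f'{module_name}'] = features
# for torchvision's AlexNet the conv-layer keys then become:
# 'features.0', 'features.3', 'features.6', 'features.8', 'features.10'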

spaladin commented 1 year ago

Great, thank you, that works!