Converting IR-CSN-50 Caffe2 to PyTorch + more IR-CSN backbone sizes

tchang1997 commented 3 years ago

Problem: Can't convert the IR-CSN-50 Caffe2 model to PyTorch using a modified version of the conversion script.

Expected behavior: The Caffe2 model is converted to .pth format successfully.

Actual behavior: A shape mismatch occurs.

Questions:

Is there something wrong with the *.pkl file storing the IR-CSN-50 model weights, or have I misspecified something?
Alternatively (circumventing this entirely), are there pre-trained versions of IR-CSN-50 (or other backbone sizes besides 152) that are PyTorch-ready and that we can use?

Details: I'm trying to convert the Caffe2 IR-CSN-50 checkpoint given here into the equivalent PyTorch model. I've run the conversion script on one of the provided IR-CSN-152 checkpoints successfully, but I can't seem to make the IR-CSN-50 conversion work.

I added the following code to utilities/model_conversion/conversion_models.py:

def ir_csn_50(pretrained=False, progress=False, **kwargs):
    model = _video_resnet("ir_csn_50",
                           False,
                           False,
                           block=Bottleneck,
                           conv_makers=[Conv3DDepthwise] * 4,
                           layers=[3, 8, 6, 3],
                           stem=BasicStem_Pool,
                           **kwargs)  
    for m in model.modules(): 
        if isinstance(m, nn.BatchNorm3d): 
            m.eps = 1e-3
    return model

This is directly based on the ir_csn_152 function, replacing terms where necessary. I got the layer sizes from the ip_csn_50 function. I also added ir_csn_50 to __all__.

However, when I run the following command (based on these commands) to convert the model...

python utilities/model_conversion/convert_models.py \
--pkl /path/to/my/models/kinetics400/irCSN_50_ft_kinetics_from_ig65m_f233743920.pkl \
--out /path/to/my/models/kinetics400/irCSN_50_ft_kinetics_from_ig65m_f233743920 \
--model "ir_csn_50" \
--frames 32 --inputsize 224 --classes 400

I run into the following error:

Traceback (most recent call last):
  File "utilities/model_conversion/convert_models.py", line 336, in <module>
    main(args)
  File "utilities/model_conversion/convert_models.py", line 277, in main
    i = copy_layer(model.layer2, blobs, I)
  File "utilities/model_conversion/convert_models.py", line 209, in copy_layer
    copy_bottleneck(curr_block, blobs, I)
  File "utilities/model_conversion/convert_models.py", line 134, in copy_bottleneck
    copy_conv3d(module.conv1, blobs, i, 1)
  File "utilities/model_conversion/convert_models.py", line 73, in copy_conv3d
    copy_conv(module[0], blobs, "comp_" + str(i) + "_conv_" + str(j))
  File "utilities/model_conversion/convert_models.py", line 59, in copy_conv
    copy_tensor(module.weight.data, blobs, prefix + "_w")
  File "utilities/model_conversion/convert_models.py", line 50, in copy_tensor
    assert data.size() == tensor.size()
AssertionError

Upon further inspection, the blob named comp_7_conv_1_w has size [256, 512, 1, 1, 1], but it is expected to have size [128, 512, 1, 1, 1].

Thanks in advance!

tchang1997 commented 3 years ago

Update: for converting the models, the ResNet-50 block sizes should be [3, 4, 6, 3], not [3, 8, 6, 3]. The conversion works after changing the layers argument in the ir_csn_50 code above accordingly.
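
For anyone hitting the same assertion, here is a rough sketch of why the wrong layers list produces exactly the mismatch reported for comp_7_conv_1_w. It is not taken from the conversion script. It assumes standard ResNet bottleneck widths (64, 128, 256, 512), an expansion factor of 4, a 64-channel stem output, and that the comp_N blobs number the blocks consecutively from 0. The names PLANES, EXPANSION, and conv1_shape are made up for illustration.

```python
# Assumed bottleneck widths per stage and channel expansion (standard ResNet values).
PLANES = [64, 128, 256, 512]
EXPANSION = 4


def conv1_shape(block_index, layers):
    """Return the [out, in, 1, 1, 1] weight shape of the first 1x1x1 conv
    in the block with the given global index, for a given layers list."""
    in_channels = 64  # assumed stem output width
    seen = 0
    for planes, count in zip(PLANES, layers):
        for _ in range(count):
            if seen == block_index:
                return [planes, in_channels, 1, 1, 1]
            # Each bottleneck block outputs planes * EXPANSION channels.
            in_channels = planes * EXPANSION
            seen += 1
    raise IndexError("block_index is out of range for this layers list")


# Shape the PyTorch model expects for block 7 with the mistaken layers list.
model_shape = conv1_shape(7, [3, 8, 6, 3])
# Shape stored in the checkpoint, which follows the ResNet-50 layout.
blob_shape = conv1_shape(7, [3, 4, 6, 3])
print(model_shape, blob_shape)
```

Under these assumptions, block 7 is still inside layer2 (width 128) when layers is [3, 8, 6, 3], but it is the first block of layer3 (width 256) when layers is [3, 4, 6, 3], which matches the two sizes quoted above.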

nhatnxn commented 3 years ago

I have the same problem. The conversion succeeds for irCSN_152 but fails for irCSN_50 with KeyError: 'comp_0_conv_2_middle_w'.
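
One way to narrow down a KeyError like this is to list which blob names the checkpoint actually contains for the block in question and compare them with the name the script asks for. The sketch below is only a diagnostic aid, not part of the conversion script. It assumes the .pkl file is a Python 2 pickle holding a dict with a "blobs" entry that maps names to arrays. load_blobs and blob_shapes are hypothetical helper names.

```python
import pickle


def load_blobs(pkl_path):
    """Load the name -> array mapping from a checkpoint file.

    Assumption: the file is a Python 2 pickle containing a dict with a
    "blobs" key; adjust this if your checkpoint is laid out differently.
    """
    with open(pkl_path, "rb") as f:
        return pickle.load(f, encoding="latin1")["blobs"]


def blob_shapes(blobs, prefix):
    """Map every blob name starting with `prefix` to its shape, sorted by name."""
    return {
        name: tuple(getattr(value, "shape", ()))
        for name, value in sorted(blobs.items())
        if name.startswith(prefix)
    }
```

For example, blob_shapes(load_blobs("/path/to/model.pkl"), "comp_0_conv_2") shows whether a comp_0_conv_2_middle_w entry exists at all. If it does not, the model definition being converted probably does not match the architecture stored in the checkpoint.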

facebookresearch / VMZ

Converting IR-CSN-50 Caffe2 to PyTorch + more IR-CSN backbone sizes #128