keyu-tian / SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
https://arxiv.org/abs/2301.03580
MIT License

vis_reconstruction.py question #14

Closed · buxuewushu1314 closed this issue 1 year ago

buxuewushu1314 commented 1 year ago

File "E:/code/SparK-main/SparK-main-latest/pretrain/vis_reconstruction.py", line 55, in build_spark assert len(missing) == 0, f'load_state_dict missing keys: {missing}' AssertionError: load_state_dict missing keys: ['imn_m', 'imn_s', 'norm_black', 'sparse_encoder.sp_cnn.conv1.weight', 'sparse_encoder.sp_cnn.bn1.weight', 'sparse_encoder.sp_cnn.bn1.bias', 'sparse_encoder.sp_cnn.bn1.running_mean', 'sparse_encoder.sp_cnn.bn1.running_var', 'sparse_encoder.sp_cnn.layer1.0.conv1.weight', 'sparse_encoder.sp_cnn.layer1.0.bn1.weight', 'sparse_encoder.sp_cnn.layer1.0.bn1.bias', 'sparse_encoder.sp_cnn.layer1.0.bn1.running_mean', 'sparse_encoder.sp_cnn.layer1.0.bn1.running_var', 'sparse_encoder.sp_cnn.layer1.0.conv2.weight', 'sparse_encoder.sp_cnn.layer1.0.bn2.weight', 'sparse_encoder.sp_cnn.layer1.0.bn2.bias', 'sparse_encoder.sp_cnn.layer1.0.bn2.running_mean', 'sparse_encoder.sp_cnn.layer1.0.bn2.running_var', 'sparse_encoder.sp_cnn.layer1.0.conv3.weight', 'sparse_encoder.sp_cnn.layer1.0.bn3.weight', 'sparse_encoder.sp_cnn.layer1.0.bn3.bias', 'sparse_encoder.sp_cnn.layer1.0.bn3.running_mean', 'sparse_encoder.sp_cnn.layer1.0.bn3.running_var', 'sparse_encoder.sp_cnn.layer1.0.downsample.0.weight', 'sparse_encoder.sp_cnn.layer1.0.downsample.1.weight', 'sparse_encoder.sp_cnn.layer1.0.downsample.1.bias', 'sparse_encoder.sp_cnn.layer1.0.downsample.1.running_mean', 'sparse_encoder.sp_cnn.layer1.0.downsample.1.running_var', 'sparse_encoder.sp_cnn.layer1.1.conv1.weight', 'sparse_encoder.sp_cnn.layer1.1.bn1.weight', 'sparse_encoder.sp_cnn.layer1.1.bn1.bias', 'sparse_encoder.sp_cnn.layer1.1.bn1.running_mean', 'sparse_encoder.sp_cnn.layer1.1.bn1.running_var', 'sparse_encoder.sp_cnn.layer1.1.conv2.weight', 'sparse_encoder.sp_cnn.layer1.1.bn2.weight', 'sparse_encoder.sp_cnn.layer1.1.bn2.bias', 'sparse_encoder.sp_cnn.layer1.1.bn2.running_mean', 'sparse_encoder.sp_cnn.layer1.1.bn2.running_var', 'sparse_encoder.sp_cnn.layer1.1.conv3.weight', 'sparse_encoder.sp_cnn.layer1.1.bn3.weight', 'sparse_encoder.sp_cnn.layer1.1.bn3.bias', 'sparse_encoder.sp_cnn.layer1.1.bn3.running_mean', 'sparse_encoder.sp_cnn.layer1.1.bn3.running_var', 'sparse_encoder.sp_cnn.layer1.2.conv1.weight', 'sparse_encoder.sp_cnn.layer1.2.bn1.weight', 'sparse_encoder.sp_cnn.layer1.2.bn1.bias', 'sparse_encoder.sp_cnn.layer1.2.bn1.running_mean', 'sparse_encoder.sp_cnn.layer1.2.bn1.running_var', 'sparse_encoder.sp_cnn.layer1.2.conv2.weight', 'sparse_encoder.sp_cnn.layer1.2.bn2.weight', 'sparse_encoder.sp_cnn.layer1.2.bn2.bias', 'sparse_encoder.sp_cnn.layer1.2.bn2.running_mean', 'sparse_encoder.sp_cnn.layer1.2.bn2.running_var', 'sparse_encoder.sp_cnn.layer1.2.conv3.weight', 'sparse_encoder.sp_cnn.layer1.2.bn3.weight', 'sparse_encoder.sp_cnn.layer1.2.bn3.bias', 'sparse_encoder.sp_cnn.layer1.2.bn3.running_mean', 'sparse_encoder.sp_cnn.layer1.2.bn3.running_var', 'sparse_encoder.sp_cnn.layer2.0.conv1.weight', 'sparse_encoder.sp_cnn.layer2.0.bn1.weight', 'sparse_encoder.sp_cnn.layer2.0.bn1.bias', 'sparse_encoder.sp_cnn.layer2.0.bn1.running_mean', 'sparse_encoder.sp_cnn.layer2.0.bn1.running_var', 'sparse_encoder.sp_cnn.layer2.0.conv2.weight', 'sparse_encoder.sp_cnn.layer2.0.bn2.weight', 'sparse_encoder.sp_cnn.layer2.0.bn2.bias', 'sparse_encoder.sp_cnn.layer2.0.bn2.running_mean', 'sparse_encoder.sp_cnn.layer2.0.bn2.running_var', 'sparse_encoder.sp_cnn.layer2.0.conv3.weight', 'sparse_encoder.sp_cnn.layer2.0.bn3.weight', 'sparse_encoder.sp_cnn.layer2.0.bn3.bias', 'sparse_encoder.sp_cnn.layer2.0.bn3.running_mean', 'sparse_encoder.sp_cnn.layer2.0.bn3.running_var', 
'sparse_encoder.sp_cnn.layer2.0.downsample.0.weight', 'sparse_encoder.sp_cnn.layer2.0.downsample.1.weight', 'sparse_encoder.sp_cnn.layer2.0.downsample.1.bias', 'sparse_encoder.sp_cnn.layer2.0.downsample.1.running_mean', 'sparse_encoder.sp_cnn.layer2.0.downsample.1.running_var', 'sparse_encoder.sp_cnn.layer2.1.conv1.weight', 'sparse_encoder.sp_cnn.layer2.1.bn1.weight', 'sparse_encoder.sp_cnn.layer2.1.bn1.bias', 'sparse_encoder.sp_cnn.layer2.1.bn1.running_mean', 'sparse_encoder.sp_cnn.layer2.1.bn1.running_var', 'sparse_encoder.sp_cnn.layer2.1.conv2.weight', 'sparse_encoder.sp_cnn.layer2.1.bn2.weight', 'sparse_encoder.sp_cnn.layer2.1.bn2.bias', 'sparse_encoder.sp_cnn.layer2.1.bn2.running_mean', 'sparse_encoder.sp_cnn.layer2.1.bn2.running_var', 'sparse_encoder.sp_cnn.layer2.1.conv3.weight', 'sparse_encoder.sp_cnn.layer2.1.bn3.weight', 'sparse_encoder.sp_cnn.layer2.1.bn3.bias', 'sparse_encoder.sp_cnn.layer2.1.bn3.running_mean', 'sparse_encoder.sp_cnn.layer2.1.bn3.running_var', 'sparse_encoder.sp_cnn.layer2.2.conv1.weight', 'sparse_encoder.sp_cnn.layer2.2.bn1.weight', 'sparse_encoder.sp_cnn.layer2.2.bn1.bias', 'sparse_encoder.sp_cnn.layer2.2.bn1.running_mean', 'sparse_encoder.sp_cnn.layer2.2.bn1.running_var', 'sparse_encoder.sp_cnn.layer2.2.conv2.weight', 'sparse_encoder.sp_cnn.layer2.2.bn2.weight', 'sparse_encoder.sp_cnn.layer2.2.bn2.bias', 'sparse_encoder.sp_cnn.layer2.2.bn2.running_mean', 'sparse_encoder.sp_cnn.layer2.2.bn2.running_var', 'sparse_encoder.sp_cnn.layer2.2.conv3.weight', 'sparse_encoder.sp_cnn.layer2.2.bn3.weight', 'sparse_encoder.sp_cnn.layer2.2.bn3.bias', 'sparse_encoder.sp_cnn.layer2.2.bn3.running_mean', 'sparse_encoder.sp_cnn.layer2.2.bn3.running_var', 'sparse_encoder.sp_cnn.layer2.3.conv1.weight', 'sparse_encoder.sp_cnn.layer2.3.bn1.weight', 'sparse_encoder.sp_cnn.layer2.3.bn1.bias', 'sparse_encoder.sp_cnn.layer2.3.bn1.running_mean', 'sparse_encoder.sp_cnn.layer2.3.bn1.running_var', 'sparse_encoder.sp_cnn.layer2.3.conv2.weight', 'sparse_encoder.sp_cnn.layer2.3.bn2.weight', 'sparse_encoder.sp_cnn.layer2.3.bn2.bias', 'sparse_encoder.sp_cnn.layer2.3.bn2.running_mean', 'sparse_encoder.sp_cnn.layer2.3.bn2.running_var', 'sparse_encoder.sp_cnn.layer2.3.conv3.weight', 'sparse_encoder.sp_cnn.layer2.3.bn3.weight', 'sparse_encoder.sp_cnn.layer2.3.bn3.bias', 'sparse_encoder.sp_cnn.layer2.3.bn3.running_mean', 'sparse_encoder.sp_cnn.layer2.3.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.0.conv1.weight', 'sparse_encoder.sp_cnn.layer3.0.bn1.weight', 'sparse_encoder.sp_cnn.layer3.0.bn1.bias', 'sparse_encoder.sp_cnn.layer3.0.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.0.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.0.conv2.weight', 'sparse_encoder.sp_cnn.layer3.0.bn2.weight', 'sparse_encoder.sp_cnn.layer3.0.bn2.bias', 'sparse_encoder.sp_cnn.layer3.0.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.0.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.0.conv3.weight', 'sparse_encoder.sp_cnn.layer3.0.bn3.weight', 'sparse_encoder.sp_cnn.layer3.0.bn3.bias', 'sparse_encoder.sp_cnn.layer3.0.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.0.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.0.downsample.0.weight', 'sparse_encoder.sp_cnn.layer3.0.downsample.1.weight', 'sparse_encoder.sp_cnn.layer3.0.downsample.1.bias', 'sparse_encoder.sp_cnn.layer3.0.downsample.1.running_mean', 'sparse_encoder.sp_cnn.layer3.0.downsample.1.running_var', 'sparse_encoder.sp_cnn.layer3.1.conv1.weight', 'sparse_encoder.sp_cnn.layer3.1.bn1.weight', 'sparse_encoder.sp_cnn.layer3.1.bn1.bias', 
'sparse_encoder.sp_cnn.layer3.1.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.1.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.1.conv2.weight', 'sparse_encoder.sp_cnn.layer3.1.bn2.weight', 'sparse_encoder.sp_cnn.layer3.1.bn2.bias', 'sparse_encoder.sp_cnn.layer3.1.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.1.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.1.conv3.weight', 'sparse_encoder.sp_cnn.layer3.1.bn3.weight', 'sparse_encoder.sp_cnn.layer3.1.bn3.bias', 'sparse_encoder.sp_cnn.layer3.1.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.1.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.2.conv1.weight', 'sparse_encoder.sp_cnn.layer3.2.bn1.weight', 'sparse_encoder.sp_cnn.layer3.2.bn1.bias', 'sparse_encoder.sp_cnn.layer3.2.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.2.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.2.conv2.weight', 'sparse_encoder.sp_cnn.layer3.2.bn2.weight', 'sparse_encoder.sp_cnn.layer3.2.bn2.bias', 'sparse_encoder.sp_cnn.layer3.2.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.2.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.2.conv3.weight', 'sparse_encoder.sp_cnn.layer3.2.bn3.weight', 'sparse_encoder.sp_cnn.layer3.2.bn3.bias', 'sparse_encoder.sp_cnn.layer3.2.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.2.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.3.conv1.weight', 'sparse_encoder.sp_cnn.layer3.3.bn1.weight', 'sparse_encoder.sp_cnn.layer3.3.bn1.bias', 'sparse_encoder.sp_cnn.layer3.3.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.3.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.3.conv2.weight', 'sparse_encoder.sp_cnn.layer3.3.bn2.weight', 'sparse_encoder.sp_cnn.layer3.3.bn2.bias', 'sparse_encoder.sp_cnn.layer3.3.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.3.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.3.conv3.weight', 'sparse_encoder.sp_cnn.layer3.3.bn3.weight', 'sparse_encoder.sp_cnn.layer3.3.bn3.bias', 'sparse_encoder.sp_cnn.layer3.3.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.3.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.4.conv1.weight', 'sparse_encoder.sp_cnn.layer3.4.bn1.weight', 'sparse_encoder.sp_cnn.layer3.4.bn1.bias', 'sparse_encoder.sp_cnn.layer3.4.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.4.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.4.conv2.weight', 'sparse_encoder.sp_cnn.layer3.4.bn2.weight', 'sparse_encoder.sp_cnn.layer3.4.bn2.bias', 'sparse_encoder.sp_cnn.layer3.4.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.4.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.4.conv3.weight', 'sparse_encoder.sp_cnn.layer3.4.bn3.weight', 'sparse_encoder.sp_cnn.layer3.4.bn3.bias', 'sparse_encoder.sp_cnn.layer3.4.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.4.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.5.conv1.weight', 'sparse_encoder.sp_cnn.layer3.5.bn1.weight', 'sparse_encoder.sp_cnn.layer3.5.bn1.bias', 'sparse_encoder.sp_cnn.layer3.5.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.5.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.5.conv2.weight', 'sparse_encoder.sp_cnn.layer3.5.bn2.weight', 'sparse_encoder.sp_cnn.layer3.5.bn2.bias', 'sparse_encoder.sp_cnn.layer3.5.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.5.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.5.conv3.weight', 'sparse_encoder.sp_cnn.layer3.5.bn3.weight', 'sparse_encoder.sp_cnn.layer3.5.bn3.bias', 'sparse_encoder.sp_cnn.layer3.5.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.5.bn3.running_var', 'sparse_encoder.sp_cnn.layer4.0.conv1.weight', 'sparse_encoder.sp_cnn.layer4.0.bn1.weight', 'sparse_encoder.sp_cnn.layer4.0.bn1.bias', 
'sparse_encoder.sp_cnn.layer4.0.bn1.running_mean', 'sparse_encoder.sp_cnn.layer4.0.bn1.running_var', 'sparse_encoder.sp_cnn.layer4.0.conv2.weight', 'sparse_encoder.sp_cnn.layer4.0.bn2.weight', 'sparse_encoder.sp_cnn.layer4.0.bn2.bias', 'sparse_encoder.sp_cnn.layer4.0.bn2.running_mean', 'sparse_encoder.sp_cnn.layer4.0.bn2.running_var', 'sparse_encoder.sp_cnn.layer4.0.conv3.weight', 'sparse_encoder.sp_cnn.layer4.0.bn3.weight', 'sparse_encoder.sp_cnn.layer4.0.bn3.bias', 'sparse_encoder.sp_cnn.layer4.0.bn3.running_mean', 'sparse_encoder.sp_cnn.layer4.0.bn3.running_var', 'sparse_encoder.sp_cnn.layer4.0.downsample.0.weight', 'sparse_encoder.sp_cnn.layer4.0.downsample.1.weight', 'sparse_encoder.sp_cnn.layer4.0.downsample.1.bias', 'sparse_encoder.sp_cnn.layer4.0.downsample.1.running_mean', 'sparse_encoder.sp_cnn.layer4.0.downsample.1.running_var', 'sparse_encoder.sp_cnn.layer4.1.conv1.weight', 'sparse_encoder.sp_cnn.layer4.1.bn1.weight', 'sparse_encoder.sp_cnn.layer4.1.bn1.bias', 'sparse_encoder.sp_cnn.layer4.1.bn1.running_mean', 'sparse_encoder.sp_cnn.layer4.1.bn1.running_var', 'sparse_encoder.sp_cnn.layer4.1.conv2.weight', 'sparse_encoder.sp_cnn.layer4.1.bn2.weight', 'sparse_encoder.sp_cnn.layer4.1.bn2.bias', 'sparse_encoder.sp_cnn.layer4.1.bn2.running_mean', 'sparse_encoder.sp_cnn.layer4.1.bn2.running_var', 'sparse_encoder.sp_cnn.layer4.1.conv3.weight', 'sparse_encoder.sp_cnn.layer4.1.bn3.weight', 'sparse_encoder.sp_cnn.layer4.1.bn3.bias', 'sparse_encoder.sp_cnn.layer4.1.bn3.running_mean', 'sparse_encoder.sp_cnn.layer4.1.bn3.running_var', 'sparse_encoder.sp_cnn.layer4.2.conv1.weight', 'sparse_encoder.sp_cnn.layer4.2.bn1.weight', 'sparse_encoder.sp_cnn.layer4.2.bn1.bias', 'sparse_encoder.sp_cnn.layer4.2.bn1.running_mean', 'sparse_encoder.sp_cnn.layer4.2.bn1.running_var', 'sparse_encoder.sp_cnn.layer4.2.conv2.weight', 'sparse_encoder.sp_cnn.layer4.2.bn2.weight', 'sparse_encoder.sp_cnn.layer4.2.bn2.bias', 'sparse_encoder.sp_cnn.layer4.2.bn2.running_mean', 'sparse_encoder.sp_cnn.layer4.2.bn2.running_var', 'sparse_encoder.sp_cnn.layer4.2.conv3.weight', 'sparse_encoder.sp_cnn.layer4.2.bn3.weight', 'sparse_encoder.sp_cnn.layer4.2.bn3.bias', 'sparse_encoder.sp_cnn.layer4.2.bn3.running_mean', 'sparse_encoder.sp_cnn.layer4.2.bn3.running_var', 'dense_decoder.dec.0.up_sample.weight', 'dense_decoder.dec.0.up_sample.bias', 'dense_decoder.dec.0.conv.0.weight', 'dense_decoder.dec.0.conv.1.weight', 'dense_decoder.dec.0.conv.1.bias', 'dense_decoder.dec.0.conv.1.running_mean', 'dense_decoder.dec.0.conv.1.running_var', 'dense_decoder.dec.0.conv.3.weight', 'dense_decoder.dec.0.conv.4.weight', 'dense_decoder.dec.0.conv.4.bias', 'dense_decoder.dec.0.conv.4.running_mean', 'dense_decoder.dec.0.conv.4.running_var', 'dense_decoder.dec.1.up_sample.weight', 'dense_decoder.dec.1.up_sample.bias', 'dense_decoder.dec.1.conv.0.weight', 'dense_decoder.dec.1.conv.1.weight', 'dense_decoder.dec.1.conv.1.bias', 'dense_decoder.dec.1.conv.1.running_mean', 'dense_decoder.dec.1.conv.1.running_var', 'dense_decoder.dec.1.conv.3.weight', 'dense_decoder.dec.1.conv.4.weight', 'dense_decoder.dec.1.conv.4.bias', 'dense_decoder.dec.1.conv.4.running_mean', 'dense_decoder.dec.1.conv.4.running_var', 'dense_decoder.dec.2.up_sample.weight', 'dense_decoder.dec.2.up_sample.bias', 'dense_decoder.dec.2.conv.0.weight', 'dense_decoder.dec.2.conv.1.weight', 'dense_decoder.dec.2.conv.1.bias', 'dense_decoder.dec.2.conv.1.running_mean', 'dense_decoder.dec.2.conv.1.running_var', 'dense_decoder.dec.2.conv.3.weight', 'dense_decoder.dec.2.conv.4.weight', 
'dense_decoder.dec.2.conv.4.bias', 'dense_decoder.dec.2.conv.4.running_mean', 'dense_decoder.dec.2.conv.4.running_var', 'dense_decoder.dec.3.up_sample.weight', 'dense_decoder.dec.3.up_sample.bias', 'dense_decoder.dec.3.conv.0.weight', 'dense_decoder.dec.3.conv.1.weight', 'dense_decoder.dec.3.conv.1.bias', 'dense_decoder.dec.3.conv.1.running_mean', 'dense_decoder.dec.3.conv.1.running_var', 'dense_decoder.dec.3.conv.3.weight', 'dense_decoder.dec.3.conv.4.weight', 'dense_decoder.dec.3.conv.4.bias', 'dense_decoder.dec.3.conv.4.running_mean', 'dense_decoder.dec.3.conv.4.running_var', 'dense_decoder.dec.4.up_sample.weight', 'dense_decoder.dec.4.up_sample.bias', 'dense_decoder.dec.4.conv.0.weight', 'dense_decoder.dec.4.conv.1.weight', 'dense_decoder.dec.4.conv.1.bias', 'dense_decoder.dec.4.conv.1.running_mean', 'dense_decoder.dec.4.conv.1.running_var', 'dense_decoder.dec.4.conv.3.weight', 'dense_decoder.dec.4.conv.4.weight', 'dense_decoder.dec.4.conv.4.bias', 'dense_decoder.dec.4.conv.4.running_mean', 'dense_decoder.dec.4.conv.4.running_var', 'dense_decoder.proj.weight', 'dense_decoder.proj.bias', 'densify_norms.0.weight', 'densify_norms.0.bias', 'densify_norms.1.weight', 'densify_norms.1.bias', 'densify_norms.2.weight', 'densify_norms.2.bias', 'densify_norms.3.weight', 'densify_norms.3.bias', 'densify_projs.0.weight', 'densify_projs.0.bias', 'densify_projs.1.weight', 'densify_projs.1.bias', 'densify_projs.2.weight', 'densify_projs.2.bias', 'densify_projs.3.weight', 'densify_projs.3.bias', 'mask_tokens.0', 'mask_tokens.1', 'mask_tokens.2', 'mask_tokens.3']

keyu-tian commented 1 year ago

Hi @buxuewushu1314, I just tested the visualization code and did not encounter this problem. Could you double-check that you downloaded the cnxL384_withdecoder_1kpretrained_spark_style.pth file from https://drive.google.com/file/d/1ZI9Jgtb3fKWE_vDFEly29w-1FWZSNwa0/view?usp=share_link? I think verifying the file will resolve the issue.
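
If it helps, here is a minimal sketch for sanity-checking the downloaded file by inspecting the checkpoint's top-level keys (the expected key names follow the error message and the fix discussed below):

```python
import torch

# The released *_withdecoder_*_spark_style.pth stores the full SparK state_dict
# directly, so its keys should include entries such as
# 'sparse_encoder.sp_cnn.conv1.weight' (see the missing-keys list above).
state = torch.load('cnxL384_withdecoder_1kpretrained_spark_style.pth', map_location='cpu')
print(type(state))
print(list(state.keys())[:5])
```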

syjabc commented 1 year ago

I encounter this problem too when I load xxx__1kpretrained.pth or xxxx_still_pretraining.pth to look at the performance of my pretrained model. How can I get my own xxxx_withdecoder_1kpretrained_spark_style.pth during pretraining?

syjabc commented 1 year ago

I modified the function `build_spark` like below (note it must return `spark_model`, not an undefined `spark`):

```python
import torch

def build_spark(pretraining_pth, spark_model):
    pretrained_state = torch.load(pretraining_pth, map_location='cpu')
    spark_model.eval(), [p.requires_grad_(False) for p in spark_model.parameters()]
    # load the checkpoint; training checkpoints nest the state_dict under 'module'
    missing, unexpected = spark_model.load_state_dict(pretrained_state['module'], strict=False)
    assert len(missing) == 0, f'load_state_dict missing keys: {missing}'
    assert len(unexpected) == 0, f'load_state_dict unexpected keys: {unexpected}'
    del pretrained_state
    return spark_model
```

And then it works using xxxx_still_pretraining.pth:

```python
ckpt_file = 'your xxxx_still_pretraining.pth path'
spark = your_spark_model  # the SparK model you built
spark = build_spark(ckpt_file, spark)
```

What's more, you may need to modify the function `denorm_for_vis` in spark.py (line 191) if you use your own dataset.
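
For illustration, a minimal sketch of what such a modification might look like — the actual function in spark.py may differ, and `MY_MEAN` / `MY_STD` are hypothetical placeholders for your dataset's channel-wise statistics:

```python
import torch

# Hypothetical channel-wise statistics -- replace with the mean/std
# you used to normalize your own dataset.
MY_MEAN = torch.tensor([0.5, 0.5, 0.5]).view(1, 3, 1, 1)
MY_STD = torch.tensor([0.25, 0.25, 0.25]).view(1, 3, 1, 1)

def denorm_for_vis(normalized_im):
    # Undo the input normalization so reconstructions render as natural images in [0, 1].
    return (normalized_im * MY_STD + MY_MEAN).clamp_(0., 1.)
```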

keyu-tian commented 1 year ago

@syjabc have you solved the issue with your code? I think the new `build_spark` is doing the right thing: it uses `torch.load('xxxx_still_pretraining.pth', 'cpu')['module']` the same way the original code uses `torch.load('xxxx_withdecoder_1kpretrained_spark_style.pth', 'cpu')`.
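
In other words (a minimal sketch with placeholder file names, per the discussion above), the two loads below should yield the same SparK state_dict:

```python
import torch

# Training checkpoints wrap the weights under a 'module' entry ...
sd_a = torch.load('xxxx_still_pretraining.pth', map_location='cpu')['module']
# ... while the released *_spark_style.pth stores them directly.
sd_b = torch.load('xxxx_withdecoder_1kpretrained_spark_style.pth', map_location='cpu')
assert sd_a.keys() == sd_b.keys()
```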

syjabc commented 1 year ago

I think it has been solved. I learned a lot from your code; thanks for your excellent work.

keyu-tian commented 1 year ago

Thank you as well for helping me refine this visualization. In the latest commit 1468df8 I modified the notebook to support visualizing your own model.