keyu-tian / SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
https://arxiv.org/abs/2301.03580
MIT License

Resuming ImageNet fine-tuning #72

Closed by ds2268 4 months ago

ds2268 commented 7 months ago

I am running jobs on an HPC cluster with a 2-day reservation limit, and I am having trouble resuming downstream_imagenet (see the error below).

I cannot resume fine-tuning from a saved checkpoint (the latest model).

Any solution, @keyu-tian?

raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ModelEmaV2:
    Missing key(s) in state_dict: "module.downsample_layers.0.0.weight", "module.downsample_layers.0.0.bias", "module.downsample_layers.0.1.weight", "module.downsample_layers.0.1.bias", "module.downsample_layers.1.0.weight", "module.downsample_layers.1.0.bias", "module.downsample_layers.1.1.weight", "module.downsample_layers.1.1.bias", "module.downsample_layers.2.0.weight", "module.downsample_layers.2.0.bias", "module.downsample_layers.2.1.weight", "module.downsample_layers.2.1.bias", "module.downsample_layers.3.0.weight", "module.downsample_layers.3.0.bias", "module.downsample_layers.3.1.weight", "module.downsample_layers.3.1.bias", "module.stages.0.0.gamma", "module.stages.0.0.dwconv.weight", "module.stages.0.0.dwconv.bias", "module.stages.0.0.norm.weight", "module.stages.0.0.norm.bias", "module.stages.0.0.pwconv1.weight", "module.stages.0.0.pwconv1.bias", "module.stages.0.0.pwconv2.weight", "module.stages.0.0.pwconv2.bias", "module.stages.0.1.gamma", "module.stages.0.1.dwconv.weight", "module.stages.0.1.dwconv.bias", "module.stages.0.1.norm.weight", "module.stages.0.1.norm.bias", "module.stages.0.1.pwconv1.weight", "module.stages.0.1.pwconv1.bias", "module.stages.0.1.pwconv2.weight", "module.stages.0.1.pwconv2.bias", "module.stages.0.2.gamma", "module.stages.0.2.dwconv.weight", "module.stages.0.2.dwconv.bias", "module.stages.0.2.norm.weight", "module.stages.0.2.norm.bias", "module.stages.0.2.pwconv1.weight", "module.stages.0.2.pwconv1.bias", "module.stages.0.2.pwconv2.weight", "module.stages.0.2.pwconv2.bias", "module.stages.1.0.gamma", "module.stages.1.0.dwconv.weight", "module.stages.1.0.dwconv.bias", "module.stages.1.0.norm.weight", "module.stages.1.0.norm.bias", "module.stages.1.0.pwconv1.weight", "module.stages.1.0.pwconv1.bias", "module.stages.1.0.pwconv2.weight", "module.stages.1.0.pwconv2.bias", "module.stages.1.1.gamma", "module.stages.1.1.dwconv.weight", "module.stages.1.1.dwconv.bias", "module.stages.1.1.norm.weight", "module.stages.1.1.norm.bias", 
"module.stages.1.1.pwconv1.weight", "module.stages.1.1.pwconv1.bias", "module.stages.1.1.pwconv2.weight", "module.stages.1.1.pwconv2.bias", "module.stages.1.2.gamma", "module.stages.1.2.dwconv.weight", "module.stages.1.2.dwconv.bias", "module.stages.1.2.norm.weight", "module.stages.1.2.norm.bias", "module.stages.1.2.pwconv1.weight", "module.stages.1.2.pwconv1.bias", "module.stages.1.2.pwconv2.weight", "module.stages.1.2.pwconv2.bias", "module.stages.2.0.gamma", "module.stages.2.0.dwconv.weight", "module.stages.2.0.dwconv.bias", "module.stages.2.0.norm.weight", "module.stages.2.0.norm.bias", "module.stages.2.0.pwconv1.weight", "module.stages.2.0.pwconv1.bias", "module.stages.2.0.pwconv2.weight", "module.stages.2.0.pwconv2.bias", "module.stages.2.1.gamma", "module.stages.2.1.dwconv.weight", "module.stages.2.1.dwconv.bias", "module.stages.2.1.norm.weight", "module.stages.2.1.norm.bias", "module.stages.2.1.pwconv1.weight", "module.stages.2.1.pwconv1.bias", "module.stages.2.1.pwconv2.weight", "module.stages.2.1.pwconv2.bias", "module.stages.2.2.gamma", "module.stages.2.2.dwconv.weight", "module.stages.2.2.dwconv.bias", "module.stages.2.2.norm.weight", "module.stages.2.2.norm.bias", "module.stages.2.2.pwconv1.weight", "module.stages.2.2.pwconv1.bias", "module.stages.2.2.pwconv2.weight", "module.stages.2.2.pwconv2.bias", "module.stages.2.3.gamma", "module.stages.2.3.dwconv.weight", "module.stages.2.3.dwconv.bias", "module.stages.2.3.norm.weight", "module.stages.2.3.norm.bias", "module.stages.2.3.pwconv1.weight", "module.stages.2.3.pwconv1.bias", "module.stages.2.3.pwconv2.weight", "module.stages.2.3.pwconv2.bias", "module.stages.2.4.gamma", "module.stages.2.4.dwconv.weight", "module.stages.2.4.dwconv.bias", "module.stages.2.4.norm.weight", "module.stages.2.4.norm.bias", "module.stages.2.4.pwconv1.weight", "module.stages.2.4.pwconv1.bias", "module.stages.2.4.pwconv2.weight", "module.stages.2.4.pwconv2.bias", "module.stages.2.5.gamma", "module.stages.2.5.dwconv.weight", 
"module.stages.2.5.dwconv.bias", "module.stages.2.5.norm.weight", "module.stages.2.5.norm.bias", "module.stages.2.5.pwconv1.weight", "module.stages.2.5.pwconv1.bias", "module.stages.2.5.pwconv2.weight", "module.stages.2.5.pwconv2.bias", "module.stages.2.6.gamma", "module.stages.2.6.dwconv.weight", "module.stages.2.6.dwconv.bias", "module.stages.2.6.norm.weight", "module.stages.2.6.norm.bias", "module.stages.2.6.pwconv1.weight", "module.stages.2.6.pwconv1.bias", "module.stages.2.6.pwconv2.weight", "module.stages.2.6.pwconv2.bias", "module.stages.2.7.gamma", "module.stages.2.7.dwconv.weight", "module.stages.2.7.dwconv.bias", "module.stages.2.7.norm.weight", "module.stages.2.7.norm.bias", "module.stages.2.7.pwconv1.weight", "module.stages.2.7.pwconv1.bias", "module.stages.2.7.pwconv2.weight", "module.stages.2.7.pwconv2.bias", "module.stages.2.8.gamma", "module.stages.2.8.dwconv.weight", "module.stages.2.8.dwconv.bias", "module.stages.2.8.norm.weight", "module.stages.2.8.norm.bias", "module.stages.2.8.pwconv1.weight", "module.stages.2.8.pwconv1.bias", "module.stages.2.8.pwconv2.weight", "module.stages.2.8.pwconv2.bias", "module.stages.2.9.gamma", "module.stages.2.9.dwconv.weight", "module.stages.2.9.dwconv.bias", "module.stages.2.9.norm.weight", "module.stages.2.9.norm.bias", "module.stages.2.9.pwconv1.weight", "module.stages.2.9.pwconv1.bias", "module.stages.2.9.pwconv2.weight", "module.stages.2.9.pwconv2.bias", "module.stages.2.10.gamma", "module.stages.2.10.dwconv.weight", "module.stages.2.10.dwconv.bias", "module.stages.2.10.norm.weight", "module.stages.2.10.norm.bias", "module.stages.2.10.pwconv1.weight", "module.stages.2.10.pwconv1.bias", "module.stages.2.10.pwconv2.weight", "module.stages.2.10.pwconv2.bias", "module.stages.2.11.gamma", "module.stages.2.11.dwconv.weight", "module.stages.2.11.dwconv.bias", "module.stages.2.11.norm.weight", "module.stages.2.11.norm.bias", "module.stages.2.11.pwconv1.weight", "module.stages.2.11.pwconv1.bias", 
"module.stages.2.11.pwconv2.weight", "module.stages.2.11.pwconv2.bias", "module.stages.2.12.gamma", "module.stages.2.12.dwconv.weight", "module.stages.2.12.dwconv.bias", "module.stages.2.12.norm.weight", "module.stages.2.12.norm.bias", "module.stages.2.12.pwconv1.weight", "module.stages.2.12.pwconv1.bias", "module.stages.2.12.pwconv2.weight", "module.stages.2.12.pwconv2.bias", "module.stages.2.13.gamma", "module.stages.2.13.dwconv.weight", "module.stages.2.13.dwconv.bias", "module.stages.2.13.norm.weight", "module.stages.2.13.norm.bias", "module.stages.2.13.pwconv1.weight", "module.stages.2.13.pwconv1.bias", "module.stages.2.13.pwconv2.weight", "module.stages.2.13.pwconv2.bias", "module.stages.2.14.gamma", "module.stages.2.14.dwconv.weight", "module.stages.2.14.dwconv.bias", "module.stages.2.14.norm.weight", "module.stages.2.14.norm.bias", "module.stages.2.14.pwconv1.weight", "module.stages.2.14.pwconv1.bias", "module.stages.2.14.pwconv2.weight", "module.stages.2.14.pwconv2.bias", "module.stages.2.15.gamma", "module.stages.2.15.dwconv.weight", "module.stages.2.15.dwconv.bias", "module.stages.2.15.norm.weight", "module.stages.2.15.norm.bias", "module.stages.2.15.pwconv1.weight", "module.stages.2.15.pwconv1.bias", "module.stages.2.15.pwconv2.weight", "module.stages.2.15.pwconv2.bias", "module.stages.2.16.gamma", "module.stages.2.16.dwconv.weight", "module.stages.2.16.dwconv.bias", "module.stages.2.16.norm.weight", "module.stages.2.16.norm.bias", "module.stages.2.16.pwconv1.weight", "module.stages.2.16.pwconv1.bias", "module.stages.2.16.pwconv2.weight", "module.stages.2.16.pwconv2.bias", "module.stages.2.17.gamma", "module.stages.2.17.dwconv.weight", "module.stages.2.17.dwconv.bias", "module.stages.2.17.norm.weight", "module.stages.2.17.norm.bias", "module.stages.2.17.pwconv1.weight", "module.stages.2.17.pwconv1.bias", "module.stages.2.17.pwconv2.weight", "module.stages.2.17.pwconv2.bias", "module.stages.2.18.gamma", "module.stages.2.18.dwconv.weight", 
"module.stages.2.18.dwconv.bias", "module.stages.2.18.norm.weight", "module.stages.2.18.norm.bias", "module.stages.2.18.pwconv1.weight", "module.stages.2.18.pwconv1.bias", "module.stages.2.18.pwconv2.weight", "module.stages.2.18.pwconv2.bias", "module.stages.2.19.gamma", "module.stages.2.19.dwconv.weight", "module.stages.2.19.dwconv.bias", "module.stages.2.19.norm.weight", "module.stages.2.19.norm.bias", "module.stages.2.19.pwconv1.weight", "module.stages.2.19.pwconv1.bias", "module.stages.2.19.pwconv2.weight", "module.stages.2.19.pwconv2.bias", "module.stages.2.20.gamma", "module.stages.2.20.dwconv.weight", "module.stages.2.20.dwconv.bias", "module.stages.2.20.norm.weight", "module.stages.2.20.norm.bias", "module.stages.2.20.pwconv1.weight", "module.stages.2.20.pwconv1.bias", "module.stages.2.20.pwconv2.weight", "module.stages.2.20.pwconv2.bias", "module.stages.2.21.gamma", "module.stages.2.21.dwconv.weight", "module.stages.2.21.dwconv.bias", "module.stages.2.21.norm.weight", "module.stages.2.21.norm.bias", "module.stages.2.21.pwconv1.weight", "module.stages.2.21.pwconv1.bias", "module.stages.2.21.pwconv2.weight", "module.stages.2.21.pwconv2.bias", "module.stages.2.22.gamma", "module.stages.2.22.dwconv.weight", "module.stages.2.22.dwconv.bias", "module.stages.2.22.norm.weight", "module.stages.2.22.norm.bias", "module.stages.2.22.pwconv1.weight", "module.stages.2.22.pwconv1.bias", "module.stages.2.22.pwconv2.weight", "module.stages.2.22.pwconv2.bias", "module.stages.2.23.gamma", "module.stages.2.23.dwconv.weight", "module.stages.2.23.dwconv.bias", "module.stages.2.23.norm.weight", "module.stages.2.23.norm.bias", "module.stages.2.23.pwconv1.weight", "module.stages.2.23.pwconv1.bias", "module.stages.2.23.pwconv2.weight", "module.stages.2.23.pwconv2.bias", "module.stages.2.24.gamma", "module.stages.2.24.dwconv.weight", "module.stages.2.24.dwconv.bias", "module.stages.2.24.norm.weight", "module.stages.2.24.norm.bias", "module.stages.2.24.pwconv1.weight", 
"module.stages.2.24.pwconv1.bias", "module.stages.2.24.pwconv2.weight", "module.stages.2.24.pwconv2.bias", "module.stages.2.25.gamma", "module.stages.2.25.dwconv.weight", "module.stages.2.25.dwconv.bias", "module.stages.2.25.norm.weight", "module.stages.2.25.norm.bias", "module.stages.2.25.pwconv1.weight", "module.stages.2.25.pwconv1.bias", "module.stages.2.25.pwconv2.weight", "module.stages.2.25.pwconv2.bias", "module.stages.2.26.gamma", "module.stages.2.26.dwconv.weight", "module.stages.2.26.dwconv.bias", "module.stages.2.26.norm.weight", "module.stages.2.26.norm.bias", "module.stages.2.26.pwconv1.weight", "module.stages.2.26.pwconv1.bias", "module.stages.2.26.pwconv2.weight", "module.stages.2.26.pwconv2.bias", "module.stages.3.0.gamma", "module.stages.3.0.dwconv.weight", "module.stages.3.0.dwconv.bias", "module.stages.3.0.norm.weight", "module.stages.3.0.norm.bias", "module.stages.3.0.pwconv1.weight", "module.stages.3.0.pwconv1.bias", "module.stages.3.0.pwconv2.weight", "module.stages.3.0.pwconv2.bias", "module.stages.3.1.gamma", "module.stages.3.1.dwconv.weight", "module.stages.3.1.dwconv.bias", "module.stages.3.1.norm.weight", "module.stages.3.1.norm.bias", "module.stages.3.1.pwconv1.weight", "module.stages.3.1.pwconv1.bias", "module.stages.3.1.pwconv2.weight", "module.stages.3.1.pwconv2.bias", "module.stages.3.2.gamma", "module.stages.3.2.dwconv.weight", "module.stages.3.2.dwconv.bias", "module.stages.3.2.norm.weight", "module.stages.3.2.norm.bias", "module.stages.3.2.pwconv1.weight", "module.stages.3.2.pwconv1.bias", "module.stages.3.2.pwconv2.weight", "module.stages.3.2.pwconv2.bias", "module.norm.weight", "module.norm.bias", "module.head.weight", "module.head.bias". 
    Unexpected key(s) in state_dict: "downsample_layers.0.0.weight", "downsample_layers.0.0.bias", "downsample_layers.0.1.weight", "downsample_layers.0.1.bias", "downsample_layers.1.0.weight", "downsample_layers.1.0.bias", "downsample_layers.1.1.weight", "downsample_layers.1.1.bias", "downsample_layers.2.0.weight", "downsample_layers.2.0.bias", "downsample_layers.2.1.weight", "downsample_layers.2.1.bias", "downsample_layers.3.0.weight", "downsample_layers.3.0.bias", "downsample_layers.3.1.weight", "downsample_layers.3.1.bias", "stages.0.0.gamma", "stages.0.0.dwconv.weight", "stages.0.0.dwconv.bias", "stages.0.0.norm.weight", "stages.0.0.norm.bias", "stages.0.0.pwconv1.weight", "stages.0.0.pwconv1.bias", "stages.0.0.pwconv2.weight", "stages.0.0.pwconv2.bias", "stages.0.1.gamma", "stages.0.1.dwconv.weight", "stages.0.1.dwconv.bias", "stages.0.1.norm.weight", "stages.0.1.norm.bias", "stages.0.1.pwconv1.weight", "stages.0.1.pwconv1.bias", "stages.0.1.pwconv2.weight", "stages.0.1.pwconv2.bias", "stages.0.2.gamma", "stages.0.2.dwconv.weight", "stages.0.2.dwconv.bias", "stages.0.2.norm.weight", "stages.0.2.norm.bias", "stages.0.2.pwconv1.weight", "stages.0.2.pwconv1.bias", "stages.0.2.pwconv2.weight", "stages.0.2.pwconv2.bias", "stages.1.0.gamma", "stages.1.0.dwconv.weight", "stages.1.0.dwconv.bias", "stages.1.0.norm.weight", "stages.1.0.norm.bias", "stages.1.0.pwconv1.weight", "stages.1.0.pwconv1.bias", "stages.1.0.pwconv2.weight", "stages.1.0.pwconv2.bias", "stages.1.1.gamma", "stages.1.1.dwconv.weight", "stages.1.1.dwconv.bias", "stages.1.1.norm.weight", "stages.1.1.norm.bias", "stages.1.1.pwconv1.weight", "stages.1.1.pwconv1.bias", "stages.1.1.pwconv2.weight", "stages.1.1.pwconv2.bias", "stages.1.2.gamma", "stages.1.2.dwconv.weight", "stages.1.2.dwconv.bias", "stages.1.2.norm.weight", "stages.1.2.norm.bias", "stages.1.2.pwconv1.weight", "stages.1.2.pwconv1.bias", "stages.1.2.pwconv2.weight", "stages.1.2.pwconv2.bias", "stages.2.0.gamma", "stages.2.0.dwconv.weight", 
"stages.2.0.dwconv.bias", "stages.2.0.norm.weight", "stages.2.0.norm.bias", "stages.2.0.pwconv1.weight", "stages.2.0.pwconv1.bias", "stages.2.0.pwconv2.weight", "stages.2.0.pwconv2.bias", "stages.2.1.gamma", "stages.2.1.dwconv.weight", "stages.2.1.dwconv.bias", "stages.2.1.norm.weight", "stages.2.1.norm.bias", "stages.2.1.pwconv1.weight", "stages.2.1.pwconv1.bias", "stages.2.1.pwconv2.weight", "stages.2.1.pwconv2.bias", "stages.2.2.gamma", "stages.2.2.dwconv.weight", "stages.2.2.dwconv.bias", "stages.2.2.norm.weight", "stages.2.2.norm.bias", "stages.2.2.pwconv1.weight", "stages.2.2.pwconv1.bias", "stages.2.2.pwconv2.weight", "stages.2.2.pwconv2.bias", "stages.2.3.gamma", "stages.2.3.dwconv.weight", "stages.2.3.dwconv.bias", "stages.2.3.norm.weight", "stages.2.3.norm.bias", "stages.2.3.pwconv1.weight", "stages.2.3.pwconv1.bias", "stages.2.3.pwconv2.weight", "stages.2.3.pwconv2.bias", "stages.2.4.gamma", "stages.2.4.dwconv.weight", "stages.2.4.dwconv.bias", "stages.2.4.norm.weight", "stages.2.4.norm.bias", "stages.2.4.pwconv1.weight", "stages.2.4.pwconv1.bias", "stages.2.4.pwconv2.weight", "stages.2.4.pwconv2.bias", "stages.2.5.gamma", "stages.2.5.dwconv.weight", "stages.2.5.dwconv.bias", "stages.2.5.norm.weight", "stages.2.5.norm.bias", "stages.2.5.pwconv1.weight", "stages.2.5.pwconv1.bias", "stages.2.5.pwconv2.weight", "stages.2.5.pwconv2.bias", "stages.2.6.gamma", "stages.2.6.dwconv.weight", "stages.2.6.dwconv.bias", "stages.2.6.norm.weight", "stages.2.6.norm.bias", "stages.2.6.pwconv1.weight", "stages.2.6.pwconv1.bias", "stages.2.6.pwconv2.weight", "stages.2.6.pwconv2.bias", "stages.2.7.gamma", "stages.2.7.dwconv.weight", "stages.2.7.dwconv.bias", "stages.2.7.norm.weight", "stages.2.7.norm.bias", "stages.2.7.pwconv1.weight", "stages.2.7.pwconv1.bias", "stages.2.7.pwconv2.weight", "stages.2.7.pwconv2.bias", "stages.2.8.gamma", "stages.2.8.dwconv.weight", "stages.2.8.dwconv.bias", "stages.2.8.norm.weight", "stages.2.8.norm.bias", "stages.2.8.pwconv1.weight", 
"stages.2.8.pwconv1.bias", "stages.2.8.pwconv2.weight", "stages.2.8.pwconv2.bias", "stages.2.9.gamma", "stages.2.9.dwconv.weight", "stages.2.9.dwconv.bias", "stages.2.9.norm.weight", "stages.2.9.norm.bias", "stages.2.9.pwconv1.weight", "stages.2.9.pwconv1.bias", "stages.2.9.pwconv2.weight", "stages.2.9.pwconv2.bias", "stages.2.10.gamma", "stages.2.10.dwconv.weight", "stages.2.10.dwconv.bias", "stages.2.10.norm.weight", "stages.2.10.norm.bias", "stages.2.10.pwconv1.weight", "stages.2.10.pwconv1.bias", "stages.2.10.pwconv2.weight", "stages.2.10.pwconv2.bias", "stages.2.11.gamma", "stages.2.11.dwconv.weight", "stages.2.11.dwconv.bias", "stages.2.11.norm.weight", "stages.2.11.norm.bias", "stages.2.11.pwconv1.weight", "stages.2.11.pwconv1.bias", "stages.2.11.pwconv2.weight", "stages.2.11.pwconv2.bias", "stages.2.12.gamma", "stages.2.12.dwconv.weight", "stages.2.12.dwconv.bias", "stages.2.12.norm.weight", "stages.2.12.norm.bias", "stages.2.12.pwconv1.weight", "stages.2.12.pwconv1.bias", "stages.2.12.pwconv2.weight", "stages.2.12.pwconv2.bias", "stages.2.13.gamma", "stages.2.13.dwconv.weight", "stages.2.13.dwconv.bias", "stages.2.13.norm.weight", "stages.2.13.norm.bias", "stages.2.13.pwconv1.weight", "stages.2.13.pwconv1.bias", "stages.2.13.pwconv2.weight", "stages.2.13.pwconv2.bias", "stages.2.14.gamma", "stages.2.14.dwconv.weight", "stages.2.14.dwconv.bias", "stages.2.14.norm.weight", "stages.2.14.norm.bias", "stages.2.14.pwconv1.weight", "stages.2.14.pwconv1.bias", "stages.2.14.pwconv2.weight", "stages.2.14.pwconv2.bias", "stages.2.15.gamma", "stages.2.15.dwconv.weight", "stages.2.15.dwconv.bias", "stages.2.15.norm.weight", "stages.2.15.norm.bias", "stages.2.15.pwconv1.weight", "stages.2.15.pwconv1.bias", "stages.2.15.pwconv2.weight", "stages.2.15.pwconv2.bias", "stages.2.16.gamma", "stages.2.16.dwconv.weight", "stages.2.16.dwconv.bias", "stages.2.16.norm.weight", "stages.2.16.norm.bias", "stages.2.16.pwconv1.weight", "stages.2.16.pwconv1.bias", 
"stages.2.16.pwconv2.weight", "stages.2.16.pwconv2.bias", "stages.2.17.gamma", "stages.2.17.dwconv.weight", "stages.2.17.dwconv.bias", "stages.2.17.norm.weight", "stages.2.17.norm.bias", "stages.2.17.pwconv1.weight", "stages.2.17.pwconv1.bias", "stages.2.17.pwconv2.weight", "stages.2.17.pwconv2.bias", "stages.2.18.gamma", "stages.2.18.dwconv.weight", "stages.2.18.dwconv.bias", "stages.2.18.norm.weight", "stages.2.18.norm.bias", "stages.2.18.pwconv1.weight", "stages.2.18.pwconv1.bias", "stages.2.18.pwconv2.weight", "stages.2.18.pwconv2.bias", "stages.2.19.gamma", "stages.2.19.dwconv.weight", "stages.2.19.dwconv.bias", "stages.2.19.norm.weight", "stages.2.19.norm.bias", "stages.2.19.pwconv1.weight", "stages.2.19.pwconv1.bias", "stages.2.19.pwconv2.weight", "stages.2.19.pwconv2.bias", "stages.2.20.gamma", "stages.2.20.dwconv.weight", "stages.2.20.dwconv.bias", "stages.2.20.norm.weight", "stages.2.20.norm.bias", "stages.2.20.pwconv1.weight", "stages.2.20.pwconv1.bias", "stages.2.20.pwconv2.weight", "stages.2.20.pwconv2.bias", "stages.2.21.gamma", "stages.2.21.dwconv.weight", "stages.2.21.dwconv.bias", "stages.2.21.norm.weight", "stages.2.21.norm.bias", "stages.2.21.pwconv1.weight", "stages.2.21.pwconv1.bias", "stages.2.21.pwconv2.weight", "stages.2.21.pwconv2.bias", "stages.2.22.gamma", "stages.2.22.dwconv.weight", "stages.2.22.dwconv.bias", "stages.2.22.norm.weight", "stages.2.22.norm.bias", "stages.2.22.pwconv1.weight", "stages.2.22.pwconv1.bias", "stages.2.22.pwconv2.weight", "stages.2.22.pwconv2.bias", "stages.2.23.gamma", "stages.2.23.dwconv.weight", "stages.2.23.dwconv.bias", "stages.2.23.norm.weight", "stages.2.23.norm.bias", "stages.2.23.pwconv1.weight", "stages.2.23.pwconv1.bias", "stages.2.23.pwconv2.weight", "stages.2.23.pwconv2.bias", "stages.2.24.gamma", "stages.2.24.dwconv.weight", "stages.2.24.dwconv.bias", "stages.2.24.norm.weight", "stages.2.24.norm.bias", "stages.2.24.pwconv1.weight", "stages.2.24.pwconv1.bias", "stages.2.24.pwconv2.weight", 
"stages.2.24.pwconv2.bias", "stages.2.25.gamma", "stages.2.25.dwconv.weight", "stages.2.25.dwconv.bias", "stages.2.25.norm.weight", "stages.2.25.norm.bias", "stages.2.25.pwconv1.weight", "stages.2.25.pwconv1.bias", "stages.2.25.pwconv2.weight", "stages.2.25.pwconv2.bias", "stages.2.26.gamma", "stages.2.26.dwconv.weight", "stages.2.26.dwconv.bias", "stages.2.26.norm.weight", "stages.2.26.norm.bias", "stages.2.26.pwconv1.weight", "stages.2.26.pwconv1.bias", "stages.2.26.pwconv2.weight", "stages.2.26.pwconv2.bias", "stages.3.0.gamma", "stages.3.0.dwconv.weight", "stages.3.0.dwconv.bias", "stages.3.0.norm.weight", "stages.3.0.norm.bias", "stages.3.0.pwconv1.weight", "stages.3.0.pwconv1.bias", "stages.3.0.pwconv2.weight", "stages.3.0.pwconv2.bias", "stages.3.1.gamma", "stages.3.1.dwconv.weight", "stages.3.1.dwconv.bias", "stages.3.1.norm.weight", "stages.3.1.norm.bias", "stages.3.1.pwconv1.weight", "stages.3.1.pwconv1.bias", "stages.3.1.pwconv2.weight", "stages.3.1.pwconv2.bias", "stages.3.2.gamma", "stages.3.2.dwconv.weight", "stages.3.2.dwconv.bias", "stages.3.2.norm.weight", "stages.3.2.norm.bias", "stages.3.2.pwconv1.weight", "stages.3.2.pwconv1.bias", "stages.3.2.pwconv2.weight", "stages.3.2.pwconv2.bias", "norm.weight", "norm.bias", "head.weight", "head.bias". 
Traceback (most recent call last):
ds2268 commented 7 months ago

I am trying to resume finetuning from the following checkpoint: convnext_base_1kfinetuned_last.pth
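
The error above suggests a key-prefix mismatch: the ModelEmaV2 wrapper expects every key under `module.`, while the checkpoint stores the same keys without that prefix. A minimal sketch of one possible workaround, assuming that reading is right, is to re-key the saved weights before loading them. The helper names `add_prefix` and `strip_prefix` are hypothetical (they are not part of SparK or timm), and the toy dictionary stands in for a real state dict of tensors.

```python
from collections import OrderedDict


def add_prefix(state_dict, prefix="module."):
    """Return a copy in which every key starts with `prefix`.

    Keys that already carry the prefix are left unchanged.
    """
    return OrderedDict(
        (key if key.startswith(prefix) else prefix + key, value)
        for key, value in state_dict.items()
    )


def strip_prefix(state_dict, prefix="module."):
    """Inverse helper: drop `prefix` from any key that has it."""
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


# Toy stand-in for the saved weights (assumption: real values are tensors).
saved = OrderedDict([("norm.weight", 1.0), ("head.bias", 0.0)])

# Re-key so the names match what the error lists under "Missing key(s)".
remapped = add_prefix(saved)
print(list(remapped))
```

With PyTorch, the re-keyed dictionary would then be passed to `load_state_dict` on the EMA wrapper. Which entry of the checkpoint file holds the weights is not shown in this thread, so that lookup is left out of the sketch.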

keyu-tian commented 7 months ago

@ds2268 could you kindly check the README, or wherever else there is an example of resuming training? I'm currently in finals week and will come back to this next week.