antoyang / TubeDETR

[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
Apache License 2.0

KeyError: 'model' in main.py line 552 #5

Closed Swt2000 closed 2 years ago

Swt2000 commented 2 years ago

Following the instructions in the README, I downloaded the checkpoint file from the link to the PyTorch official website. After loading it, I could not find the key "model" or "model_ema" in the checkpoint. The download link is https://download.pytorch.org/models/resnet101-63fe2227.pth

The checkpoint output is a flat list of ResNet-101 parameter and buffer names (abridged here; each layer block repeats the same conv/bn pattern):

    conv1.weight bn1.running_mean bn1.running_var bn1.weight bn1.bias
    layer1.0.conv1.weight ... layer1.2.bn3.bias
    layer2.0.conv1.weight ... layer2.3.bn3.bias
    layer3.0.conv1.weight ... layer3.22.bn3.bias
    layer4.0.conv1.weight ... layer4.2.bn3.bias
    fc.weight fc.bias
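The mismatch can be seen by contrasting the two checkpoint layouts in a minimal sketch (the dict contents below are illustrative stand-ins, not real tensors):

```python
# Hypothetical illustration of the two checkpoint layouts involved here.

# A torchvision backbone checkpoint is a bare state_dict: a flat mapping
# of parameter/buffer names to tensors, with no wrapper key.
backbone_ckpt = {
    "conv1.weight": "tensor(...)",
    "bn1.running_mean": "tensor(...)",
    "fc.bias": "tensor(...)",
}

# A full training checkpoint (TubeDETR/MDETR style) wraps the weights
# under a "model" (and possibly "model_ema") key, alongside other state.
full_ckpt = {
    "model": {"query_embed.weight": "tensor(...)"},
    "optimizer": {},
    "epoch": 0,
}

print("model" in backbone_ckpt)  # False -> indexing checkpoint["model"] raises KeyError
print("model" in full_ckpt)      # True
```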

antoyang commented 2 years ago

These weights are used to load a pretrained visual backbone in https://github.com/antoyang/TubeDETR/blob/main/models/backbone.py, so it is expected that the backbone checkpoint does not have the same keys as the whole model. Such checkpoints should not be passed to the load / resume argument. You can instead pass a path to a model checkpoint provided in this repo if you only want to run inference, the MDETR pretrained checkpoint if you want to re-train the model, or no checkpoint at all if you want to train from ImageNet initialization.

Swt2000 commented 2 years ago

Hello, thank you very much for your reply. I understand that the whole-model checkpoint should differ from the backbone checkpoint. However, I passed the pretrained backbone checkpoint to main.py, and the loading code there seems to expect a whole-model checkpoint. Here is the corresponding code (args.load is "pretrained_resnet101_checkpoint.pth", which I saved from https://download.pytorch.org/models/resnet101-63fe2227.pth to match the command "--load=pretrained_resnet101_checkpoint.pth"):

    if args.load:
        print("loading from", args.load)
        checkpoint = torch.load(args.load, map_location="cpu")
        if "model_ema" in checkpoint:
            if (
                args.num_queries < 100
                and "query_embed.weight" in checkpoint["model_ema"]
            ):  # initialize from the first object queries
                checkpoint["model_ema"]["query_embed.weight"] = checkpoint["model_ema"][
                    "query_embed.weight"
                ][: args.num_queries]
            if "transformer.time_embed.te" in checkpoint["model_ema"]:
                del checkpoint["model_ema"]["transformer.time_embed.te"]
            model_without_ddp.load_state_dict(checkpoint["model_ema"], strict=False)
        else:
            if (
                args.num_queries < 100 and "query_embed.weight" in checkpoint["model"]
            ):  # initialize from the first object queries
                checkpoint["model"]["query_embed.weight"] = checkpoint["model"][
                    "query_embed.weight"
                ][: args.num_queries]
            if "transformer.time_embed.te" in checkpoint["model"]:
                del checkpoint["model"]["transformer.time_embed.te"]
            model_without_ddp.load_state_dict(checkpoint["model"], strict=False)
        if "pretrained_resnet101_checkpoint.pth" in args.load:
            model_without_ddp.transformer._reset_temporal_parameters()
        if args.ema:
            model_ema = deepcopy(model_without_ddp)

When I run this code, it takes the 'else' branch and then raises the KeyError, since the downloaded checkpoint has no "model" key. Maybe there is something wrong with the if/else usage.
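One way to make this failure mode clearer is to select the weights dict explicitly before indexing. This is a hypothetical helper, not code from the repo, mirroring the if/else above (EMA weights preferred when present):

```python
def select_state_dict(checkpoint):
    """Return the weights dict from a loaded checkpoint (hypothetical helper)."""
    # Prefer EMA weights, matching the branch order in main.py above.
    for key in ("model_ema", "model"):
        if key in checkpoint:
            return checkpoint[key]
    raise ValueError(
        "checkpoint has no 'model' or 'model_ema' key; it looks like a bare "
        "state_dict (e.g. torchvision ResNet weights), which the backbone "
        "loads itself and which should not be passed via --load / --resume"
    )

# Usage sketch:
#   checkpoint = torch.load(args.load, map_location="cpu")
#   state = select_state_dict(checkpoint)
#   model_without_ddp.load_state_dict(state, strict=False)
```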

antoyang commented 2 years ago

The "--load=pretrained_resnet101_checkpoint.pth" command refers to the MDETR checkpoint, https://zenodo.org/record/4721981/files/pretrained_resnet101_checkpoint.pth?download=1, hence the README instruction "Download MDETR pretrained model weights with ResNet-101 backbone in the current folder."

Swt2000 commented 2 years ago

Thank you, that works!