facebookresearch / moco

PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
MIT License

Loading a pre-trained model on single GPU? #88

Closed linzhiqiu closed 3 years ago

linzhiqiu commented 3 years ago

I noticed that in `main_lincls.py`, the default checkpoint-loading code is:

            # rename moco pre-trained keys
            state_dict = checkpoint['state_dict']
            for k in list(state_dict.keys()):
                # retain only encoder_q up to before the embedding layer
                if k.startswith('module.encoder_q') and not k.startswith('module.encoder_q.fc'):
                    # remove prefix
                    state_dict[k[len("module.encoder_q."):]] = state_dict[k]
                # delete renamed or unused k
                del state_dict[k]

Could someone explain what these lines are doing? In my experiments, this does not work when running on a single GPU.

beluis3d commented 3 years ago

Did you ever figure this out?

paganpasta commented 3 years ago

@linzhiqiu @beluis3d These lines ensure that the saved weights can be loaded into the model created earlier from torchvision. Because `load_state_dict` matches parameters by name, the `module.encoder_q.` prefix (added by `DistributedDataParallel` and the MoCo wrapper) has to be trimmed from the saved keys first. I don't think the number of GPUs should cause any issue with that code.
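To see what the renaming does, here is a minimal, self-contained sketch. It uses a tiny stand-in backbone instead of `torchvision.models.resnet50()` (so it runs without a real MoCo checkpoint), and simulates the checkpoint's `module.encoder_q.` prefix; everything else mirrors the loop in `main_lincls.py`:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the torchvision backbone used in main_lincls.py.
class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.fc = nn.Linear(8, 10)

model = Backbone()

# Simulate a MoCo checkpoint: DistributedDataParallel wraps the MoCo model,
# so every query-encoder key carries the 'module.encoder_q.' prefix.
state_dict = {'module.encoder_q.' + k: v for k, v in model.state_dict().items()}

# Same logic as main_lincls.py: keep encoder_q weights (except its fc head,
# which is retrained for linear classification) and strip the prefix so the
# names match the plain, unwrapped model.
for k in list(state_dict.keys()):
    if k.startswith('module.encoder_q') and not k.startswith('module.encoder_q.fc'):
        state_dict[k[len('module.encoder_q.'):]] = state_dict[k]
    del state_dict[k]

msg = model.load_state_dict(state_dict, strict=False)
print(sorted(msg.missing_keys))  # only the fc head is left uninitialized
```

After the loop, every key that matches the backbone loads cleanly; `strict=False` reports only the fc head as missing, which is exactly what `main_lincls.py` checks for before training the linear classifier. The same code works on a single GPU or CPU, because the prefix comes from how the checkpoint was saved, not from how you load it.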