JacobYuan7 / RLIPv2

[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
Apache License 2.0

problem of the v-coco fully finetuned model #10

Closed safsfsvvea closed 8 months ago

safsfsvvea commented 8 months ago

Thanks for your brilliant work. I downloaded the V-COCO fully fine-tuned model from the link provided in the README. When I try to run inference with the command

    python generate_vcoco_official.py \
        --param_path /PATH/TO/CHECKPOINT \
        --save_path vcoco.pickle \
        --hoi_path /PATH/TO/VCOCO/DATA

I get the following error:

    Traceback (most recent call last):
      File "generate_vcoco_official.py", line 594, in <module>
        main(args)
      File "generate_vcoco_official.py", line 431, in main
        load_info = model.load_state_dict(checkpoint['model'])
      File "/XXX/anaconda3/envs/rlip/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for DETRHOI:
        Missing key(s) in state_dict: "transformer.encoder.layers.0.self_attn.in_proj_weight", "transformer.encoder.layers.0.self_attn.in_proj_bias", "transformer.encoder.layers.0.self_attn.out_proj.weight", "transformer.encoder.layers.0.self_attn.out_proj.bias", "transformer.encoder.layers.1.self_attn.in_proj_weight", ......

I tried both the ResNet-50 and Swin-T models, and they have the same problem, so I wonder whether the downloaded checkpoints are at fault. I sincerely hope you can help me figure out what is going wrong.

JacobYuan7 commented 8 months ago

@safsfsvvea I think it is unlikely that the checkpoints are wrong. I suggest checking the keys of the state_dict; that might help. Without further information it is hard for me to give more specific instructions, so please provide more details if you can.
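
For readers hitting the same error, here is a minimal sketch of what checking the keys could look like. It assumes the checkpoint stores its weights under the `'model'` entry, as the `checkpoint['model']` call in the traceback suggests. The helper names `summarize_keys` and `load_checkpoint_keys` are hypothetical and not part of the RLIPv2 code, and the example key list is illustrative only.

```python
from collections import Counter


def summarize_keys(keys, depth=2):
    """Count parameter names by their first `depth` dot-separated components."""
    return Counter(".".join(key.split(".")[:depth]) for key in keys)


def load_checkpoint_keys(path):
    """Return the parameter names stored under the 'model' entry of a checkpoint."""
    import torch  # imported lazily so summarize_keys works without PyTorch

    checkpoint = torch.load(path, map_location="cpu")
    return list(checkpoint["model"].keys())


# Illustrative names only: the first two appear in the error message,
# the last two are hypothetical placeholders.
example_keys = [
    "transformer.encoder.layers.0.self_attn.in_proj_weight",
    "transformer.encoder.layers.0.self_attn.in_proj_bias",
    "transformer.decoder.layers.0.norm1.weight",
    "backbone.0.body.conv1.weight",
]

for prefix, count in sorted(summarize_keys(example_keys).items()):
    print(f"{prefix}: {count}")
```

With a real file, `summarize_keys(load_checkpoint_keys("/PATH/TO/CHECKPOINT"))` would give a compact overview of which modules the checkpoint actually contains, which can then be compared against the names listed as missing.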

JacobYuan7 commented 8 months ago

@safsfsvvea Btw, you should configure the model before loading the checkpoints (i.e., DETRHOI is not the model for RLIPv2).
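
One way to see whether the configured model matches a checkpoint is to compare the two sets of parameter names before loading. The sketch below is an illustration under assumptions, not the repository's own procedure: `diff_keys` is a hypothetical helper, and the checkpoint-side key in the toy example is made up to show a mismatch.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) parameter names, mirroring the two lists
    that load_state_dict reports when the architectures differ."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model expects, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)   # checkpoint has, model lacks
    return missing, unexpected


# Toy example: the model-side key comes from the error message,
# the checkpoint-side key is hypothetical.
model_side = ["transformer.encoder.layers.0.self_attn.in_proj_weight"]
checkpoint_side = ["transformer.encoder.layers.0.self_attn.hypothetical_weight"]

missing, unexpected = diff_keys(model_side, checkpoint_side)
print("missing:", missing)
print("unexpected:", unexpected)
```

In practice the inputs would be `model.state_dict().keys()` and `checkpoint["model"].keys()`. If many names differ on both sides, the model was most likely built with a different architecture than the one the checkpoint was trained with.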

safsfsvvea commented 8 months ago

> @safsfsvvea Btw, you should configure the model before loading the checkpoints (i.e., DETRHOI is not the model for RLIPv2).

Thanks for your quick reply! I am still a little confused. I want to run inference with the fully fine-tuned RLIPv2 model on V-COCO and evaluate it to reproduce the results in the paper. I ran the command

    python generate_vcoco_official.py --param_path /PATH/TO/CHECKPOINT --save_path vcoco.pickle --hoi_path /PATH/TO/VCOCO/DATA

using the checkpoint of the fully fine-tuned RLIPv2 model for V-COCO, RLIP_PDA_v2_VCOCO_R50_VGCOO365_COO365det_RQL_LSE_RPL_20e_L1_20e_checkpoint0019.pth. Should I replace it with the checkpoint from https://github.com/hitachi-rd-cv/qpic, qpic_resnet50_vcoco.pth? Or is there something else I need to do to configure the DETRHOI model? By the way, I assume DETRHOI only serves as a baseline for comparison and should not be used to evaluate the fully fine-tuned RLIPv2 model on V-COCO?

JacobYuan7 commented 8 months ago

@safsfsvvea I think you might try out this file. https://github.com/JacobYuan7/RLIPv2/blob/main/scripts/RLIP_ParSeDA/test_vcoco_official.sh

I hope this helps.

safsfsvvea commented 8 months ago

I have successfully reproduced your result, thanks for your help!