nocaps-org / updown-baseline

Baseline model for the nocaps benchmark, from the ICCV 2019 paper "nocaps: novel object captioning at scale".
https://nocaps.org
MIT License

Size mismatch issue while running the inference script. Can anyone help? #12

Open Mas-Y opened 3 years ago

Mas-Y commented 3 years ago

```
python scripts/inference.py --config configs/updown_nocaps_val.yaml \
    --checkpoint-path checkpoints/updown.pth \
    --output-path /results/predictions.json \
    --gpu-ids 0
```

```yaml
RANDOM_SEED: 0
DATA:
  CBS:
    CLASS_HIERARCHY: data/cbs/class_hierarchy.json
    INFER_BOXES: data/nocaps_val_oi_detector_boxes.json
    MAX_GIVEN_CONSTRAINTS: 3
    MAX_WORDS_PER_CONSTRAINT: 3
    NMS_THRESHOLD: 0.85
    WORDFORMS: data/cbs/constraint_wordforms.tsv
  INFER_CAPTIONS: data/nocaps/nocaps_val_image_info.json
  INFER_FEATURES: data/nocaps_val_vg_detector_features_adaptive.h5
  MAX_CAPTION_LENGTH: 20
  TRAIN_CAPTIONS: data/coco/captions_train2017.json
  TRAIN_FEATURES: data/coco_train2017_vg_detector_features_adaptive.h5
  VOCABULARY: data/vocabulary
MODEL:
  ATTENTION_PROJECTION_SIZE: 768
  BEAM_SIZE: 5
  EMBEDDING_SIZE: 1000
  HIDDEN_SIZE: 1200
  IMAGE_FEATURE_SIZE: 2048
  MIN_CONSTRAINTS_TO_SATISFY: 2
  USE_CBS: False
OPTIM:
  BATCH_SIZE: 150
  CLIP_GRADIENTS: 12.5
  LR: 0.015
  MOMENTUM: 0.9
  NUM_ITERATIONS: 70000
  WEIGHT_DECAY: 0.001
```

```
config            : configs/updown_nocaps_val.yaml
config_override   : []
gpu_ids           : [0]
cpu_workers       : 0
in_memory         : False
checkpoint_path   : checkpoints/updown.pth
output_path       : /results/predictions.json
evalai_submit     : False
```

```
Traceback (most recent call last):
  File "scripts/inference.py", line 117, in <module>
    model.load_state_dict(torch.load(_A.checkpoint_path)["model"])
  File "/root/anaconda3/envs/updown/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for UpDownCaptioner:
    size mismatch for _embedding_layer.weight: copying a param with shape torch.Size([10315, 1000]) from checkpoint, the shape in current model is torch.Size([10306, 1000]).
    size mismatch for _output_layer.weight: copying a param with shape torch.Size([10315, 1200]) from checkpoint, the shape in current model is torch.Size([10306, 1200]).
    size mismatch for _output_layer.bias: copying a param with shape torch.Size([10315]) from checkpoint, the shape in current model is torch.Size([10306]).
```
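A generic way to pinpoint which parameters disagree is to diff the shapes in the checkpoint's state dict against the current model's. A minimal torch-free sketch (shapes shown here as plain tuples for illustration; in practice they would come from `torch.load(checkpoint_path)["model"]` and `model.state_dict()`):

```python
# Sketch: report parameters whose shape differs between a checkpoint and the
# current model. With torch, build both dicts via
# {k: tuple(v.shape) for k, v in state_dict.items()}.

def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for differing params."""
    return {
        name: (ckpt_shape, model_shapes[name])
        for name, ckpt_shape in checkpoint_shapes.items()
        if name in model_shapes and model_shapes[name] != ckpt_shape
    }

# Shapes taken from the traceback above: every mismatched dimension is the
# vocabulary size (10315 in the checkpoint vs 10306 locally).
checkpoint_shapes = {
    "_embedding_layer.weight": (10315, 1000),
    "_output_layer.weight": (10315, 1200),
    "_output_layer.bias": (10315,),
}
model_shapes = {
    "_embedding_layer.weight": (10306, 1000),
    "_output_layer.weight": (10306, 1200),
    "_output_layer.bias": (10306,),
}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

Here all three mismatches share the same leading dimension, which points at the vocabulary rather than the architecture.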

kdexd commented 3 years ago

Hi @Mas-Y, I am unable to reproduce this issue at my end. Will post updates here if I am able to.

olvrhhn commented 2 years ago

First of all, thank you for your great work and for providing the code! I get the same error as @Mas-Y. It seems the checkpoint was trained with a differently sized vocabulary. Do you have an idea where I should look first for the error?
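Since every mismatched dimension equals the vocabulary size, a quick first check is whether the locally built vocabulary has the 10315 tokens the checkpoint expects. A minimal sketch, assuming the vocabulary directory contains a plain-text token list with one token per line (the filename `tokens.txt` and the path below are illustrative assumptions, not taken from the repo):

```python
# Sketch: count tokens in a local vocabulary file and compare against the
# vocabulary size implied by the checkpoint's embedding matrix (10315 rows,
# per the traceback). The file layout is an assumption; adjust to your setup.
from pathlib import Path

CHECKPOINT_VOCAB_SIZE = 10315  # rows of _embedding_layer.weight in the checkpoint

def local_vocab_size(tokens_file):
    """Count non-empty lines (one token per line) in a vocabulary file."""
    return sum(1 for line in Path(tokens_file).read_text().splitlines() if line)

tokens_file = Path("data/vocabulary/tokens.txt")  # hypothetical path
if tokens_file.exists():
    n = local_vocab_size(tokens_file)
    print(f"local vocabulary: {n} tokens; checkpoint expects {CHECKPOINT_VOCAB_SIZE}")
```

If the counts differ, regenerating the vocabulary from the same captions file the checkpoint was trained on (or downloading the vocabulary the authors used) should resolve the size mismatch.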