kennethwdk / LocLLM

Code for "LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model", CVPR 2024 Highlight

Model checkpoints #3

[Open] SebastienLinker opened this issue 4 months ago

SebastienLinker commented 4 months ago

Hi,

Thanks for this amazing work.

I tried to run the demo from the README, but I cannot get any keypoints and I get errors from the torch.load() function, so I am wondering whether I downloaded all the required data and picked the right checkpoints.

Do we need to retrain on our end? Could you also provide a fully functional model?
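For reference, this is roughly the sanity check I ran on the downloaded files (the path and shard filename below are placeholders for my local setup, not necessarily the repo's actual names):

```python
import torch

# Hypothetical sanity check: load one downloaded shard directly and make
# sure it unpickles into a dict of tensors (a truncated download would
# typically raise an UnpicklingError or a zip-archive error here).
shard = "checkpoints/ckpts/coco/pytorch_model-00001-of-00002.bin"
state = torch.load(shard, map_location="cpu")
print(len(state), sorted(state)[:3])
```

The demo itself prints the following: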

```text
dino init: _IncompatibleKeys(missing_keys=['blocks.0.attn.lora_A_v', 'blocks.0.attn.lora_B_v', 'blocks.0.attn.lora_A_q', 'blocks.0.attn.lora_B_q', 'blocks.1.attn.lora_A_v', 'blocks.1.attn.lora_B_v', 'blocks.1.attn.lora_A_q', 'blocks.1.attn.lora_B_q', 'blocks.2.attn.lora_A_v', 'blocks.2.attn.lora_B_v', 'blocks.2.attn.lora_A_q', 'blocks.2.attn.lora_B_q', 'blocks.3.attn.lora_A_v', 'blocks.3.attn.lora_B_v', 'blocks.3.attn.lora_A_q', 'blocks.3.attn.lora_B_q', 'blocks.4.attn.lora_A_v', 'blocks.4.attn.lora_B_v', 'blocks.4.attn.lora_A_q', 'blocks.4.attn.lora_B_q', 'blocks.5.attn.lora_A_v', 'blocks.5.attn.lora_B_v', 'blocks.5.attn.lora_A_q', 'blocks.5.attn.lora_B_q', 'blocks.6.attn.lora_A_v', 'blocks.6.attn.lora_B_v', 'blocks.6.attn.lora_A_q', 'blocks.6.attn.lora_B_q', 'blocks.7.attn.lora_A_v', 'blocks.7.attn.lora_B_v', 'blocks.7.attn.lora_A_q', 'blocks.7.attn.lora_B_q', 'blocks.8.attn.lora_A_v', 'blocks.8.attn.lora_B_v', 'blocks.8.attn.lora_A_q', 'blocks.8.attn.lora_B_q', 'blocks.9.attn.lora_A_v', 'blocks.9.attn.lora_B_v', 'blocks.9.attn.lora_A_q', 'blocks.9.attn.lora_B_q', 'blocks.10.attn.lora_A_v', 'blocks.10.attn.lora_B_v', 'blocks.10.attn.lora_A_q', 'blocks.10.attn.lora_B_q', 'blocks.11.attn.lora_A_v', 'blocks.11.attn.lora_B_v', 'blocks.11.attn.lora_A_q', 'blocks.11.attn.lora_B_q', 'blocks.12.attn.lora_A_v', 'blocks.12.attn.lora_B_v', 'blocks.12.attn.lora_A_q', 'blocks.12.attn.lora_B_q', 'blocks.13.attn.lora_A_v', 'blocks.13.attn.lora_B_v', 'blocks.13.attn.lora_A_q', 'blocks.13.attn.lora_B_q', 'blocks.14.attn.lora_A_v', 'blocks.14.attn.lora_B_v', 'blocks.14.attn.lora_A_q', 'blocks.14.attn.lora_B_q', 'blocks.15.attn.lora_A_v', 'blocks.15.attn.lora_B_v', 'blocks.15.attn.lora_A_q', 'blocks.15.attn.lora_B_q', 'blocks.16.attn.lora_A_v', 'blocks.16.attn.lora_B_v', 'blocks.16.attn.lora_A_q', 'blocks.16.attn.lora_B_q', 'blocks.17.attn.lora_A_v', 'blocks.17.attn.lora_B_v', 'blocks.17.attn.lora_A_q', 'blocks.17.attn.lora_B_q', 'blocks.18.attn.lora_A_v', 'blocks.18.attn.lora_B_v', 'blocks.18.attn.lora_A_q', 'blocks.18.attn.lora_B_q', 'blocks.19.attn.lora_A_v', 'blocks.19.attn.lora_B_v', 'blocks.19.attn.lora_A_q', 'blocks.19.attn.lora_B_q', 'blocks.20.attn.lora_A_v', 'blocks.20.attn.lora_B_v', 'blocks.20.attn.lora_A_q', 'blocks.20.attn.lora_B_q', 'blocks.21.attn.lora_A_v', 'blocks.21.attn.lora_B_v', 'blocks.21.attn.lora_A_q', 'blocks.21.attn.lora_B_q', 'blocks.22.attn.lora_A_v', 'blocks.22.attn.lora_B_v', 'blocks.22.attn.lora_A_q', 'blocks.22.attn.lora_B_q', 'blocks.23.attn.lora_A_v', 'blocks.23.attn.lora_B_v', 'blocks.23.attn.lora_A_q', 'blocks.23.attn.lora_B_q'], unexpected_keys=[])
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:14<00:00,  7.19s/it]
Some weights of the model checkpoint at /home/.../checkpoints/ckpts/coco were not used when initializing LocLLMModel: ['vision_model.blocks.11.ls2.weight', 'vision_model.blocks.17.ls1.weight', 'vision_model.blocks.12.ls2.weight', 'vision_model.blocks.14.ls1.weight', 'vision_model.blocks.18.ls2.weight', 'vision_model.blocks.15.ls1.weight', 'vision_model.blocks.2.ls1.weight', 'vision_model.blocks.0.ls2.weight', 'vision_model.blocks.10.ls2.weight', 'vision_model.blocks.1.ls1.weight', 'vision_model.blocks.22.ls2.weight', 'vision_model.blocks.8.ls1.weight', 'vision_model.blocks.22.ls1.weight', 'vision_model.blocks.3.ls2.weight', 'vision_model.blocks.2.ls2.weight', 'vision_model.blocks.19.ls2.weight', 'vision_model.blocks.11.ls1.weight', 'vision_model.blocks.7.ls2.weight', 'vision_model.blocks.1.ls2.weight', 'vision_model.blocks.16.ls1.weight', 'vision_model.blocks.9.ls2.weight', 'vision_model.blocks.20.ls2.weight', 'vision_model.blocks.6.ls1.weight', 'vision_model.blocks.17.ls2.weight', 'vision_model.blocks.23.ls2.weight', 'vision_model.blocks.4.ls1.weight', 'vision_model.blocks.3.ls1.weight', 'vision_model.blocks.5.ls1.weight', 'vision_model.blocks.13.ls1.weight', 'vision_model.blocks.18.ls1.weight', 'vision_model.blocks.14.ls2.weight', 'vision_model.blocks.13.ls2.weight', 'vision_model.blocks.4.ls2.weight', 'vision_model.blocks.15.ls2.weight', 'vision_model.blocks.9.ls1.weight', 'vision_model.blocks.21.ls1.weight', 'vision_model.blocks.19.ls1.weight', 'vision_model.blocks.8.ls2.weight', 'vision_model.blocks.10.ls1.weight', 'vision_model.blocks.0.ls1.weight', 'vision_model.blocks.7.ls1.weight', 'vision_model.blocks.16.ls2.weight', 'vision_model.blocks.23.ls1.weight', 'vision_model.blocks.6.ls2.weight', 'vision_model.blocks.12.ls1.weight', 'vision_model.blocks.21.ls2.weight', 'vision_model.blocks.20.ls1.weight', 'vision_model.blocks.5.ls2.weight']
- This IS expected if you are initializing LocLLMModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LocLLMModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of LocLLMModel were not initialized from the model checkpoint at /home/.../checkpoints/ckpts/coco and are newly initialized: ['vision_model.blocks.10.ls2.gamma', 'vision_model.blocks.12.ls2.gamma', 'vision_model.blocks.2.ls2.gamma', 'vision_model.blocks.12.ls1.gamma', 'vision_model.blocks.3.ls2.gamma', 'vision_model.blocks.1.ls1.gamma', 'vision_model.blocks.11.ls2.gamma', 'vision_model.blocks.17.ls1.gamma', 'vision_model.blocks.20.ls1.gamma', 'vision_model.blocks.5.ls1.gamma', 'vision_model.blocks.3.ls1.gamma', 'vision_model.blocks.6.ls2.gamma', 'vision_model.blocks.16.ls1.gamma', 'vision_model.blocks.9.ls2.gamma', 'vision_model.blocks.23.ls2.gamma', 'vision_model.blocks.10.ls1.gamma', 'vision_model.blocks.18.ls1.gamma', 'vision_model.blocks.0.ls2.gamma', 'vision_model.blocks.18.ls2.gamma', 'vision_model.blocks.22.ls1.gamma', 'vision_model.blocks.5.ls2.gamma', 'vision_model.blocks.11.ls1.gamma', 'vision_model.blocks.19.ls2.gamma', 'vision_model.blocks.19.ls1.gamma', 'vision_model.blocks.4.ls1.gamma', 'vision_model.blocks.23.ls1.gamma', 'vision_model.blocks.8.ls1.gamma', 'vision_model.blocks.21.ls2.gamma', 'vision_model.blocks.22.ls2.gamma', 'vision_model.blocks.8.ls2.gamma', 'vision_model.blocks.4.ls2.gamma', 'vision_model.blocks.15.ls2.gamma', 'vision_model.blocks.1.ls2.gamma', 'vision_model.blocks.2.ls1.gamma', 'vision_model.blocks.20.ls2.gamma', 'vision_model.blocks.13.ls1.gamma', 'vision_model.blocks.17.ls2.gamma', 'vision_model.blocks.9.ls1.gamma', 'vision_model.blocks.21.ls1.gamma', 'vision_model.blocks.14.ls2.gamma', 'vision_model.blocks.7.ls2.gamma', 'vision_model.blocks.15.ls1.gamma', 'vision_model.blocks.13.ls2.gamma', 'vision_model.blocks.6.ls1.gamma', 'vision_model.blocks.14.ls1.gamma', 'vision_model.blocks.16.ls2.gamma', 'vision_model.blocks.7.ls1.gamma', 'vision_model.blocks.0.ls1.gamma']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
```
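If I understand the first warning correctly, the missing blocks.*.attn.lora_A_*/lora_B_* keys from the dino init are probably harmless: LoRA adapters are new parameters that would not exist in a plain pretrained backbone, so loading it with strict=False is expected to report them as missing. A toy reproduction of that shape of warning (my own illustration, not the repo's code):

```python
import torch
import torch.nn as nn

# Toy module that adds LoRA factors on top of a plain linear layer.
class LoraLinear(nn.Linear):
    def __init__(self, dim, rank=4):
        super().__init__(dim, dim)
        self.lora_A = nn.Parameter(torch.zeros(rank, dim))
        self.lora_B = nn.Parameter(torch.zeros(dim, rank))

plain = nn.Linear(8, 8)
lora = LoraLinear(8)
# Loading the plain layer's weights reports the fresh LoRA tensors as
# missing_keys — the same _IncompatibleKeys pattern as in the log above.
print(lora.load_state_dict(plain.state_dict(), strict=False))
```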
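The second block looks more serious to me: the checkpoint's LayerScale keys come through as vision_model.blocks.*.ls1.weight / ls2.weight, while the model defines them as ls1.gamma / ls2.gamma, so all of those parameters get re-initialized randomly, which could by itself explain why I get no keypoints. I can't tell whether the shards on disk really store .weight, or whether they store .gamma and the loader is renaming them (older transformers versions silently rename checkpoint keys containing "gamma" to "weight" for legacy TF compatibility). A quick way to check, plus a possible stopgap if the disk keys really are .weight (path/filename are placeholders again):

```python
import torch

# Inspect what the shard actually stores, then optionally rename the
# LayerScale keys to match the parameter names the model defines.
shard = "checkpoints/ckpts/coco/pytorch_model-00001-of-00002.bin"
state = torch.load(shard, map_location="cpu")
print([k for k in state if ".ls1." in k][:2])  # .weight or .gamma?

fixed = {
    k.replace(".weight", ".gamma")
    if (".ls1.weight" in k or ".ls2.weight" in k) else k: v
    for k, v in state.items()
}
torch.save(fixed, shard)  # would need to be applied to every shard
```

If the disk keys already say .gamma, then the rename above won't help and the fix would have to go into the loading path instead. Either way, some guidance (or a re-exported checkpoint) would be much appreciated.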