unable to load local weight

FoundationVision / Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

https://groma-mllm.github.io/

Apache License 2.0

483 stars, 55 forks

unable to load local weight #3

Closed (liukc19 closed this issue 2 months ago)

liukc19 commented 2 months ago

I manually downloaded the model weights from Hugging Face and tried fine-tuning the model with the following command:

bash scripts/vl_finetune.sh ./groma-7b-pretrain ./train_history/

But the program still tries to fetch the weights from the website. I added some debug output, and from what it printed it seems that the program is unable to load the local model weights?

Looking forward to your reply.

liukc19 commented 2 months ago

Sorry, the issue seems to be occurring here. So what is the vis_encoder_path?

machuofan commented 2 months ago

Thanks for the feedback. The bug occurs because the program is looking for a local DINOv2-L checkpoint to initialize CustomDDETRModel. This is not expected behavior. A quick fix is to delete the entry vis_encoder_path: checkpoints/dinov2-large from groma-7b-pretrain/config.json. I will fix the initialization logic soon.
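For anyone who prefers to script the quick fix above rather than edit the file by hand, here is a minimal sketch. It assumes config.json is plain JSON with vis_encoder_path as a top-level key; the helper name drop_config_key is hypothetical and not part of the Groma repository. Back up the file first.

```python
import json
from pathlib import Path


def drop_config_key(config_path, key="vis_encoder_path"):
    """Remove a top-level key from a JSON config file.

    Returns the removed value, or None if the key was not present.
    Hypothetical helper for illustration; not part of Groma.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if key not in config:
        # Nothing to do; leave the file untouched.
        return None
    removed = config.pop(key)
    # Write the config back without the key.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return removed


# Example (path taken from the thread; adjust to your checkout):
# drop_config_key("groma-7b-pretrain/config.json")
```

Calling it a second time is harmless: once the key is gone, the function returns None and does not rewrite the file.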

liukc19 commented 2 months ago

Thank you for your prompt feedback. I have encountered a new issue. Do you know why this is happening?

liukc19 commented 2 months ago

I think this error comes from a failed installation of mmcv. Could you clarify the relationship between the mmcv folder in your repository and the mmcv package?

machuofan commented 2 months ago

We inherited the mmcv folder from GPT4ROI. I think it originated from mmcv==1.4.7.
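To see whether a separately installed mmcv package is also present in the environment (and at which version), a small check like the following may help. This is only a sketch: the helper name installed_version is hypothetical, and it only inspects pip-installed distributions, not the mmcv folder vendored in the repository.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # mmcv 1.x was published under two distribution names: mmcv and mmcv-full.
    for name in ("mmcv", "mmcv-full"):
        found = installed_version(name)
        print(f"{name}: {found or 'not installed as a pip package'}")
```

If a version is reported that differs from 1.4.7, the installed package and the vendored folder may not agree, which is one possible source of import errors.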