Closed: ShramanPramanick closed this issue 1 year ago
One follow-up (slightly unrelated) question. Have you tried to fine-tune the mm_projector layers in the 1st stage too? In the current version, the 1st stage only unfreezes the spi_module parameters. Do you have any comments on the effect of unfreezing projection parameters in the 1st stage?
I faced the following error when I launched the 2nd stage of pre-training:

`ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group`

This error is likely because the number of trainable parameters differs between the 2nd stage and the 1st stage. How did you resolve this?
Apologies for the delayed response. I have fixed the training script in fix load stage1. The error occurred because the auto-resume function attempted to resume the optimizer state from stage 1.
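The mismatch is mechanical: `torch.optim.Optimizer.load_state_dict()` requires each saved parameter group to contain the same number of parameters as the corresponding group in the new optimizer, and stage 2 trains more parameters than stage 1. Below is a minimal, framework-free sketch of the size check that a resume path effectively has to pass (the group contents are made-up stand-ins, not the actual repository parameters):

```python
def can_resume_optimizer(saved_groups, current_groups):
    """Return True only when every optimizer parameter group matches in size."""
    if len(saved_groups) != len(current_groups):
        return False
    return all(
        len(s["params"]) == len(c["params"])
        for s, c in zip(saved_groups, current_groups)
    )

# Stage 1 optimized only the spi_module parameters; stage 2 unfreezes more
# weights, so the param group grows and loading the stage-1 optimizer state
# raises the ValueError above.
stage1_groups = [{"params": [0, 1]}]        # e.g. two spi_module tensors
stage2_groups = [{"params": [0, 1, 2, 3]}]  # spi_module + newly unfrozen weights
print(can_resume_optimizer(stage1_groups, stage2_groups))  # False
```

When this check fails, the safe behavior is simply to skip restoring the optimizer state and start the new stage with a fresh optimizer, which is what the fixed script amounts to.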
> One follow-up (slightly unrelated) question. Have you tried to fine-tune the mm_projector layers in the 1st stage too? In the current version, the 1st stage only unfreezes the spi_module parameters. Do you have any comments on the effect of unfreezing projection parameters in the 1st stage?
The main reason I did not train the mm_projector is that it was already pre-trained on a large amount of caption data in LLaVA. Fine-tuning it on only a relatively small amount of region data in stage 1 may actually degrade its expressive ability. However, if I applied object detection to the caption data in the same way as I did for Llava150k and added that caption data to stage 1, I believe it would improve the results.
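For reference, the stage-1 recipe described above amounts to toggling `requires_grad` per parameter, keyed on the parameter name. Here is a toy, framework-free sketch of that selection rule; the `Param` class and the parameter names are hypothetical stand-ins, not the repository's actual modules beyond spi_module and mm_projector:

```python
class Param:
    """Stand-in for a tensor with a requires_grad flag."""
    def __init__(self):
        self.requires_grad = True

def set_stage1_trainable(named_params):
    # Freeze everything except spi_module parameters, mirroring the
    # stage-1 setup discussed above; mm_projector stays frozen.
    for name, p in named_params.items():
        p.requires_grad = "spi_module" in name

params = {
    "spi_module.fc.weight": Param(),
    "mm_projector.weight": Param(),
    "layers.0.attn.q_proj.weight": Param(),
}
set_stage1_trainable(params)
print(params["spi_module.fc.weight"].requires_grad)  # True
print(params["mm_projector.weight"].requires_grad)   # False
```

With real PyTorch modules the same loop would run over `model.named_parameters()`, and the optimizer would be built only from parameters with `requires_grad=True`.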
> I faced the following error when I launched the 2nd stage of pre-training:
>
> `ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group`
>
> This error is likely because the number of trainable parameters differs between the 2nd stage and the 1st stage. How did you resolve this?
I have published the checkpoint and polished the README today, so you can now pull them and use this repository more smoothly.
Thanks @jshilong for open-sourcing your model and for your help!