Open · trmzpi02 opened this issue 1 month ago
Hello! I am very interested in your work, and I see that you have released the weights of Show-o from before fine-tuning on the LLaVA instruction-tuning datasets.

I have the following two questions:

1. The README recommends fine-tuning from the show-o-512x512-wo-llava-tuning checkpoint. Why not fine-tune from show-o-512x512 instead? Is it because performance on certain downstream tasks degrades after fine-tuning on the LLaVA instruction-tuning datasets?
2. If I want to fine-tune on certain visual downstream tasks, which checkpoint should I use?

Hi, thanks for your interest in our work. If you'd like to reproduce our results, you can use the pre-trained checkpoint directly. Also, because the final checkpoint was already fine-tuned on the LLaVA data, further fine-tuning will degrade its performance (overfitting). If you have new training data, I think you can directly fine-tune the final checkpoint.
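For anyone following along, here is a minimal fine-tuning sketch that starts from the show-o-512x512-wo-llava-tuning checkpoint, as recommended above. It is only a sketch under assumptions: it presumes the repo's Showo class exposes a Hugging Face-style from_pretrained (as the inference scripts suggest), that the Hub id below matches the README's checkpoint name, and that my_downstream_loader and compute_loss are hypothetical stand-ins for your own task-specific data pipeline and training objective.

```python
import torch

# Assumption: the repo's Showo class exposes a Hugging Face-style
# `from_pretrained`, as the inference scripts suggest.
from models import Showo

device = "cuda" if torch.cuda.is_available() else "cpu"

# Start from the checkpoint *without* LLaVA instruction tuning, as the
# README recommends for further fine-tuning. The Hub id is assumed to
# match the README's checkpoint name; adjust it if your copy differs.
model = Showo.from_pretrained("showlab/show-o-512x512-wo-llava-tuning").to(device)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.01)

# Hypothetical placeholders: swap in your own downstream dataset and the
# training objective you need (t2i, mmu, etc.).
my_downstream_loader: list = []

def compute_loss(model, batch):
    raise NotImplementedError("plug in your task-specific loss here")

# Standard PyTorch fine-tuning loop over the downstream data.
for batch in my_downstream_loader:
    optimizer.zero_grad()
    loss = compute_loss(model, batch)
    loss.backward()
    optimizer.step()
```

If you only want to reproduce the reported results rather than adapt to a new task, skip the loop and evaluate the final show-o-512x512 checkpoint as released.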