Closed lzn87 closed 1 month ago
Hi,
First off, thank you to the authors for open-sourcing this model! I am currently fine-tuning Show-o on LLaVA-style data. I noticed that `input_ids_mmu` is appended to `input_ids` in the non-CLIP version (see: https://github.com/showlab/Show-o/blob/7ce44993ef7f8b46c8fa374339beef17dc572033/training/train.py#L582C47-L582C60), but not in the CLIP-ViT version of Show-o. Is this expected, and if so, why?
Thank you very much in advance!