GAIR-NLP / anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
https://huggingface.co/spaces/ethanchern/Anole
641 stars, 36 forks

Where is the VQGAN decoder coming from? #32

Closed: Yuheng-Li closed this issue 1 month ago

Yuheng-Li commented 1 month ago

Hi team, great work! I have a quick question after reading the paper. It seems you only fine-tune part of Chameleon's last head. But in order to generate images, one also needs the VQGAN decoder. Where does this decoder come from? Did you train it yourselves, or did Chameleon release it? Thanks.

Yuheng-Li commented 1 month ago

Oh, I just saw issue #9. You simply used the decoder released with Chameleon. Thanks!
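
For readers landing on this thread, the setup discussed above can be sketched roughly as follows. This is a toy, pure-Python illustration and not Anole's or Chameleon's actual code: the sizes, the image-token range, and the helper names `masked_sgd_step` and `decode_tokens` are all hypothetical. It only shows the idea of updating some rows of an output head while everything else, including a pretrained decoder, stays frozen and is reused as released.

```python
# Toy sketch only. All sizes, names, and the image-token range are assumptions
# made for illustration; they are not taken from the Anole or Chameleon code.
VOCAB_SIZE = 8
HIDDEN = 3
IMAGE_TOKEN_IDS = range(4, 8)  # assumed: head rows that score image tokens


def masked_sgd_step(head, grad, lr, trainable_rows):
    """Apply one SGD step to the rows in `trainable_rows`; copy all other rows unchanged."""
    rows = set(trainable_rows)
    return [
        [w - lr * g for w, g in zip(w_row, g_row)] if i in rows else list(w_row)
        for i, (w_row, g_row) in enumerate(zip(head, grad))
    ]


def decode_tokens(token_ids, codebook):
    """Stand-in for a frozen, pretrained decoder: look up each image token's code vector."""
    return [codebook[t] for t in token_ids]


# Output head: one weight row per vocabulary entry.
head = [[1.0] * HIDDEN for _ in range(VOCAB_SIZE)]
grad = [[0.5] * HIDDEN for _ in range(VOCAB_SIZE)]

# Only the (assumed) image-token rows move; the text-token rows stay frozen.
new_head = masked_sgd_step(head, grad, lr=0.1, trainable_rows=IMAGE_TOKEN_IDS)

# The codebook is never trained here; it is simply reused, as a released decoder would be.
codebook = {t: [float(t), float(t) + 0.5] for t in IMAGE_TOKEN_IDS}
latents = decode_tokens([5, 4], codebook)
```

In a real model the lookup would be followed by the decoder network that turns the latent grid into pixels; the sketch stops at the lookup because the thread only establishes that the decoder is the one released with Chameleon, not how it is invoked.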