Alpha-VLLM / Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation
MIT License
2.04k stars, 86 forks

Do you think using a VAE with more channels is good? #60

Closed: Manni1000 closed this issue 3 months ago

Manni1000 commented 3 months ago

Very impressive work. I like that you are using a decoder LLM. By the way, do you think using a VAE with more latent channels would be good?

zhuole1025 commented 3 months ago

Thanks! Yes, definitely. We are experimenting with the 16-channel SD3 VAE encoder.
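
For context, here is a rough sketch of what "more channels" changes in the latent tensor. It is not code from Lumina-T2X. It assumes an 8x spatial downsample for both VAEs and a 3-channel RGB input, and the helper names `latent_shape` and `compression_ratio` are made up for illustration.

```python
def latent_shape(height, width, channels, downsample=8):
    """Return the (C, H, W) latent shape for one image.

    Assumes the VAE shrinks each spatial side by `downsample`
    (8 is an assumption here, not something read from the repo).
    """
    if height % downsample or width % downsample:
        raise ValueError("image size must be divisible by the downsample factor")
    return (channels, height // downsample, width // downsample)


def compression_ratio(height, width, channels, downsample=8):
    """Ratio of RGB pixel values to latent values for one image."""
    c, h, w = latent_shape(height, width, channels, downsample)
    return (3 * height * width) / (c * h * w)


if __name__ == "__main__":
    # Compare a 4-channel latent with a 16-channel latent at 1024x1024.
    for name, ch in [("4-channel VAE", 4), ("16-channel VAE", 16)]:
        shape = latent_shape(1024, 1024, ch)
        ratio = compression_ratio(1024, 1024, ch)
        print(f"{name}: latent shape {shape}, compression {ratio:.0f}x")
```

Under these assumptions the 16-channel latent holds four times as many values per image, so the VAE compresses less aggressively and the generator has to model a wider latent.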

Manni1000 commented 3 months ago

Cool!