baaivision / Emu

Emu Series: Generative Multimodal Models from BAAI
https://baaivision.github.io/emu2/
Apache License 2.0
1.66k stars · 86 forks

question about the visual autoencoder #55

Open Junction4Nako opened 11 months ago

Junction4Nako commented 11 months ago

Thanks for the great work! I have some questions about the checkpoints:

  1. It seems that BAAI/Emu2 does not include the weights of the visual decoder (the diffusion UNet), but according to Section 2.2.3 of the paper, Emu2 should include the autoencoder-trained decoder, right?
  2. Emu2-Gen does provide the visual decoder weights. Can the visual encoder and decoder in BAAI/Emu2-Gen work together as an autoencoder? Looking forward to your reply~
ryanzhangfan commented 11 months ago

Thanks for your interest in our work!

  1. The visual decoder of Emu2. As stated in the paper, the visual encoder is frozen during the training of both Emu2-Gen and the visual decoder. Hence, Emu2 and Emu2-Gen share exactly the same visual decoder, and the visual decoder weights shipped with Emu2-Gen can be used directly with Emu2.
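Since only the decoder differs between the two checkpoints, transplanting it amounts to copying the decoder's parameters from one state dict into the other. The sketch below illustrates this with toy stand-in modules; the module names (`visual_encoder`, `visual_decoder`) and shapes are hypothetical, not the actual Emu2 layout:

```python
import torch
import torch.nn as nn

# Conceptual sketch, NOT the real Emu2 API: because the visual encoder is
# frozen during decoder training, the decoder from the Emu2-Gen checkpoint
# can be dropped into Emu2 by copying only the decoder parameters.

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.visual_encoder = nn.Linear(8, 4)  # stand-in for the ViT encoder
        self.visual_decoder = nn.Linear(4, 8)  # stand-in for the diffusion UNet

emu2 = Model()
emu2_gen = Model()

# Keep only the visual-decoder entries from the Emu2-Gen state dict ...
decoder_state = {
    k: v
    for k, v in emu2_gen.state_dict().items()
    if k.startswith("visual_decoder.")
}

# ... and load them into Emu2; strict=False leaves all other weights untouched.
missing, unexpected = emu2.load_state_dict(decoder_state, strict=False)

# The two models now share identical decoder weights.
assert torch.equal(emu2.visual_decoder.weight, emu2_gen.visual_decoder.weight)
```

With `strict=False`, the encoder keys simply show up in the `missing` list and nothing else is modified, which is the usual idiom for partial checkpoint loading in PyTorch.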

  2. The autoencoder paradigm. Yes, the visual encoder and the visual decoder can work together as an autoencoder. Our pipeline currently supports generating output in an autoencoding manner; you can find instructions in the HF version of the model or the native PyTorch version (at the bottom of the example code).
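The autoencoding path described above is just encode-then-decode with no intermediate text conditioning. Here is a minimal conceptual sketch with dummy modules standing in for the real encoder and diffusion decoder; none of these layers or shapes come from the actual Emu2 code:

```python
import torch
import torch.nn as nn

# Conceptual sketch of the autoencoding round trip (hypothetical modules,
# not the real Emu2 pipeline): an image is encoded to visual embeddings,
# then decoded straight back to pixel space.

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 16, 32))
decoder = nn.Sequential(nn.Linear(32, 3 * 16 * 16), nn.Unflatten(1, (3, 16, 16)))

image = torch.randn(1, 3, 16, 16)          # dummy input image (N, C, H, W)
latent = encoder(image)                    # visual embeddings from the encoder
reconstruction = decoder(latent)           # decoded back to image space

# A well-trained pair would make reconstruction approximate image;
# here we only check that the round trip preserves the tensor shape.
assert reconstruction.shape == image.shape
```

In the real model the decoder is a diffusion UNet rather than a single linear layer, but the interface is the same: whatever the encoder emits is the only input the decoder needs to reconstruct the image.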