Open diffusion-lover opened 2 months ago
Thanks for sharing this great work!
I found two VAE folders in the t2v pretrained model: https://huggingface.co/maxin-cn/Latte/tree/main/t2v_required_models
- vae
- vae_temporal_decoder
It seems that the t2v model uses the "vae_temporal_decoder" pretrained model to decode latents. Is the "vae" pretrained model used to encode images when training the transformer network?
Yes, that is right.
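For anyone wiring this up, here is a minimal sketch of that split. It assumes the `diffusers` library is installed and that `model_dir` points to a local copy of `t2v_required_models`. The helper names `vae_subfolder` and `load_vae` are hypothetical and are not taken from the Latte repository.

```python
# Which VAE folder serves which role, as confirmed in this thread.
VAE_SUBFOLDERS = {
    "encode": "vae",                   # encode images/frames to latents during training
    "decode": "vae_temporal_decoder",  # decode latents to video frames at sampling time
}


def vae_subfolder(role: str) -> str:
    """Return the subfolder name for the given role ("encode" or "decode")."""
    if role not in VAE_SUBFOLDERS:
        raise ValueError(f"role must be 'encode' or 'decode', got {role!r}")
    return VAE_SUBFOLDERS[role]


def load_vae(model_dir: str, role: str):
    """Load the VAE for the given role from a local t2v_required_models directory.

    Assumption: the folders are in the standard diffusers layout, so
    from_pretrained(..., subfolder=...) can read them.
    """
    # Imported lazily so the mapping above can be used without diffusers installed.
    from diffusers import AutoencoderKL, AutoencoderKLTemporalDecoder

    cls = AutoencoderKL if role == "encode" else AutoencoderKLTemporalDecoder
    return cls.from_pretrained(model_dir, subfolder=vae_subfolder(role))
```

Usage would look like `load_vae("/path/to/t2v_required_models", "decode")` for sampling, and `role="encode"` when preparing latents for training.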