jy0205 / Pyramid-Flow

Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
https://pyramid-flow.github.io/
MIT License

Why is the output of get_vae_latent different from the input? #165

Closed tangdong1994 closed 2 weeks ago

tangdong1994 commented 2 weeks ago

When I pass in, for example, `video = torch.randn(4, 16, 8, 512, 512)` and call `noisy_latent, _, _, _ = get_vae_latent(video)`, I get `noisy_latent[0][0].shape == (1, 16, 8, 128, 128)`. I am puzzled as to why the batch size I input is 4, but the batch size of the resulting noisy latent is 1.

tangdong1994 commented 2 weeks ago

Ah, I get it, it has been transformed into a list of length 4.
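
For readers who hit the same confusion, here is a minimal sketch of how a batch of 4 can end up as a list of 4 tensors that each have batch size 1. It is only an illustration under assumptions: it does not reproduce the repository's `get_vae_latent`, the names `batch`, `per_sample`, and `restored` are hypothetical, and the spatial size is scaled down to keep the example light.

```python
import torch

# Hypothetical stand-in for a batched latent: (batch, channels, frames, height, width).
# The spatial size is reduced here; this is not the output of get_vae_latent.
batch = torch.randn(4, 16, 8, 32, 32)

# Splitting along dim 0 in chunks of size 1 yields one tensor per sample,
# and each tensor keeps a leading batch dimension of 1.
per_sample = list(torch.split(batch, 1, dim=0))

print(len(per_sample))            # number of list entries equals the original batch size
print(per_sample[0].shape)        # each entry has batch size 1

# Concatenating along dim 0 recovers the original batched tensor.
restored = torch.cat(per_sample, dim=0)
print(restored.shape)
```

If the output really is organized this way, checking `len(...)` on the returned list before indexing into it shows where the batch dimension went.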