nihaomiao / CVPR23_LFDM

The PyTorch implementation of our CVPR 2023 paper "Conditional Image-to-Video Generation with Latent Flow Diffusion Models"
BSD 2-Clause "Simplified" License

Why is the grid generated well in the demo file, but the grid generated in the train_video_flow_diffusion_mhad_multiGPU file is distorted? #35

Closed by Foxerity 8 months ago

Foxerity commented 8 months ago

The grid generated by the demo file looks fine:

[image: 200_0000_a11_s4_t1_000_1 00]

The grid generated by the train_video_flow_diffusion_mhad_multiGPU file is distorted:

[image: B0010_S000000_a8_s1_t3_0]

I am referring to the GT (ground-truth) grid. The grid in the demo is square, but the one in the second image is folded. I'm using the same AE (autoencoder) weights in both cases.

Foxerity commented 8 months ago

I mean the grid in the first row of the picture below. I know the lower row looks wrong because the model is not well trained yet.

nihaomiao commented 8 months ago

Hi, @Foxerity, have you loaded the correct pre-trained stage-one latent autoencoder? The ground truth flow should not have any issues.
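One way to rule out a checkpoint mix-up is to confirm that the demo script and the training script really point at the same autoencoder file. The sketch below is only an illustration using the Python standard library: the function names `sha256_of_file` and `same_checkpoint` and the example paths are hypothetical and are not part of the LFDM repository.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read fixed-size chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_checkpoint(path_a, path_b):
    """Return True if both paths hold byte-identical files."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)


# Hypothetical usage: replace the paths with the ones each script is configured with.
# print(same_checkpoint("path/used/by/demo.pth", "path/used/by/training.pth"))
```

If the two digests match, the files are identical and the difference is more likely in how the weights are loaded or in the data fed to the model. In that case, note that PyTorch's `load_state_dict` returns the `missing_keys` and `unexpected_keys` it encountered, which can reveal a silently incomplete load when `strict=False` is used.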