Open Edwardmark opened 1 week ago
Specifically, when I set num_frames to 68, t in stdit_block is 20, and I cannot understand why. self.micro_frame_size = 17: why do we need the value 17? In vae_temporal.py, line 371, why do we use input[0] = 17 and then apply padding?
def get_latent_size(self, input_size):
    latent_size = []
    for i in range(3):
        if input_size[i] is None:
            lsize = None
        elif i == 0:
            time_padding = (
                0
                if (input_size[i] % self.time_downsample_factor == 0)
                else self.time_downsample_factor - input_size[i] % self.time_downsample_factor
            )
            lsize = (input_size[i] + time_padding) // self.patch_size[i]
        else:
            lsize = input_size[i] // self.patch_size[i]
        latent_size.append(lsize)
    return latent_size
Could you please explain a bit? Thanks. @sarroutbi @vjandrea @mahone3297 @duguyixiaono1
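One possible reading of the numbers above, as a rough sketch rather than a confirmed explanation: if the video is encoded in chunks of micro_frame_size frames and each chunk goes through the temporal branch of get_latent_size, then 68 frames would give 4 chunks of 17, and each chunk would pad 17 up to 20 and divide by the temporal patch size. The values time_downsample_factor = 4 and patch_size[0] = 4, and the chunking itself, are assumptions not stated in this thread; the helper names are hypothetical.

```python
def latent_frames(num_frames, time_downsample_factor=4, temporal_patch=4):
    """Temporal branch (i == 0) of get_latent_size, as a free function.

    The defaults (4 and 4) are assumed values, not taken from the thread.
    """
    remainder = num_frames % time_downsample_factor
    # Pad up to the next multiple of the downsample factor, if needed.
    time_padding = 0 if remainder == 0 else time_downsample_factor - remainder
    return (num_frames + time_padding) // temporal_patch


def total_latent_frames(num_frames, micro_frame_size=17, **kwargs):
    """Sum latent frames over chunks of micro_frame_size input frames.

    Assumes the video is split into consecutive chunks that are encoded
    independently; this chunking is a guess about the pipeline.
    """
    total = 0
    for start in range(0, num_frames, micro_frame_size):
        chunk = min(micro_frame_size, num_frames - start)
        total += latent_frames(chunk, **kwargs)
    return total


print(latent_frames(17))        # one chunk of 17 frames
print(total_latent_frames(68))  # four chunks of 17 frames
```

Under these assumptions a single 17-frame chunk maps to 5 latent frames and 68 input frames map to 20, which matches the t observed in stdit_block.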
This issue is stale because it has been open for 7 days with no activity.
How do I determine num_frames for a given video duration? I want to generate video using the num_frames config option, so how should num_frames be calculated from a duration t and an fps value? The code makes this look fairly complex. Could you please give me a simple equation? Thanks.
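As a starting point, the basic relation is num_frames = t × fps, rounded to a whole number. The sketch below shows that, with an optional step that rounds up to a multiple of micro_frame_size. Whether the project actually requires such a multiple is an assumption not confirmed in this thread, and the function name is hypothetical.

```python
import math


def num_frames_for_duration(seconds, fps, micro_frame_size=None):
    """Frame count for a clip of `seconds` at `fps` frames per second.

    If micro_frame_size is given, round up to the next multiple of it.
    That rounding is an assumed convenience, not a documented requirement.
    """
    frames = max(1, round(seconds * fps))
    if micro_frame_size:
        frames = math.ceil(frames / micro_frame_size) * micro_frame_size
    return frames


print(num_frames_for_duration(2, 24))                       # plain t * fps
print(num_frames_for_duration(2, 24, micro_frame_size=17))  # rounded up to a multiple of 17
```

For example, 2 seconds at 24 fps gives 48 frames, or 51 if rounded up to a multiple of 17.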