Because of classifier-free guidance, each batch is duplicated. So when classifier-free guidance is enabled, unet_chunk_size=2; otherwise, unet_chunk_size=1.
In your case, by the time execution reaches the attention layers, batch_size is 16 rather than 8.
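For concreteness, here is a minimal sketch of that duplication (not the repository's exact code; names like `num_frames` and `do_classifier_free_guidance` are illustrative):

```python
import torch

num_frames = 8
do_classifier_free_guidance = True
unet_chunk_size = 2 if do_classifier_free_guidance else 1

# Per-frame latents: (frames, channels, height, width)
latents = torch.randn(num_frames, 4, 64, 64)

# CFG concatenates the unconditional and conditional copies along the batch
# dimension, so the tensor reaching the attention layers has 2 * num_frames rows.
latent_model_input = (
    torch.cat([latents] * 2) if do_classifier_free_guidance else latents
)

batch_size = latent_model_input.shape[0]
print(batch_size)                     # 16, not 8
print(batch_size // unet_chunk_size)  # 8 == number of frames (the 'video_len')
```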
I understand. It's CFG! Thanks for the answer!
In the source code, 'video_len' is computed as 'batch_size // unet_chunk_size'. But if 8 frames are fed in at once, isn't 'batch_size' already equal to 'video_len'?
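A hedged sketch of the arithmetic inside an attention layer, assuming the batch is laid out as (chunk, frames) after the CFG duplication above (the tensor shapes here are made up for illustration):

```python
import torch

unet_chunk_size = 2                          # CFG enabled -> 2, otherwise 1
hidden_states = torch.randn(16, 4096, 320)   # (batch, sequence, channels) at an attention layer

batch_size = hidden_states.shape[0]          # 16 because of the CFG duplication
video_len = batch_size // unet_chunk_size    # 8, the real number of frames

# Regroup the batch as (chunk, frames, ...) so per-frame operations such as
# cross-frame attention can index individual frames again.
per_frame = hidden_states.reshape(unet_chunk_size, video_len, *hidden_states.shape[1:])
print(per_frame.shape)                       # torch.Size([2, 8, 4096, 320])
```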