In the training step, do you shuffle clips? / Number of clips

Hi, first of all, thank you for your implementation.

I have a few questions about your training process. (1) Did you fix the number of frames (clips) at 8? Does that imply that any other number, larger or smaller, does not perform as well as 8?

(2) In the training step, do you shuffle the order of frames (clips)? I suspect that shuffling the frames is not appropriate, because the frame-related attention layers presumably learn the order of frames as well. However, when instantiating the DataLoader in your 'train.py', you set shuffle to True. So I am wondering whether you shuffled the frames intentionally and, if so, whether it leads to better training.
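To make question (2) concrete, here is a minimal sketch of my understanding of `shuffle=True` in a PyTorch `DataLoader`. It uses a toy dataset, not the code from 'train.py': the class `ToyClipDataset`, the clip shape, and the assumption that each dataset item is one whole clip are all hypothetical. As far as I know, `shuffle=True` only permutes the order in which dataset items are drawn and does not touch the frame axis inside each item, so whether frames get shuffled would depend on what a single item is.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyClipDataset(Dataset):
    """Hypothetical dataset: each item is one clip of shape (frames, C, H, W)."""

    def __init__(self, num_clips=4, num_frames=8):
        self.num_clips = num_clips
        self.num_frames = num_frames

    def __len__(self):
        return self.num_clips

    def __getitem__(self, idx):
        # Frame t of clip idx is filled with the value idx * 100 + t,
        # so both the clip index and the frame index can be read back.
        ids = torch.arange(self.num_frames, dtype=torch.float32) + idx * 100
        return ids.view(-1, 1, 1, 1).expand(-1, 3, 2, 2).clone()


loader = DataLoader(ToyClipDataset(), batch_size=1, shuffle=True)

clip_order = []          # order in which clips were drawn (may be permuted)
frames_in_order = True   # stays True if frames inside each clip keep their order

for batch in loader:
    clip = batch[0]               # shape: (frames, C, H, W)
    ids = clip[:, 0, 0, 0]        # one marker value per frame
    clip_order.append(int(ids[0].item()) // 100)
    frames_in_order = frames_in_order and bool(torch.all(ids[1:] - ids[:-1] == 1))

print("clip order:", clip_order)
print("frames kept in order within each clip:", frames_in_order)
```

If each item in 'train.py' is instead a single frame, then `shuffle=True` would indeed shuffle frames, which is the case I am asking about.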

Thank you again :)

bryandlee / Tune-A-Video