Quick Question about Training

How do we train and run the MVDiT or STDiT transformer in the repo on OpenVidHD-0.4M? Do we just have to change the first seven lines of the MVDiT config file like this:

num_frames = 16
frame_interval = 3
image_size = (1080, 1920)

root = "dataset/OpenVidHD-0.4M/video"
data_path = "dataset/OpenVid-1M/data/train/OpenVidHD-0.4M.csv"

Or are there additional steps involved in training? If so, please share them.
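
Before launching training with these paths, it may help to confirm that every video listed in the CSV actually exists under `root`. The following is only a minimal sketch, not part of the repo: the helper name `find_missing_videos` is hypothetical, and it assumes the CSV has a header row with a `video` column holding file names relative to `root` (adjust `column` if the file is laid out differently).

```python
import csv
from pathlib import Path


def find_missing_videos(csv_path, root, column="video"):
    """Return the entries of `column` that have no matching file under `root`.

    Assumption: `column` holds file names relative to `root`.
    """
    root = Path(root)
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            name = row[column]
            if not (root / name).is_file():
                missing.append(name)
    return missing


if __name__ == "__main__":
    # Paths copied from the config lines above.
    missing = find_missing_videos(
        "dataset/OpenVid-1M/data/train/OpenVidHD-0.4M.csv",
        "dataset/OpenVidHD-0.4M/video",
    )
    print(f"{len(missing)} listed videos not found under root")
```

If the count is non-zero, the `root` and `data_path` values probably do not point at matching data.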

Also, is it possible to train the MVDiT or STDiT transformer provided in the repo on 1080 x 1920 resolution videos at 60 fps? If so, do we just have to change the first three lines of the MVDiT config file like this:

num_frames = 60
frame_interval = 1
image_size = (1080, 1920)

and supply the dataset path, or are there other steps involved? If so, please share them.
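
To make the two settings concrete, here is a small arithmetic sketch of how I understand them. It assumes `frame_interval` is the stride between sampled source frames and that the source videos are 60 fps. Neither is confirmed by the repo, and `clip_span` is a hypothetical helper.

```python
def clip_span(num_frames, frame_interval, source_fps):
    """Return (source frames spanned by one clip, effective fps of the clip).

    Assumption: every `frame_interval`-th source frame is sampled.
    """
    span_frames = (num_frames - 1) * frame_interval + 1
    effective_fps = source_fps / frame_interval
    return span_frames, effective_fps


if __name__ == "__main__":
    for num_frames, frame_interval in [(16, 3), (60, 1)]:
        span, fps = clip_span(num_frames, frame_interval, source_fps=60)
        print(f"num_frames={num_frames}, frame_interval={frame_interval}: "
              f"spans {span} source frames, effective {fps:g} fps")
```

Under these assumptions, only `frame_interval = 1` keeps the full 60 fps, and `num_frames = 60` then covers one second of video per clip.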

Thanks in advance!

NJU-PCALab / OpenVid-1M