RERV / VDT

[ICLR2024] The official implementation of paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding.

Training Code and Dataset format? #6

Open BingliangLi opened 9 months ago

BingliangLi commented 9 months ago

Fantastic work! Since a few months have passed and the paper has been accepted by ICLR (congrats!), would you please release the training code? Some instructions on how to prepare the dataset would also be great!

Related: https://github.com/RERV/VDT/issues/1#issuecomment-1813275903

yuedajiong commented 9 months ago

Find trainer code from diffusers or something similar and adapt it.

MeL0XIA commented 5 months ago

Fantastic work! Since a few months have passed and the paper has been accepted by ICLR (congrats!), would you please release the training code? Some instructions on how to prepare the dataset would also be great!

Related: #1 (comment)

Have you gotten the training code working? I have been working from an existing training script, but I'm not sure where to add the temporal (frame) dimension. Could you please let me know? Thank you.

MeL0XIA commented 5 months ago

Find trainer code from diffusers or something similar and adapt it.

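Do you have training code that runs end to end? I have been modifying the DiT training function, but I don't know how to add the temporal frame dimension. Could you share it?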

1585511010 commented 5 months ago

How exactly does the DiT training code need to be modified? Could you share it?
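
On the recurring question of where the frame dimension goes when adapting an image-diffusion training step: below is a minimal sketch, not the authors' training code, which has not been posted in this thread. It assumes video batches shaped `(B, T, C, H, W)`, a per-image encoder that only accepts `(N, C, H, W)`, and a simple scheme in which the first `cond_frames` frames are kept clean as conditioning. The names `encode_frames`, `noisy_video_batch`, `masked_mse`, `cond_frames`, and `toy_encoder` are hypothetical, and the masking is only an illustration of the general idea, not a reproduction of VDT. NumPy is used to keep the sketch dependency-light; the same reshapes apply to PyTorch tensors.

```python
import numpy as np

def encode_frames(video, encode_frame_batch):
    """Run a per-image encoder over every frame of a video batch.

    video: (B, T, C, H, W). The encoder only understands (N, C, H, W),
    so time is folded into the batch axis and unfolded afterwards.
    """
    b, t, c, h, w = video.shape
    flat = video.reshape(b * t, c, h, w)
    latents = encode_frame_batch(flat)               # (B*T, C', H', W')
    return latents.reshape(b, t, *latents.shape[1:])  # (B, T, C', H', W')

def noisy_video_batch(latents, alphas_cumprod, cond_frames, rng):
    """Apply the standard DDPM forward process to a batch of video latents.

    One timestep is drawn per video and shared by all of its frames.
    The first `cond_frames` frames stay clean (assumed conditioning scheme).
    """
    b, t = latents.shape[:2]
    steps = rng.integers(0, len(alphas_cumprod), size=b)
    noise = rng.standard_normal(latents.shape)
    a = alphas_cumprod[steps].reshape(b, 1, 1, 1, 1)
    noisy = np.sqrt(a) * latents + np.sqrt(1.0 - a) * noise
    mask = np.zeros((b, t, 1, 1, 1))
    mask[:, cond_frames:] = 1.0                      # 1 = frame to be denoised
    model_input = mask * noisy + (1.0 - mask) * latents
    return model_input, noise, mask, steps

def masked_mse(pred_noise, noise, mask):
    """Noise-prediction MSE averaged over the noised frames only."""
    mask_full = np.broadcast_to(mask, noise.shape)
    return float((((pred_noise - noise) ** 2) * mask_full).sum() / mask_full.sum())

# Usage with stand-ins: a 4x spatial downsample plays the role of the encoder,
# and an all-zero prediction plays the role of the model output.
rng = np.random.default_rng(0)
video = rng.standard_normal((2, 8, 3, 32, 32))
toy_encoder = lambda x: x[:, :, ::4, ::4]
latents = encode_frames(video, toy_encoder)
alphas_cumprod = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
x, noise, mask, steps = noisy_video_batch(latents, alphas_cumprod, cond_frames=2, rng=rng)
loss = masked_mse(np.zeros_like(noise), noise, mask)
```

In an actual training loop the zero prediction would be replaced by the model's output for `(x, steps)`, and the model itself would need to mix information across the `T` axis, for example by reshaping tokens between per-frame and per-position layouts before attention.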