RERV / VDT

[ICLR2024] The official implementation of the paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding.

Some confusion about the code. #2

Closed by jiangchaokang 1 year ago

jiangchaokang commented 1 year ago

Is the network architecture in the inference code you provide the same as the one used during training? Are there any inconsistencies between the training and inference code? Great job, and I look forward to your reply. Thanks in advance.

RERV commented 1 year ago

Thank you for your question!

The inference code we provide uses the same network architecture as training. This holds for all three training processes: unconditional training, conditional training, and our up-to-date unified mask training.
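
For readers who want to check this kind of consistency themselves, one possible approach is to compare the parameter names and shapes stored in a training checkpoint against those of the inference model. The sketch below is illustrative only and is not taken from the VDT repository: the helper names `compare_state_dicts` and `is_consistent` and the parameter names in the example are hypothetical. With PyTorch, the shape dictionaries could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def compare_state_dicts(
    train_shapes: Dict[str, Shape],
    infer_shapes: Dict[str, Shape],
) -> Dict[str, List[str]]:
    """Report parameter-name and shape differences between two models."""
    train_keys = set(train_shapes)
    infer_keys = set(infer_shapes)
    return {
        # Parameters saved during training that the inference model lacks.
        "missing_in_inference": sorted(train_keys - infer_keys),
        # Parameters the inference model defines but the checkpoint lacks.
        "missing_in_training": sorted(infer_keys - train_keys),
        # Parameters present in both, but with different tensor shapes.
        "shape_mismatch": sorted(
            k for k in train_keys & infer_keys
            if train_shapes[k] != infer_shapes[k]
        ),
    }


def is_consistent(report: Dict[str, List[str]]) -> bool:
    """True when no differences of any kind were reported."""
    return not any(report.values())


if __name__ == "__main__":
    # Hypothetical parameter names and shapes, for illustration only.
    train = {"blocks.0.attn.weight": (64, 64), "head.bias": (16,)}
    infer = {"blocks.0.attn.weight": (64, 64), "head.bias": (16,)}
    print(is_consistent(compare_state_dicts(train, infer)))
```

An empty report means every parameter in the checkpoint has a same-shaped counterpart in the inference model. It does not prove that the forward passes are identical, since layers without parameters are not covered by this check.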