RERV / VDT

[ICLR 2024] The official implementation of the paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding.

Some confusion about the code. #2

Closed (jiangchaokang closed this issue 8 months ago)

jiangchaokang commented 9 months ago

Is the network architecture in the inference code you provide the same as the one used during training, or are there any inconsistencies between the training and inference code? Great job, and I look forward to your reply. Thanks in advance.

RERV commented 8 months ago

Thank you for your question!

The inference code we provide uses the same network structure as training. This holds for every training setup: unconditional training, conditional training, and our up-to-date unified mask training.
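
To make the point above concrete, here is a toy sketch of how one network can serve unconditional, conditional, and mask-based setups when only its input changes. This is not the repository's code. It assumes the input is composed per frame from a noisy frame and a clean conditioning frame according to a binary mask, and `compose_input` and `toy_network` are hypothetical names introduced only for illustration.

```python
from typing import List

Frame = List[float]  # one flattened frame latent (toy stand-in, assumption)


def compose_input(noisy: List[Frame], cond: List[Frame], mask: List[int]) -> List[Frame]:
    """Per frame: use the clean conditioning frame where mask == 1, else the noisy frame."""
    if not (len(noisy) == len(cond) == len(mask)):
        raise ValueError("noisy, cond and mask must have the same number of frames")
    return [c if m == 1 else n for n, c, m in zip(noisy, cond, mask)]


def toy_network(frames: List[Frame]) -> List[float]:
    """Hypothetical stand-in for the shared model: returns the mean of each frame."""
    return [sum(f) / len(f) for f in frames]


noisy = [[0.5, 0.5] for _ in range(4)]
clean = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]

# The same function (same "architecture") is called in every mode; only the mask differs.
unconditional = compose_input(noisy, clean, [0, 0, 0, 0])  # no frames given
prediction = compose_input(noisy, clean, [1, 1, 0, 0])     # first two frames given
interpolation = compose_input(noisy, clean, [1, 0, 0, 1])  # first and last frames given

for name, x in [("unconditional", unconditional),
                ("prediction", prediction),
                ("interpolation", interpolation)]:
    print(name, toy_network(x))
```

Under this assumption, switching between setups never changes the model definition, which is why checkpoints from training can be loaded directly at inference time.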