Open Sora-Lite opened 8 months ago
VideoCraft claims that "We perturb the temporal modules while fixing the spatial modules with the image dataset". However, an image dataset contains no image sequences, so how are the inputs for fine-tuning the temporal modules obtained?