Open g-jing opened 19 hours ago
Hello, since the trajectories are extracted using the optical flow model during training, you only need to prepare the videos and text captions following the CogVideo5B format. We will release some examples as soon as possible.
Very interesting idea. Why use the optical flow model to extract the trajectories?
I understand it is hard to release the full training data. Is it possible to release a training sample so we can follow it with our own custom data? Thanks!