Open Xiuyuan-Chen opened 1 year ago
How was the video checkpoint on Hugging Face obtained? Did you train it on image data only, or on video data as described in the paper? Thanks!