AILab-CVC / VideoCrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
https://ailab-cvc.github.io/videocrafter2/

Version using CLIP from transformers #93

Open HyelinNAM opened 2 months ago

HyelinNAM commented 2 months ago

Thank you for your great work!

I recently switched from OpenCLIP to `transformers.CLIPTextModel` because I needed some functions that are only available in the transformers implementation of CLIP. Everything still runs, but I've noticed a slight drop in the quality of the results. I'm using the same ViT backbone, so I'd like to know whether you have trained models with the transformers version of CLIP, and whether you know why this performance gap might occur.
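One difference worth checking is which layer the text features come from. This is an assumption, not something confirmed in this thread: if the OpenCLIP embedder used for training returns the penultimate transformer layer followed by the final LayerNorm, then `last_hidden_state` from `CLIPTextModel` is not equivalent, because it comes from the last layer. The sketch below uses a tiny randomly initialised model with illustrative sizes and placeholder token ids, so it runs without downloading weights. It only shows how to read both variants from the transformers API.

```python
import torch
from transformers import CLIPTextConfig, CLIPTextModel

# Tiny randomly initialised text encoder so the sketch runs offline.
# All sizes are illustrative; they are NOT the real ViT-H text tower.
torch.manual_seed(0)
config = CLIPTextConfig(
    vocab_size=1000,
    hidden_size=32,
    intermediate_size=64,
    num_hidden_layers=4,
    num_attention_heads=4,
    max_position_embeddings=77,
)
text_encoder = CLIPTextModel(config).eval()

# Placeholder token ids standing in for a tokenized prompt.
input_ids = torch.randint(0, config.vocab_size, (1, 77))

with torch.no_grad():
    out = text_encoder(input_ids=input_ids, output_hidden_states=True)
    final_ln = text_encoder.text_model.final_layer_norm

    # Default transformers output: last layer, then the final LayerNorm.
    last_layer = out.last_hidden_state

    # "Penultimate"-style features (assumed to match the training setup):
    # second-to-last layer output, then the same final LayerNorm.
    penultimate = final_ln(out.hidden_states[-2])

print(penultimate.shape, (penultimate - last_layer).abs().max().item())
```

With real weights, the same two lines apply to a loaded checkpoint. It may also be worth comparing the token ids produced by the two tokenizers for an identical prompt, since a different padding token would change the conditioning as well.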