Closed by FromA2Z 10 months ago
Thank you for your work; it is very helpful to the open source community. Regarding CLIP, do you apply the visual_projection layer when extracting image features, as in the CLIP source code?
Dear @FromA2Z, the function we used to encode our frames is the following: link. It follows the official OpenAI CLIP implementation.
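For readers comparing the two options: in the official OpenAI CLIP code, the visual backbone multiplies the pooled class-token feature by a learned projection matrix as its final step, so features obtained through `encode_image` are already in the joint image-text space. The snippet below is only a toy NumPy sketch of that final step, not the linked function. The weights are random, the dimensions assume ViT-B/32 (width 768, embedding size 512), and the `encode_image` and `use_projection` names here are hypothetical stand-ins.

```python
import numpy as np

# Toy stand-in for the tail of a CLIP ViT image encoder (not the authors' code).
# Assumed dimensions follow ViT-B/32: transformer width 768, joint embedding 512.
WIDTH, EMBED_DIM = 768, 512

rng = np.random.default_rng(0)
# Hypothetical projection weights; a real model would load trained values.
proj = rng.standard_normal((WIDTH, EMBED_DIM)) * WIDTH ** -0.5


def encode_image(pooled, use_projection=True):
    """Return projected features, or the raw pooled features if disabled."""
    return pooled @ proj if use_projection else pooled


pooled = rng.standard_normal((4, WIDTH))  # 4 frames, pre-projection features
projected = encode_image(pooled)
unprojected = encode_image(pooled, use_projection=False)

print(projected.shape)    # (4, 512)
print(unprojected.shape)  # (4, 768)
```

A quick way to tell which variant a feature file contains is therefore its last dimension: under these assumptions, 512 indicates projected features and 768 indicates pre-projection features.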