Closed Froskekongen closed 2 years ago
@Froskekongen Hi Erlend! I took up your suggestion here: https://github.com/lucidrains/x-clip/tree/0.2.4#custom-vision-self-supervised-learning-module. Let me know if that works for you.
How is your experience with Barlow? Does it work?
Thanks a lot!
BarlowTwins was just an example. Personally, I work with frameworks that are more akin to VICReg (https://arxiv.org/abs/2105.04906) and VIbCreg (https://arxiv.org/abs/2109.00783).
And I am investigating CLIP with modalities other than images and text, with less data.
@Froskekongen awesome! hope this feature is fruitful for you then! :)
In the following code, as part of `CLIP.__init__`, the visual self-supervised learning module is hardcoded. I would suggest changing this so that the visual SSL module is accepted as an argument when instantiating `CLIP`, allowing the same flexibility as for the image encoder and text encoder.
Example:
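A minimal sketch of what I mean (not the actual x-clip code; class and parameter names here are hypothetical): the SSL module is injected through the constructor, mirroring how `image_encoder` and `text_encoder` are passed in, so any framework (BarlowTwins, VICReg, VIbCreg, ...) can be plugged in without touching CLIP itself.

```python
class VICRegStub:
    """Stand-in for any visual SSL module (BarlowTwins, VICReg, VIbCreg, ...)."""
    def forward(self, images):
        # a real module would return an SSL loss tensor here
        return f"ssl-loss({images})"

class CLIP:
    def __init__(self, image_encoder=None, text_encoder=None, visual_ssl=None):
        self.image_encoder = image_encoder
        self.text_encoder = text_encoder
        # previously hardcoded inside __init__; now injected,
        # defaulting to None (visual SSL disabled)
        self.visual_ssl = visual_ssl

    def visual_ssl_loss(self, images):
        if self.visual_ssl is None:
            raise ValueError("no visual SSL module configured")
        return self.visual_ssl.forward(images)

clip = CLIP(visual_ssl=VICRegStub())
print(clip.visual_ssl_loss("batch"))
```

This dependency-injection pattern keeps CLIP agnostic to the choice of SSL framework, which is the flexibility asked for above.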