Closed 25icecreamflavors closed 2 years ago
Hi,
Unfortunately, due to limited resources, we were only able to train CLIP and CyCLIP with RN50. Our initial experiments suggested that CLIP trained with RN50 performs better than the other visual encoder variants at our scales.
Check out this issue for more discussion on this: https://github.com/mlfoundations/open_clip/issues/14
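As a rough illustration of why a checkpoint trained with one visual encoder cannot simply be loaded into another, the sketch below compares parameter names between a checkpoint and a target model. It is not taken from the CyCLIP codebase: the helper `compare_keys` and the key names are hypothetical, chosen only to resemble ResNet-style and ViT-style parameter names. The two lists it returns mirror the `missing_keys` and `unexpected_keys` that PyTorch's `load_state_dict(strict=False)` reports.

```python
def compare_keys(checkpoint_keys, model_keys):
    """Return parameter names that do not line up between a checkpoint and a model.

    "missing": names the model expects but the checkpoint lacks.
    "unexpected": names in the checkpoint that the model has no slot for.
    """
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "missing": sorted(model - ckpt),
        "unexpected": sorted(ckpt - model),
    }


# Illustrative (assumed) key names, not read from a real checkpoint.
rn50_checkpoint_keys = [
    "visual.layer1.0.conv1.weight",
    "visual.attnpool.q_proj.weight",
    "text_projection",
]
vit_model_keys = [
    "visual.class_embedding",
    "visual.transformer.resblocks.0.attn.in_proj_weight",
    "text_projection",
]

report = compare_keys(rn50_checkpoint_keys, vit_model_keys)
for kind, names in report.items():
    print(kind, names)
```

If either list is non-empty for the visual encoder, the checkpoint was trained for a different architecture, and the mismatched weights would be left at their random initialization or fail to load at all.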
Thank you very much for your answer.
Hello, in your example you use RN50 with your checkpoint weights. Is it possible to load, for example, ViT14 and use your checkpoints? Or are the checkpoints only for RN50?