Closed — Dingry closed this issue 1 year ago
Hey @Dingry, that experiment was only an ablation study in the paper, to test whether this kind of pretraining also works with smaller models. For the standard model size (Res16UNet34D) we didn't project the CLIP features down, but used them in their original space and dimensionality.
The PCA projection isn't in this codebase, but it's something you could easily do yourself with an sklearn function call.
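A minimal sketch of what that sklearn call could look like, assuming you have your CLIP text embeddings stacked into an array (the random array below is just placeholder data standing in for real 512-dim CLIP features):

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder for real CLIP text embeddings: 300 prompts, 512 dims each.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(300, 512)).astype(np.float32)

# Fit PCA and project the embeddings from 512 down to 96 dimensions,
# as described in the paper's ablation.
pca = PCA(n_components=96)
reduced = pca.fit_transform(embeddings)

print(reduced.shape)  # (300, 96)
```

Note that PCA needs at least as many samples as target components, so you'd fit it on the full set of text embeddings (one per category/prompt) rather than a single vector.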
Kind regards, David
Thanks for your quick reply!
Hi, thanks for your great work. I noticed in the paper that you downsample the CLIP text embedding dimension from 512 to 96 via PCA. May I ask where the corresponding code for the PCA projection is?