RozDavid / LanguageGroundedSemseg

Implementation for ECCV 2022 paper Language-Grounded Indoor 3D Semantic Segmentation in the Wild

About PCA for CLIP text embeddings #6

Closed Dingry closed 1 year ago

Dingry commented 1 year ago

Hi, thanks for your great work. I noticed in the paper that you downsample the dimension of the CLIP text embedding from 512 to 96 by PCA. May I ask where the corresponding code for the PCA projection is?

RozDavid commented 1 year ago

Hey @Dingry, that experiment was only an ablation study in the paper, to test whether this kind of pretraining also works with smaller models. In the standard model size (Res16UNet34D) we didn't project the CLIP features down, but used them in their original space and dimensions.

The PCA projection isn't included in this codebase, but it's something you could easily do yourself with an sklearn function call.

Kind regards, David
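For reference, here is a minimal sketch of what that sklearn call could look like: projecting 512-dimensional CLIP text embeddings down to 96 dimensions with `sklearn.decomposition.PCA`. The random embeddings and the variable names (`text_embeddings`, `n_classes`) are illustrative assumptions, not code from this repository.

```python
# Hypothetical sketch: reduce 512-dim CLIP text embeddings to 96 dims via PCA.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_classes = 200  # e.g. one embedding per category name (placeholder data)
text_embeddings = rng.standard_normal((n_classes, 512)).astype(np.float32)

# Fit PCA on the text embeddings and project them into the 96-dim subspace.
pca = PCA(n_components=96)
reduced = pca.fit_transform(text_embeddings)
print(reduced.shape)  # (200, 96)
```

The same fitted `pca` object can then transform any other embeddings in the 512-dim CLIP space via `pca.transform(...)`, which keeps the text and point-feature projections consistent.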

Dingry commented 1 year ago

Thanks for your quick reply!