facebookresearch / ijepa

Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture, first outlined in the CVPR paper "Self-supervised learning from images with a joint-embedding predictive architecture."

Linear probing #36

Closed by zankerx 1 year ago

zankerx commented 1 year ago

Hi, I have a question about linear probing: I haven't seen a CLS token. Is the classification performed directly on all of the outputs (which would mean a lot of parameters for a single layer) or on an average of the outputs? Thanks!

MidoAssran commented 1 year ago

Hi @zankerx, that's correct, there is no CLS token, we just average pool the outputs!
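
For readers who want to see what "average pool the outputs" could look like in practice, here is a minimal PyTorch sketch. It is not taken from the repository's evaluation code. It assumes the frozen encoder returns patch tokens of shape (batch, num_patches, embed_dim); the class name `AvgPoolLinearProbe`, the 256 x 768 token shape, and the 1000-class head are all hypothetical choices for illustration.

```python
import torch
import torch.nn as nn


class AvgPoolLinearProbe(nn.Module):
    """Hypothetical linear probe: mean over patch tokens, then one linear layer."""

    def __init__(self, embed_dim: int, num_classes: int):
        super().__init__()
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_patches, embed_dim), assumed encoder output layout
        pooled = tokens.mean(dim=1)  # average over the patch dimension -> (batch, embed_dim)
        return self.head(pooled)     # (batch, num_classes)


# Random stand-in for frozen encoder outputs (assumed shape, not real features)
tokens = torch.randn(4, 256, 768)
probe = AvgPoolLinearProbe(embed_dim=768, num_classes=1000)
logits = probe(tokens)
```

Averaging over the patch dimension keeps the head at embed_dim x num_classes weights, rather than the num_patches x embed_dim x num_classes that a layer over the flattened outputs would need.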

TranThanh96 commented 1 year ago

Hi @MidoAssran, I am trying ViT-B/14 for a classification task. The output of my transformer has shape n x 256 x 768, so which dimension should I average over? Should I reduce the output to n x 256 or n x 768?