Open shashankvkt opened 2 months ago
Hi, please refer to MAE for the code base and instructions. We follow everything in MAE, except that we replace the ViT-Encoder with TiTok-Encoder and use global pooling to obtain the embedding for linear probing.
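For anyone following along, here is a minimal sketch of the pooling step described above. It is not the authors' code. The tensor sizes, the function names `global_average_pool` and `linear_probe_logits`, and the random array standing in for frozen encoder outputs are all assumptions for illustration. Training the head would follow the MAE linear probing recipe.

```python
import numpy as np

def global_average_pool(token_features):
    """Average over the token axis: (batch, num_tokens, dim) -> (batch, dim)."""
    return token_features.mean(axis=1)

def linear_probe_logits(embeddings, weight, bias):
    """Linear head on frozen embeddings: (batch, dim) -> (batch, num_classes)."""
    return embeddings @ weight + bias

# Illustrative sizes only (assumed, not taken from the thread).
batch, num_tokens, dim, num_classes = 4, 32, 1024, 1000

rng = np.random.default_rng(0)
# Stand-in for the frozen encoder's per-token outputs.
token_features = rng.standard_normal((batch, num_tokens, dim))

embeddings = global_average_pool(token_features)

# Only the linear head would be trained; zero-initialized here.
weight = np.zeros((dim, num_classes))
bias = np.zeros(num_classes)
logits = linear_probe_logits(embeddings, weight, bias)
```

The encoder stays frozen throughout; only `weight` and `bias` would receive gradient updates.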
Thanks for your reply. Will try it out.
I assumed that you might be using the discrete tokens directly as input to the linear layer for linear probing. So basically, for linear probing, you do not use any discrete tokens?
I am also trying to reproduce the linear probe results of TiTok. I assume the 12-dimensional features, taken before quantization, were used for global average pooling, according to the reply:
we replace the ViT-Encoder with TiTok-Encoder
If this is true, it is interesting, because in my previous linear probe experiments on another tokenizer, the low-dimensional features gave results about 10x worse than the high-dimensional features taken before they are projected down to the codebook dimension.
Update: high-dimensional features are necessary for linear probing.
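To make the two candidate tap points concrete, here is a small sketch under stated assumptions: the sizes, the name `down_proj`, and the idea that the down-projection is a plain linear map are hypothetical and not confirmed for TiTok. Under that assumption, pooling the 12-dimensional features equals projecting the pooled high-dimensional features, so a linear probe on the low-dimensional embedding can see at most a 12-dimensional linear slice of the high-dimensional one.

```python
import numpy as np

# Assumed sizes for illustration only.
batch, num_tokens, hidden_dim, code_dim = 2, 32, 1024, 12

rng = np.random.default_rng(0)
# Stand-in for encoder token features before the down-projection.
hidden = rng.standard_normal((batch, num_tokens, hidden_dim))
# Hypothetical linear projection to the codebook dimension.
down_proj = rng.standard_normal((hidden_dim, code_dim)) / np.sqrt(hidden_dim)
pre_quant = hidden @ down_proj  # (batch, num_tokens, 12)

# Two candidate embeddings for the linear probe.
high_dim_embedding = hidden.mean(axis=1)    # (batch, 1024)
low_dim_embedding = pre_quant.mean(axis=1)  # (batch, 12)

# Mean pooling commutes with a linear projection.
same = np.allclose(low_dim_embedding, high_dim_embedding @ down_proj)
```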
Hello,
Thank you for this wonderful work. I was wondering if it's possible to share the code for linear probing to reproduce Figure 4(b). I was trying to reproduce it but did not get the desired results. Just to ensure I didn't make a mistake, could you please share the implementation?
Thanks