Closed Steven-jiaqi closed 10 months ago
Hi, you can check in the code how the CCT output is taken.
For the experiments with NetVLAD we do not apply any SeqPool; we simply feed the tokens to the NetVLAD layer.
As for which layer of the CCT encoder is used, we experimented with it and found that the best-performing option was to truncate the encoder at the 8th layer (CLI arg --trunc_te 8).
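To make the data flow concrete, here is a minimal, framework-free sketch of the idea: keep only the first 8 encoder layers, skip any sequence pooling, and aggregate the resulting tokens with a NetVLAD-style layer. This is not the repository's code. The names `truncate_encoder`, `run_encoder`, and `netvlad` are hypothetical, the encoder layers are toy stand-ins, the layer count of 12 is arbitrary, and the soft-assignment uses a plain dot product with the centroids rather than a learned convolution.

```python
import math
import random


def truncate_encoder(layers, trunc_te):
    """Keep only the first `trunc_te` encoder layers (hypothetical helper)."""
    return layers[:trunc_te]


def run_encoder(tokens, layers):
    """Pass the token sequence through each layer; no pooling at the end."""
    for layer in layers:
        tokens = layer(tokens)
    return tokens


def l2_normalize(vec, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def netvlad(tokens, centroids, alpha=1.0):
    """Aggregate N tokens of dimension D into one K*D descriptor."""
    k, d = len(centroids), len(centroids[0])
    vlad = [[0.0] * d for _ in range(k)]
    for tok in tokens:
        tok = l2_normalize(tok)
        # Soft-assign the token to every cluster (simplified: dot product).
        scores = [alpha * sum(t * c for t, c in zip(tok, cen)) for cen in centroids]
        weights = softmax(scores)
        # Accumulate weighted residuals between the token and each centroid.
        for j, cen in enumerate(centroids):
            for i in range(d):
                vlad[j][i] += weights[j] * (tok[i] - cen[i])
    # Intra-normalize each cluster, flatten, then L2-normalize the whole vector.
    flat = [x for row in vlad for x in l2_normalize(row)]
    return l2_normalize(flat)


def toy_layer(tokens):
    """Stand-in for a transformer encoder layer (assumption, not the real one)."""
    return [[math.tanh(x) for x in tok] for tok in tokens]


random.seed(0)
num_tokens, dim, num_clusters = 16, 4, 3
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_tokens)]
centroids = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_clusters)]

full_encoder = [toy_layer for _ in range(12)]  # arbitrary depth for illustration
encoder = truncate_encoder(full_encoder, trunc_te=8)

out_tokens = run_encoder(tokens, encoder)    # still one vector per token
descriptor = netvlad(out_tokens, centroids)  # single global descriptor
```

The point of the sketch is that the NetVLAD layer receives the full token sequence from the truncated encoder, so the descriptor length is the number of clusters times the token dimension, independent of how many tokens there are.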
Thanks for your great work! I have some questions about the network structure of CCT-NetVLAD. I'm not sure whether there is a SeqPool layer in its structure. Which layer of the CCT is connected to the NetVLAD layer? I'm sorry to bother you. Looking forward to your reply!