I have extracted the image pre-training features using CLIP, but I don't know how to choose which layer of MERT's features is more suitable for the
Hi, I think the layer selection really depends on the task you are working on. If you are doing CLIP-style training, I would suggest using the last layer's output as a general-purpose choice.
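To make "use the last layer" concrete, here is a minimal sketch. It assumes you already have the per-layer hidden states as a nested sequence of shape layers x time x dim (for example, collected from a model that returns all hidden states). The helper name `pool_layer`, the toy numbers, and the mean-pooling over time are my own assumptions for illustration, not something MERT prescribes; in practice you would do the same thing on tensors.

```python
from statistics import fmean


def pool_layer(hidden_states, layer=-1):
    """Mean-pool one layer's frame features over time (hypothetical helper).

    hidden_states: sequence of layers; each layer is a list of frames,
    and each frame is a list of floats (layers x time x dim).
    layer=-1 selects the last layer, the general-purpose default.
    """
    frames = hidden_states[layer]
    # zip(*frames) groups values by feature dimension across all frames.
    return [fmean(dim_values) for dim_values in zip(*frames)]


# Toy stand-in for extracted features: 3 layers, 2 frames, 2 dims.
# These numbers are made up and are not real MERT outputs.
toy_hidden_states = [
    [[0.0, 0.0], [2.0, 4.0]],
    [[1.0, 1.0], [3.0, 5.0]],
    [[2.0, 2.0], [4.0, 6.0]],
]

audio_embedding = pool_layer(toy_hidden_states)           # last layer
first_layer_embedding = pool_layer(toy_hidden_states, 0)  # for comparison
```

Keeping the layer index as a parameter makes it easy to compare other layers later if the last one turns out not to suit your task.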