Hi,
I'm trying to train PatchCore with transformer backbones (vit_swin_base and ViT_r50). First I need to work out which layers to extract features from. I tried blocks.20 for ViT_r50 and layers.3 for vit_swin_base, but both fail with size errors: one reports a tensor size mismatch, and the other expects a 4D tensor.
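My rough understanding of the 4D error, which may well be wrong: transformer blocks return tokens shaped (batch, tokens, channels), while the feature extraction seems to expect maps shaped (batch, channels, height, width). The sketch below is only shape arithmetic in plain Python. `tokens_to_map_shape` and `num_prefix_tokens` are names I made up for illustration, and the (1, 197, 768) shape assumes a 224-pixel input, 16-pixel patches and one class token, not the exact models above.

```python
import math


def tokens_to_map_shape(token_shape, num_prefix_tokens=1):
    """Return the (B, C, H, W) shape a (B, N, C) token tensor would reshape to.

    Hypothetical helper: it assumes a square patch grid and that the first
    `num_prefix_tokens` tokens (e.g. a class token) are not patches.
    """
    batch, num_tokens, channels = token_shape
    num_patches = num_tokens - num_prefix_tokens
    side = math.isqrt(num_patches)
    if side * side != num_patches:
        # Not a square grid, so there is no (H, W) layout under this assumption.
        raise ValueError(f"{num_patches} patch tokens do not form a square grid")
    return (batch, channels, side, side)


# Assumed example: 196 patch tokens + 1 class token, 768 channels.
print(tokens_to_map_shape((1, 197, 768)))  # (1, 768, 14, 14)
```

If that is the right picture, the transformer outputs would need a reshape like this before they can be used, but I don't know where that should happen.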
Could you please provide an example of how to use these backbones, and say which layers to extract features from for the different backbone types?
Thanks!