I'm doing segmentation on very large images, which results in high FLOPs even with your B0 variant, so I've decided to scale the backbone down to something smaller. IMO SegFormer can go smaller for some downstream tasks.
However, I believe your ImageNet pretraining doesn't follow a standard recipe like the one used for ResNet. Could you share the hyperparameter details for the MiT pretraining?