Hi, I'm wondering whether it is recommended to keep stochastic depth enabled on the backbone when training a detection/instance segmentation model. For example, with a transformer backbone (224x224 input, 28M parameters, 4.2G FLOPs) plus Cascade R-CNN, should the backbone still use stochastic depth while training the detection pipeline?
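To be concrete about what I mean by stochastic depth on the backbone, here is a minimal sketch in plain Python, as I understand the technique: during training each residual branch is skipped at random, and kept branches are rescaled so the expected output is unchanged. The function names are hypothetical, the linearly increasing per-block schedule is just one common choice, and the numbers (12 blocks, max rate 0.2) are placeholders, not a recommendation or any library's API.

```python
import random


def linear_drop_rates(num_blocks, max_rate):
    """Per-block drop probabilities rising linearly from 0 to max_rate.

    Assumption: a linear schedule over block depth; other schedules exist.
    """
    if num_blocks == 1:
        return [0.0]
    return [max_rate * i / (num_blocks - 1) for i in range(num_blocks)]


def residual_with_drop_path(x, branch, drop_prob, training, rng=random):
    """Compute x + branch(x), randomly skipping the branch while training.

    x is a list of floats standing in for one sample's features.
    """
    out = branch(x)
    if not training or drop_prob == 0.0:
        # Inference (or rate 0): a plain residual connection.
        return [xi + oi for xi, oi in zip(x, out)]
    keep_prob = 1.0 - drop_prob
    if rng.random() >= keep_prob:
        # Branch dropped for this sample: only the identity path remains.
        return list(x)
    # Branch kept: rescale so the expected value matches inference.
    return [xi + oi / keep_prob for xi, oi in zip(x, out)]


# Placeholder values, purely for illustration.
rates = linear_drop_rates(num_blocks=12, max_rate=0.2)
```

The question is then whether `max_rate` should stay nonzero for the backbone blocks once the backbone sits inside the detector, or be set to 0 during detection training.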
Thanks a lot!