Closed David-19940718 closed 10 months ago
Please double-check that every parameter was correctly loaded into your segmentation model, since we release the ResNet50 weights in timm's style, not torchvision's. And I have two suggestions for you:
- You may need to adjust the learning rate or the drop path rate.
- You could add a trick, layer-wise learning rate decay, to your fine-tuning codebase. Details: refer to this to implement the trick. For example, Mask R-CNN uses ResNet50 as 4 stages, and when fine-tuning we use $r \times$ the learning rate for the last stage, $r^2 \times$ for the second-to-last, $r^3 \times$ for the third-to-last, and $r^4 \times$ for the first, where $0 \le r \le 1$. And we use the $1 \times$ learning rate on those not-pretrained parameters, like the RoI heads.
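On the first point, a quick way to catch silently mis-loaded parameters is to compare the checkpoint's keys against the model's own state-dict keys before (or instead of relying on) a strict `load_state_dict`. A minimal sketch in plain Python; the key names below are illustrative only, not the actual timm or torchvision names, so inspect your real checkpoint to see where its state dict lives:

```python
def check_state_dict_keys(model_keys, ckpt_keys):
    """Mimic load_state_dict(strict=False)'s report: which model parameters
    the checkpoint is missing, and which checkpoint entries the model does
    not expect (e.g. keys named in another library's style)."""
    missing = sorted(set(model_keys) - set(ckpt_keys))
    unexpected = sorted(set(ckpt_keys) - set(model_keys))
    return missing, unexpected

# Illustrative key names only -- load your real checkpoint with torch.load
# and pull the keys from wherever it stores the state dict.
model_keys = ["conv1.weight", "layer1.0.conv1.weight", "fc.weight"]
ckpt_keys = ["conv1.weight", "layer1.0.conv1.weight", "head.weight"]

missing, unexpected = check_state_dict_keys(model_keys, ckpt_keys)
print("missing:", missing)        # parameters that would stay at random init
print("unexpected:", unexpected)  # checkpoint entries that would never load
```

If either list is non-empty, the mismatched parameters are training from scratch, which alone can explain a large accuracy gap.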
Did this work out for you in the end? On a detection task I also found SparK's ResNet weights performing worse than the native weights. Did you find any way to solve this problem?
Hello, I tried applying SparK's pre-trained ResNet50 weights to a medical image segmentation task. While this does improve over training without any pre-trained weights, the accuracy is unfortunately much lower than with the pre-trained weights provided by native PyTorch, under exactly the same hyperparameters (Dice: 0.8044 vs. 0.8804). Could you please help me understand the reason for this gap? Or is it that downstream tasks are sensitive to certain hyperparameters that require careful fine-tuning?