wuhuikai / FastFCN

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.

Why can I still train the model after removing JPU? #47

E18301194 closed this issue 5 years ago

E18301194 commented 5 years ago

Why does the code still run without error after I delete the JPU module (in /FastFCN/encoding/nn/customize.py)? I can still train the model. This is my command (I did enable the JPU module):

CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --dataset pcontext --model encnet --jpu --aux --se-loss --backbone resnet101 --checkname encnet_res101_pcontext

wuhuikai commented 5 years ago

Have you reinstalled FastFCN with `python setup.py install`?

E18301194 commented 5 years ago

Thank you very much. After I reinstalled with `python setup.py install`, FastFCN reported an error. So what does `python setup.py install` actually do?

wuhuikai commented 5 years ago

It does something similar to `pip install`: it copies all the source code into the Python library directory. Our scripts in experiment can then import the corresponding library, such as `encoding`. As a result, your modifications in the `encoding` folder have no effect until you reinstall. If you don't want to reinstall every time, just run `python setup.py develop`, which makes Python import the package directly from your source folder.
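To check which copy of a package Python actually picks up, a small helper like the following can be useful. This is only a minimal sketch using the standard library: the helper names `module_location` and `is_imported_from` are made up for illustration, and `/path/to/FastFCN` is a placeholder for your own checkout.

```python
import importlib.util
from pathlib import Path


def module_location(module_name):
    """Return the file a module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def is_imported_from(module_name, source_dir):
    """Return True if `module_name` resolves to a file inside `source_dir`."""
    location = module_location(module_name)
    if location is None:
        return False
    return Path(source_dir).resolve() in location.parents


if __name__ == "__main__":
    # "encoding" is the package discussed in this thread; the path below is
    # a hypothetical placeholder for the local FastFCN checkout.
    print(module_location("encoding"))
    print(is_imported_from("encoding", "/path/to/FastFCN"))
```

If the printed location points into the Python library directory rather than your checkout, edits made in the checkout will not be seen until the package is reinstalled or installed in develop mode.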

E18301194 commented 5 years ago

Thank you very much for your positive response and for providing the link too. Now I have a question. Following the tips in the paper, I ran the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ade20k --model encnet --jpu --aux --se-loss --backbone resnet101 --checkname encnet_res101_ade20k_train

But I can't reach the paper's accuracy on ADE20K (val and test). In the paper, the val result with ResNet-101 is pixAcc 80.99 and mIoU 44.34. Can you tell me how to set the hyperparameters to reproduce the paper's accuracy?

wuhuikai commented 5 years ago

Following the instructions in README.md should lead you to the performance in our paper.

E18301194 commented 5 years ago

```python
import torch
from torch.nn import BCELoss, CrossEntropyLoss


class SegmentationLosses(CrossEntropyLoss):
    """2D Cross Entropy Loss with Auxiliary Loss"""
    def __init__(self, se_loss=False, se_weight=0.2, nclass=-1,
                 aux=False, aux_weight=0.4, weight=None,
                 size_average=True, ignore_index=-1):
        super(SegmentationLosses, self).__init__(weight, size_average, ignore_index)
        self.se_loss = se_loss
        self.aux = aux
        self.nclass = nclass
        self.se_weight = se_weight
        self.aux_weight = aux_weight
        self.bceloss = BCELoss(weight, size_average)

    def forward(self, *inputs):
        # _get_batch_label_vector is a method of this class that is not
        # included in this excerpt.
        if not self.se_loss and not self.aux:
            # Plain cross-entropy on a single prediction.
            return super(SegmentationLosses, self).forward(*inputs)
        elif not self.se_loss:
            # Main prediction plus auxiliary prediction.
            pred1, pred2, target = tuple(inputs)
            loss1 = super(SegmentationLosses, self).forward(pred1, target)
            loss2 = super(SegmentationLosses, self).forward(pred2, target)
            return loss1 + self.aux_weight * loss2
        elif not self.aux:
            # Main prediction plus SE prediction.
            pred, se_pred, target = tuple(inputs)
            se_target = self._get_batch_label_vector(target, nclass=self.nclass).type_as(pred)
            loss1 = super(SegmentationLosses, self).forward(pred, target)
            loss2 = self.bceloss(torch.sigmoid(se_pred), se_target)
            return loss1 + self.se_weight * loss2
        else:
            # Main, SE, and auxiliary predictions together.
            pred1, se_pred, pred2, target = tuple(inputs)
            se_target = self._get_batch_label_vector(target, nclass=self.nclass).type_as(pred1)
            loss1 = super(SegmentationLosses, self).forward(pred1, target)
            loss2 = super(SegmentationLosses, self).forward(pred2, target)
            loss3 = self.bceloss(torch.sigmoid(se_pred), se_target)
            return loss1 + self.aux_weight * loss2 + self.se_weight * loss3
```

Thank you for your reply. Regarding the SE loss here, I don't understand what pred1, pred2, and se_pred are, and I couldn't find where they are generated. I read the paper carefully but still did not understand this. Can you tell me what they mean?

wuhuikai commented 5 years ago

Please read the EncNet paper to understand what SE loss is. See here for all the outputs.
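Judging only from how `forward` above unpacks its inputs, `pred1` and `pred2` are both compared with the segmentation target via cross-entropy, while `se_pred` is compared via BCE with a vector derived from the target. The sketch below illustrates that idea in plain Python under the assumption that this vector marks which classes are present in an image. It is not the repository's `_get_batch_label_vector`; the helper names `class_presence_vector` and `combine_losses` and the toy label map are hypothetical.

```python
def class_presence_vector(label_map, nclass, ignore_index=-1):
    """Return a 0/1 list whose entry c is 1.0 if class c appears in label_map."""
    present = [0.0] * nclass
    for row in label_map:
        for label in row:
            # Skip ignored pixels and labels outside the valid class range.
            if label != ignore_index and 0 <= label < nclass:
                present[label] = 1.0
    return present


def combine_losses(main_loss, aux_loss, se_loss, aux_weight=0.4, se_weight=0.2):
    """Weighted sum mirroring the final branch of forward() above."""
    return main_loss + aux_weight * aux_loss + se_weight * se_loss


# Toy 2x3 label map with classes 0 and 2 present; -1 marks ignored pixels.
toy_target = [[0, 0, 2],
              [2, -1, 0]]
print(class_presence_vector(toy_target, nclass=4))  # [1.0, 0.0, 1.0, 0.0]
print(combine_losses(1.0, 0.5, 0.25))
```

The default weights 0.4 and 0.2 are taken from the constructor signature quoted above; the loss values passed to `combine_losses` are arbitrary numbers for illustration.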

E18301194 commented 4 years ago

Thanks for your code. I want to add a module of mine to your code (FastFCN), but my module is written for PyTorch 0.4.0. I don't know how to adapt FastFCN to PyTorch 0.4.0. Thank you very much for your reply.

wuhuikai commented 4 years ago

You can directly plug your module (0.4.0) into FastFCN (1.x) without any modification. PyTorch 1.x can run 0.4.0 code with few changes.

E18301194 commented 4 years ago

Thank you very much for your reply. What I want to change is the convolution kernel, so I need to use torch.utils.ffi. Our extension for PyTorch 0.4.0 is written in C, whereas extensions for PyTorch 1.0 are written in C++, so the two versions are not compatible. My only other option would be to port my module to 1.0, but I'm not familiar with C++, so I would rather port your code to 0.4.0. Could you help me, please? Looking forward to your reply.

wuhuikai commented 4 years ago

If you only want to use the JPU module, I think no modification is needed. However, if you want to use sync_bn, adapting it to 0.4.0 would be a huge amount of work.

E18301194 commented 4 years ago

What is sync_bn, please? How is it different from regular BN, and why do you use it?

wuhuikai commented 4 years ago

BN computes running_mean and running_std per GPU, while sync_bn computes them across all GPUs. Thus, sync_bn gives more stable statistics.
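A toy illustration of the difference, as a minimal sketch in plain Python rather than actual multi-GPU code: each inner list stands for the activations of one channel on one device, and the function names and numbers are invented for the example. It uses mean and variance as the batch statistics.

```python
from statistics import fmean, pvariance


def per_device_stats(shards):
    """Mean and population variance computed separately on each device's shard."""
    return [(fmean(shard), pvariance(shard)) for shard in shards]


def synchronized_stats(shards):
    """Mean and population variance over all shards pooled together."""
    pooled = [value for shard in shards for value in shard]
    return fmean(pooled), pvariance(pooled)


# Hypothetical activations for one channel, split across two devices.
shards = [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]
print(per_device_stats(shards))
print(synchronized_stats(shards))
```

With small per-device batches, the per-device statistics can differ a lot from each other, whereas the pooled statistics are computed from every sample in the global batch.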

E18301194 commented 4 years ago

Hi, I don't see the labels in the pcontext dataset. How are the pcontext labels loaded? I see a trainval_merged.json file that I don't quite understand.

wuhuikai commented 4 years ago

See here.

E18301194 commented 4 years ago

Your work has helped me immensely. If I want to increase speed, how should I go about it? My current idea is to replace ResNet with ShuffleNetV2. Do you have any other suggestions? I hope to hear from you.

wuhuikai commented 4 years ago

One simple method is to prune the model.
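The reply does not say which pruning method is meant. As one common example, magnitude pruning zeroes the weights with the smallest absolute values. The sketch below shows the idea on a flat list of weights; the function name `magnitude_prune` and the numbers are illustrative assumptions, not part of FastFCN.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest absolute value."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by |weight|, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


print(magnitude_prune([0.5, -0.01, 0.2, -0.9, 0.03, 0.0], sparsity=0.5))
# [0.5, 0.0, 0.2, -0.9, 0.0, 0.0]
```

Note that zeroing individual weights does not by itself make inference faster on dense hardware; in practice, speedups usually come from removing whole channels or filters, followed by fine-tuning.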