isht7 / pytorch-deeplab-resnet

DeepLab resnet v2 model in pytorch
MIT License
602 stars 118 forks

Fine tuning on a smaller GPU #7

Closed (MyVanitar closed 7 years ago)

MyVanitar commented 7 years ago

Hi,

I have a dataset with 2 classes in VOC format. As I understand it, you have made fine-tuning possible through some flags. Correct me if I'm wrong.

Besides, my GPU is a GTX 1060 with 6 GB of memory. Does the stated memory consumption apply to the full original 21-class VOC? I mean, can I train the model on this small dataset?

isht7 commented 7 years ago

I do not think it would be possible to train pytorch-deeplab-resnet on a GPU with 6 GB of memory. Try changing this line to scale = 1.0, which effectively disables the scale augmentation. Doing this reduces the memory requirement, but I think it is highly unlikely that it goes below 6 GB. If you set scale even lower than 1.0, the net will occupy less memory, but performance will be affected. Note that if you do this, you should use the same scale when testing later as well.
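As a rough illustration of why the scale matters for memory, the sketch below computes how many pixels the network sees at a few scales relative to scale = 1.0. It assumes that activation memory in a fully convolutional net grows roughly in proportion to the number of input pixels, which is only an approximation. The crop size of 321 and the list of scales are assumptions chosen for illustration, not values taken from this repository, and `scaled_size` and `relative_pixels` are hypothetical helper names.

```python
def scaled_size(height, width, scale):
    """Return the (height, width) of an image resized by `scale`, truncated to ints."""
    return int(height * scale), int(width * scale)


def relative_pixels(height, width, scale):
    """Pixel count after scaling, as a fraction of the pixel count at scale 1.0."""
    new_h, new_w = scaled_size(height, width, scale)
    return (new_h * new_w) / (height * width)


# Assumed crop size, for illustration only (not read from the repository).
CROP = 321

# Assumed example scales; 1.0 corresponds to scale augmentation being disabled.
for scale in (0.5, 0.75, 1.0, 1.3):
    h, w = scaled_size(CROP, CROP, scale)
    ratio = relative_pixels(CROP, CROP, scale)
    print(f"scale={scale:<4} input={h}x{w}  pixels relative to 1.0: {ratio:.2f}")
```

Under this approximation, halving the scale cuts the per-image pixel count to about a quarter, while any scale above 1.0 sets the peak memory that the GPU must be able to hold.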

MyVanitar commented 7 years ago

I have to test it. Maybe lowering the image resolution (width and height) and the batch size could reduce the memory consumption, since I am not using the original VOC.

Is fine-tuning with a different number of classes and a different dataset (but in VOC style) possible?

isht7 commented 7 years ago

Okay, reducing the image size might help you. And yes, it is possible; instructions are in the readme (look at the flag --NoLabels).

MyVanitar commented 7 years ago

Thank you. Another question: DeepLabV2 has achieved a mean IoU near 79% on the VOC dataset. Why can we not achieve this?

isht7 commented 7 years ago

The 79% accuracy is on the test set, where the train + val set is used for training. Here, we train using only the train set and use the val set for evaluation. For this setting, they report a best accuracy of 76.35%, as seen in table 4 of the paper. But I am getting about 74.39%. I think this might be because I am merging the boundary label with background in the ground truth images during testing. If you are able to get the reported 76.35% yourself using train_iter_20000.caffemodel, please let me know.
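To make the effect of the boundary label concrete, here is a minimal sketch of mean IoU computed two ways: with boundary pixels (value 255 in the VOC PNG labels) merged into background, and with them skipped entirely. It is not the evaluation script of this repository; `confusion_matrix`, `mean_iou`, and the `boundary` parameter are hypothetical names, and the tiny label lists are made-up toy data.

```python
BOUNDARY_LABEL = 255  # value of boundary pixels in the VOC PNG labels
BACKGROUND = 0


def confusion_matrix(gt, pred, num_classes, boundary="ignore"):
    """Build a num_classes x num_classes confusion matrix from flat label lists.

    boundary="ignore": boundary pixels are skipped.
    boundary="background": boundary pixels are counted as background.
    """
    hist = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == BOUNDARY_LABEL:
            if boundary == "ignore":
                continue
            g = BACKGROUND
        hist[g][p] += 1
    return hist


def mean_iou(hist):
    """Average of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    ious = []
    for c in range(len(hist)):
        tp = hist[c][c]
        gt_total = sum(hist[c])                    # TP + FN
        pred_total = sum(row[c] for row in hist)   # TP + FP
        union = gt_total + pred_total - tp
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy data: the prediction spills the object (class 1) over the boundary pixels.
gt = [0, 0, 1, 1, 255, 255]
pred = [0, 0, 1, 1, 1, 1]

print(mean_iou(confusion_matrix(gt, pred, 2, boundary="ignore")))      # 1.0
print(mean_iou(confusion_matrix(gt, pred, 2, boundary="background")))  # 0.5
```

In this toy case, counting boundary pixels as background penalizes predictions that extend the object across the boundary, so the merged score is lower than the score with those pixels ignored.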

MyVanitar commented 7 years ago

Actually, I had tested DeepLabV2 using a TensorFlow repo that claimed the same accuracy as in the VOC competition (79.7%), but in practice I had many doubts about this number, because my heavy FCN-8s was segmenting better!

It also had problems detecting small objects, which should not happen for a model with such good results. That is my only experience with DeepLabV2, and I could not test its original Caffe version.

MyVanitar commented 7 years ago

Also, I agree with you about ignoring label-255, which is shown in white in the PNG labels. On every model I have trained on VOC, ignoring this label has led to a significant increase in accuracy.

isht7 commented 7 years ago

Okay, thank you for your feedback. I will consider ignoring label-255 in the future.

isht7 commented 7 years ago

Also, if you just want to train on a 6 GB GPU, try using only one scale of deeplab-resnet (right now 3 scales are used, as you can see here). I think that may fit in 6 GB of memory. Performance will be lower, but you can set it up with minimal changes to the code. Also remember to disable the scale augmentation.

MyVanitar commented 7 years ago

Thank you very much gentleman.

isht7 commented 7 years ago

Thank you! Feel free to ping me if you need help modifying the code. You might want to have a look at the last comment on #5; it might be relevant to you. I will change the eval scripts as suggested by the commenter soon.

MyVanitar commented 7 years ago

Thank you.

This is the TensorFlow repo I told you about:

https://github.com/DrSleep/tensorflow-deeplab-resnet

He claims even 80% accuracy, but I was really not happy with the results, and if you look at the segmented cats in the examples, the borders should be much sharper than that. Anyway, he has ignored label-255. You are far more experienced than I am, so I have no suggestions, but I will contribute and test if I can.