milesial / Pytorch-UNet

PyTorch implementation of the U-Net for image semantic segmentation with high quality images
GNU General Public License v3.0
9.1k stars 2.48k forks

Using U-Net for Real-time semantic segmentation #283

Open k-nayak opened 3 years ago

k-nayak commented 3 years ago

Hello everyone,

I would like to first thank milesial for such amazing code and for the active responses to all the issues. I am using this code to implement real-time segmentation. I have 10 images, each of size 1280x960, for which I have created binary masks. I used albumentations for augmentation and created a dataset of 120 images of size 256x256.
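For readers following along, the important detail in this kind of setup is that every random transform must be applied identically to the image and to its mask. The sketch below illustrates that idea with plain NumPy only; it is not the albumentations API, and the function name `random_crop_flip` and the choice of crop plus flips are assumptions made for illustration.

```python
import numpy as np

def random_crop_flip(image, mask, size=256, rng=None):
    """Cut one random size x size patch and apply the same flips to image and mask.

    image: array of shape (H, W, C); mask: array of shape (H, W).
    Hypothetical helper for illustration, not part of albumentations or Pytorch-UNet.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = mask.shape[:2]
    # Pick one crop window and reuse it for both arrays so they stay aligned.
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    img = image[top:top + size, left:left + size]
    msk = mask[top:top + size, left:left + size]
    # Draw each flip decision once and apply it to both arrays.
    if rng.random() < 0.5:  # horizontal flip
        img, msk = img[:, ::-1], msk[:, ::-1]
    if rng.random() < 0.5:  # vertical flip
        img, msk = img[::-1], msk[::-1]
    return np.ascontiguousarray(img), np.ascontiguousarray(msk)

# Example: 12 patches from one synthetic 1280x960 image (960 rows, 1280 columns).
rng = np.random.default_rng(0)
mask = (rng.random((960, 1280)) > 0.5).astype(np.uint8)
image = np.stack([mask * 255] * 3, axis=-1).astype(np.uint8)
patches = [random_crop_flip(image, mask, rng=rng) for _ in range(12)]
```

Note that patches cut from the same 10 source images overlap heavily, so they add less new information than their count suggests.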

I have managed to reach a Dice score of 0.6 to 0.7 but no higher, and a loss of 0.4 that is not decreasing further. I have tried 10 and 50 epochs so far, and nothing seems to have helped. Settings: lr=0.0001, batch size=1.
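For reference, the Dice score for a pair of binary masks is 2|A ∩ B| / (|A| + |B|). Below is a minimal NumPy sketch of that formula. The function name `dice_score` and the smoothing term `eps` are assumptions, and the repository's own implementation may differ in its details.

```python
import numpy as np

def dice_score(pred, target, eps=1e-6):
    """Dice coefficient between two binary masks of the same shape.

    eps avoids division by zero when both masks are empty
    (in that case the score evaluates to 1.0).
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
print(dice_score(pred, target))  # about 0.667: one shared pixel, three foreground pixels in total
```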

Will using augmentation to enlarge the dataset help, or should I try to increase the number of original images?

Another problem I am facing is lag while streaming from the webcam: the output is not very smooth, and it is also not very accurate given the current Dice score and loss. If anyone has had the opportunity to work with real-time segmentation, I would really appreciate some advice on what could be causing this issue and on how I can improve the model accuracy.
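One way to narrow down where such lag comes from is to time each stage of the loop separately (frame capture, preprocessing, model forward pass, display). The sketch below uses only the standard library. The class `StageTimer` and the two dummy stage functions are hypothetical placeholders, not code from this repository.

```python
import time
from collections import defaultdict

class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def time(self, name, fn, *args, **kwargs):
        # Run fn, record how long it took under the given stage name.
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.totals[name] += time.perf_counter() - start
        self.counts[name] += 1
        return result

    def report(self):
        # Average milliseconds per call for each stage.
        return {n: 1000.0 * self.totals[n] / self.counts[n] for n in self.totals}

# Dummy stages standing in for webcam capture and the model forward pass.
def grab_frame():
    return [0] * 1000

def predict(frame):
    return [v + 1 for v in frame]

timer = StageTimer()
for _ in range(20):
    frame = timer.time("capture", grab_frame)
    mask = timer.time("inference", predict, frame)
print(timer.report())
```

When the model runs on a GPU, CUDA calls are asynchronous, so calling `torch.cuda.synchronize()` before reading the clock is needed for the inference timing to be meaningful.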

Thank you in advance!

milesial commented 3 years ago

Hi,

I think that 10 images is very low. Even with data augmentation, you don't have much information. I would recommend collecting more images (in the hundreds) and applying data augmentation on those so that you get a few thousand images.

About the lag, what GPU are you using? If you have a recent GPU, using mixed precision can help.

You may be interested in tools that speed up inference, such as TensorRT. You could also try inference mode.
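As a rough sketch of the two PyTorch suggestions above, the code below runs a forward pass under `torch.inference_mode()` and, when a CUDA GPU is present, under `torch.autocast` for mixed precision. The tiny `nn.Sequential` model is only a hypothetical stand-in for a trained U-Net, and `predict_mask` and its 0.5 threshold are assumptions for illustration.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for a trained U-Net with one output channel.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
).to(device).eval()

def predict_mask(frame, threshold=0.5):
    """frame: float tensor of shape (3, H, W) in [0, 1]; returns a bool mask (H, W) on the CPU."""
    batch = frame.unsqueeze(0).to(device)
    # inference_mode disables autograd tracking; autocast is only enabled on CUDA here.
    with torch.inference_mode():
        with torch.autocast(device_type=device.type, enabled=(device.type == "cuda")):
            logits = model(batch)
        probs = torch.sigmoid(logits.float())
        return (probs[0, 0] > threshold).cpu()

mask = predict_mask(torch.rand(3, 256, 256))
print(mask.shape, mask.dtype)
```

Whether mixed precision actually reduces latency depends on the GPU and the model, so it is worth measuring before and after.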

k-nayak commented 3 years ago

Hello milesial,

Thank you for such a quick response. I am using an Nvidia Quadro RTX 3000 6GB on my laptop. My dataset consists of water droplets, and I am making the binary masks manually because it is difficult to find datasets that match my requirements. I will continue to create more masks, thanks for the advice. I will also try using inference mode in my real-time segmentation code and see how it goes.

I was thinking that U-Net does not need a large dataset compared to Mask R-CNN and other models, hence the small dataset.

k-nayak commented 3 years ago

In addition to the previous question, I would like to know if anyone has suggestions for creating accurate binary masks. I have been using "LabelMe" to create binary masks for detecting water droplets in frames taken from a video, which is rather time consuming. Is there any other method or application that can do it faster? I have also tried thresholding, but it tends to miss some of the droplets, which is not good. I hope to get some input from anyone who has come across a similar situation.
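One possible middle ground between the two approaches is to let a threshold produce a draft mask and then correct the missed droplets by hand in the annotation tool. The NumPy sketch below picks the threshold with Otsu's method. It assumes a uint8 grayscale frame in which droplets are brighter than the background, and the function names are hypothetical.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold (0..255) for a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                 # weight of the class with values <= t
    w1 = 1.0 - w0                        # weight of the class with values > t
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    denom = w0 * w1
    # Between-class variance for every candidate threshold t.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (total_mean * w0 - cum_mean) ** 2 / denom
    sigma_b[denom == 0] = 0.0            # thresholds that leave one class empty
    return int(np.argmax(sigma_b))

def draft_mask(gray):
    """Binary draft mask (0 or 255); assumes droplets are brighter than the background."""
    return (gray > otsu_threshold(gray)).astype(np.uint8) * 255

# Synthetic frame: dark background with one bright square "droplet".
gray = np.full((64, 64), 20, dtype=np.uint8)
gray[10:20, 10:20] = 200
mask = draft_mask(gray)
```

A single global threshold like this is exactly the kind of method that can miss faint droplets under uneven lighting, so the result is a starting point for manual correction rather than a finished label.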

Thanks in advance!

ZhaoQs999 commented 1 year ago

You can also use "EISeg" for creating binary masks.