Network training on color not on features

anilmaddala commented 6 years ago

I am training the network on segmenting deformable objects like people.

The network seems to train on colors (in this case skin tones) rather than features of the object.

The network is detecting background colors that are similar to skin tones, but not actual people.

Is there a way to train the network to emphasize features? Should I pre-process the data? Right now my training input is an RGB image and a binary mask, with white for the object of interest (person) and black for the rest of the image.

jakeret commented 6 years ago

> deformable objects like people

interesting... ;-)

As so often in ML, there is no definite solution to a problem, which means you will have to experiment a bit. One thing that might be worth investigating is to transform your RGB data into HSV color space, as this tends to work better for computer vision tasks. Maybe you could even try to use grayscale images.
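A minimal sketch of the two conversions mentioned above, assuming the images are float NumPy arrays of shape (H, W, 3) with values in [0, 1]. The helper names `rgb_to_gray` and `rgb_to_hsv` are made up for illustration and are not part of tf_unet. The grayscale version uses the common BT.601 luma weights, which is one choice among several.

```python
import colorsys

import numpy as np


def rgb_to_gray(rgb):
    """Luminance-weighted grayscale for a float (H, W, 3) image in [0, 1]."""
    weights = np.array([0.299, 0.587, 0.114])  # BT.601 luma weights (assumed choice)
    return rgb @ weights  # result has shape (H, W)


# colorsys works on scalars, so it is vectorized here.
# This is slow on large images but needs no extra dependency.
_rgb_to_hsv_scalar = np.vectorize(colorsys.rgb_to_hsv)


def rgb_to_hsv(rgb):
    """Convert a float (H, W, 3) RGB image in [0, 1] to HSV, also in [0, 1]."""
    h, s, v = _rgb_to_hsv_scalar(rgb[..., 0], rgb[..., 1], rgb[..., 2])
    return np.stack([h, s, v], axis=-1)
```

Note that the grayscale output has a single channel, so the network would have to be configured for one input channel instead of three.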

anilmaddala commented 6 years ago

I tried YUV, HSV and grayscale color formats, along with color augmentation. The network still seems to be looking for the shade of color rather than the features. Anything else I can try?

creatist commented 5 years ago

This behaviour can also be seen in the toy_problem; look at the picture above. I think color and brightness are the primary features, and texture is a higher-level feature. If the network cannot distinguish similar colors, doing some data augmentation on those images may help. If that does not work, you can try increasing the number of filters or the number of network layers.

wkeithvan commented 5 years ago

My knee-jerk response is that this sounds like a data issue. Assuming you can get the UNet to work on something like the toy problem, your problem is one of the following:

Your data isn't varied enough. If you always have Caucasian people on dark background, then skin colour is a pretty good proxy for segmentation. Make sure that your people have different skin tones, hair colour, clothing colour, and that your background is of different colours. If getting this data is difficult, consider trying some data augmentation techniques to change the colours so that colour is no longer a good proxy.
You don't have enough data. If you don't have enough data, it can be hard for the algorithm to properly generalize away from the first local minima that make sense. More data is always better, so look into ways to get more training images.
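As a rough illustration of the colour augmentation idea in the first point, here is a sketch that randomly shuffles the colour channels and rescales each one, so that a fixed colour stops being a reliable cue. It assumes float images of shape (H, W, 3) in [0, 1]; the function name `augment_color` and the gain range 0.6 to 1.4 are arbitrary choices for this example, not something from tf_unet.

```python
import numpy as np


def augment_color(image, rng):
    """Return a randomly recoloured copy of a float (H, W, 3) image in [0, 1].

    Only colours change, not geometry, so the binary mask can be reused as-is.
    """
    # Shuffle the channel order (fancy indexing returns a copy).
    out = image[..., rng.permutation(3)]
    # Scale each channel by its own random gain (range is an assumed example).
    gains = rng.uniform(0.6, 1.4, size=3)
    out = out * gains
    # Keep values in the valid range.
    return np.clip(out, 0.0, 1.0)


# Usage: draw a fresh augmentation for every training sample.
rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))  # stand-in for a real training image
augmented = augment_color(image, rng)
```

Whether this is enough depends on the data. If skin tone is still a strong cue after augmentation, more varied source images are probably still needed.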

Another idea is to use a pre-trained network that already works on people. I have been messing around with Mask RCNN and have used pre-trained weights, since they work quite well. Here is a great YouTube tutorial on setting it up on your end and here is the GitHub repository with all the code. It took me only about an hour to get everything running on my laptop and working with my own personal photos, and it is really good at detecting people. If you need more categories than the repo provides, consider using transfer learning techniques so you only have to do a little bit of training on the final layers and not start from scratch!

jakeret / tf_unet

Network training on color not on features #207