ajaichemmanam / simple_bodypix_python

A simple and minimal bodypix inference in python

Not able to get high resolution outputs #10

Closed NiranthS closed 4 years ago

NiranthS commented 4 years ago

How do we get high-resolution output like the one shown on the tfjs BodyPix page? On the tfjs page, changing the internal resolution gives a good-quality segmentation. What should be done here to get similar results?

benschlueter commented 4 years ago

Interpolate the segmentation mask to the input resolution and apply a low-pass filter (Gaussian, for example) to smooth the edges. Then threshold the mask and apply it :)

benschlueter commented 4 years ago

# Draw Segmented Output
import numpy as np
from PIL import Image, ImageFilter

# Upsample the low-resolution 0/1 mask to the input image size
mask_img = Image.fromarray(segmentationMask * 255)
mask_img = mask_img.resize((imgWidth, imgHeight), Image.LANCZOS).convert("RGB")
# Blur heavily so the blocky edges become a smooth gradient
mask_img = mask_img.filter(ImageFilter.GaussianBlur(60))
# Threshold the blurred mask back to a hard 0/255 mask
mask_img = np.asarray(mask_img)
idx = mask_img > 127
mask_img = np.where(idx, 255, 0)

That's how I postprocess the mask.
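Once the mask is back at input resolution, applying it is a per-pixel select. Here is a minimal sketch, assuming `image` is an H x W x 3 uint8 array of the original picture and `mask` is an H x W x 3 array of 0/255 values like the one the postprocessing step produces. The helper name `apply_mask` and the `background` parameter are hypothetical, chosen for illustration, and are not part of the repository.

```python
import numpy as np

def apply_mask(image, mask, background=(0, 0, 0)):
    """Keep pixels where the mask is set; paint the rest with `background`."""
    image = np.asarray(image, dtype=np.uint8)
    keep = np.asarray(mask) > 127          # boolean array, same shape as image
    fill = np.empty_like(image)
    fill[...] = background                 # broadcast the colour over H x W
    return np.where(keep, image, fill)

# Tiny 2 x 2 example: only the left column counts as "person"
image = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.zeros((2, 2, 3), dtype=int)
mask[:, 0, :] = 255
result = apply_mask(image, mask)
```

In this toy example the left column keeps its original value of 200 and the right column is painted black.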

NiranthS commented 4 years ago

Thanks a lot, @Kakashiiiiy it worked!! :-)

NiranthS commented 4 years ago

@Kakashiiiiy, I saw in one of the issues that you said: "I need a scaling factor like in JS with "low medium high and full", in JS I used low(0.25) and it works good if I use "full" the results are similar." Is there any such "full" scaling factor in the Python code? I think it is just the size of the input image we feed into the network, right?

benschlueter commented 4 years ago

Yes. Right now we are using "full" in the Python implementation. Just define a scale variable like I did below. I am currently looking at how BodyPixJS interpolates the picture. I think I found the JS function, and it uses bilinear interpolation with a sigmoid. But before I commit anything, I want to improve the code further.

scale = 0.25
....
targetWidth = (int(scale*imgWidth) // OutputStride) * OutputStride + 1
targetHeight = (int(scale*imgHeight) // OutputStride) * OutputStride + 1
.....
key_x = int(x_heat * OutputStride/scale) #+x_offset
key_y = int(y_heat * OutputStride/scale) #+y_offset
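To make the arithmetic concrete, here is a small worked sketch of the same two formulas wrapped in functions. The function names `scaled_target_size` and `heatmap_to_image_coord` are hypothetical, and the image size and stride are example values only. As in the snippet above, the offset terms are left out.

```python
def scaled_target_size(img_width, img_height, scale, output_stride):
    """Return the (width, height) fed to the network for a given scale."""
    target_width = (int(scale * img_width) // output_stride) * output_stride + 1
    target_height = (int(scale * img_height) // output_stride) * output_stride + 1
    return target_width, target_height

def heatmap_to_image_coord(heat_index, scale, output_stride):
    """Map a heatmap cell index back to a pixel coordinate in the original image."""
    return int(heat_index * output_stride / scale)

# Example values (assumptions): 640 x 480 input, scale 0.25, output stride 16
target = scaled_target_size(640, 480, scale=0.25, output_stride=16)
key_x = heatmap_to_image_coord(5, scale=0.25, output_stride=16)
```

With these values the network input becomes 161 x 113, and heatmap column 5 maps back to x = 320 in the original image. Dividing by `scale` undoes the downscaling, so the keypoints land in original-image coordinates.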

NiranthS commented 4 years ago

Oh okay. Thanks :)