Accuracy of the SiNet Portrait Demo Code (Python)

YexingWan / Fast-Portrait-Segmentation

The MNN base implementation of SINet for CPU realtime portrait segmentation

MIT License


Accuracy of the SiNet Portrait Demo Code (Python) #8

Open hualili opened 2 years ago

hualili commented 2 years ago

Hello, greetings. Thank you for posting your Fast-Portrait-Segmentation code; it is really nice and informative. We have tried to run the Python demo code with: $ python cam_seg.py. This program grabs a frame from the webcam, is supposed to remove the background, and then composites the frame onto the background image file and shows the result in a window. However, when I tested it in my room with DNC_SiNET_bi_192_129.mnn, the program did not segment the input portrait. We have been trying to resolve this. Can you help answer the following questions: (1) the original PyTorch model and github; (2) also try to print the architecture of the model; (3) any reason why the code does not segment webcam input portrait? Thanks! HL

YexingWan commented 2 years ago

(1) the original PyTorch model and github; (2) also try to print the architecture of the model; (3) any reason why the code does not segment webcam input portrait?

(1) I use SINet as my original model, with the weights from this repo.

(2) As the README says, you can use Netron to inspect the details of the given ONNX model.

(3) Could you please provide more detail on how you run the code, and on how you supply the "background image" that should replace the real background (maybe your room)?

Regards

hualili commented 2 years ago

Hello, greetings, and thanks for your response. Your answers (1) and (2) are helpful and I am going through them now.

I have executed two programs: FaceSeg.cpp and cam_seg.py from the pyFaceSeg folder. Most of the time cam_seg.py just returns the background image. When I place my hand about 1-2 feet from the camera, it occasionally returns an irregular segmentation of part of the hand, which I cannot even recognize as a hand. By comparison, the cpp code in the same environment does perform segmentation, e.g., it extracts the portrait of the person.
I noticed that line 43 of your cpp code has float gamma_tran = 0.6; can you please elaborate on what this parameter is for? Is there a similar parameter in cam_seg.py? When I increased the value from 0.6 to 1.1, the cpp segmentation seemed to work better.
Do you have a version of cam_seg with the same parameters as FaceSeg.cpp? Thank you so much! Best regards
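For readers following the gamma_tran question: the thread does not say what this parameter does in FaceSeg.cpp. Going only by its name, one plausible reading (an assumption, not something confirmed here) is a gamma transform of the input frame, out = 255 * (in / 255) ** gamma. A minimal NumPy sketch of that generic operation, where gamma_transform is a hypothetical helper and not a function from the repo:

```python
import numpy as np

def gamma_transform(img_u8, gamma):
    """Generic gamma transform: out = 255 * (in / 255) ** gamma.

    Assumption: this is only one plausible meaning of gamma_tran;
    the thread does not confirm what FaceSeg.cpp actually does with it.
    """
    # Build a 256-entry lookup table once, then index it with the pixel values.
    levels = np.arange(256, dtype=np.float32) / 255.0
    lut = np.clip(np.rint(255.0 * levels ** gamma), 0, 255).astype(np.uint8)
    return lut[img_u8]

# Tiny stand-in for a camera frame (one row of grey levels).
frame = np.array([[0, 64, 128, 255]], dtype=np.uint8)
brighter = gamma_transform(frame, 0.6)  # gamma < 1 lifts mid-tones
darker = gamma_transform(frame, 1.1)    # gamma > 1 lowers mid-tones
```

Under this reading, 0.6 brightens the frame before inference and 1.1 darkens it slightly, while pure black and pure white are left unchanged. If the parameter is used this way, the best value would likely depend on the room lighting.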

hualili commented 2 years ago

In your cam_seg, you have mean = [107.304565, 115.69884, 132.35703] and std = [63.97182, 65.1337, 68.29726] (lines 11 and 12). Are these in RGB or HSV? Where do these parameters come from?
In cam_seg, in def postprocess(result, ori_img, h_in, w_in, cut_bg, re=True), you have result[0][0, :, :] += 0.814. What is this parameter for? Where does this value come from?
In the same postprocess function, you have idx_fg = cv2.GaussianBlur(idx_fg, (3, 3), sigmaX=2).astype(np.float32) / 255. Is this GaussianBlur related to "blur_r(2i+1):" (line 56 of FaceSeg.cpp)?
What are your recommended input arguments for running cam_seg.py (args = parser.parse_args(), line 118)? Thanks!
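To make the questions above concrete, here is a minimal NumPy sketch of how constants like these are typically used in a segmentation demo. It is not the code from cam_seg.py: the helper names normalize, foreground_mask and composite are hypothetical, and the sketch assumes the network output has two channels with channel 0 being the background score. Note that OpenCV delivers webcam frames in BGR order; whether the mean/std values are listed in that order is not confirmed in this thread.

```python
import numpy as np

# Constants quoted in the thread (cam_seg.py, lines 11 and 12).
MEAN = np.array([107.304565, 115.69884, 132.35703], dtype=np.float32)
STD = np.array([63.97182, 65.1337, 68.29726], dtype=np.float32)
BG_OFFSET = 0.814  # value added to result[0][0, :, :] in postprocess

def normalize(frame_u8):
    """Per-channel standardization, (pixel - mean) / std, for an H x W x 3 frame."""
    return (frame_u8.astype(np.float32) - MEAN) / STD

def foreground_mask(logits, bg_offset=BG_OFFSET):
    """Assumption: logits has shape (2, H, W) and channel 0 is background."""
    shifted = logits.copy()
    shifted[0] += bg_offset  # raise the background score at every pixel
    return (np.argmax(shifted, axis=0) == 1).astype(np.float32)

def composite(frame_u8, background_u8, alpha):
    """Blend frame over background; alpha is H x W in [0, 1], 1 = keep frame."""
    a = alpha[..., None]  # add a channel axis so alpha broadcasts over colors
    out = a * frame_u8 + (1.0 - a) * background_u8
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Toy 2 x 2 example with made-up scores.
logits = np.zeros((2, 2, 2), dtype=np.float32)
logits[1] = [[0.5, 1.0], [0.0, 2.0]]  # foreground scores
mask_plain = foreground_mask(logits, bg_offset=0.0)
mask_shifted = foreground_mask(logits)

frame = np.full((2, 2, 3), 200, dtype=np.uint8)
background = np.zeros((2, 2, 3), dtype=np.uint8)
blended = composite(frame, background, mask_shifted)
```

In this toy example the pixel with foreground score 0.5 counts as foreground without the offset but as background with it, so under the stated assumption a positive offset makes the mask more conservative. A blurred mask divided by 255, as in the GaussianBlur line quoted above, would play the role of a soft alpha in composite, giving smoother edges than a hard 0/1 mask.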