When I tested the newly published pretrained model, I found that the semantic segmentation results and the input images do not match perfectly.
I looked through the `tanh_warp`-related processing and found that the coordinates passed to `grid_sample` might have a slight problem. In the existing procedure, the `align_corners` option is left at its default (disabled), so the sampling coordinates should span (0, n) rather than (0, n-1). I changed the code slightly and found that the alignment improved significantly.
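For reference, here is a minimal standalone sketch (not the repo's code) of the convention in question: with `align_corners=False`, the normalized range [-1, 1] spans the image *edges*, so the center of pixel index `i` sits at `2 * (i + 0.5) / n - 1`. The names `n`, `img`, and `coords` are illustrative only.

```python
import torch
import torch.nn.functional as F

n = 4
img = torch.arange(n * n, dtype=torch.float32).reshape(1, 1, n, n)

# align_corners=True : -1 and +1 map to the CENTERS of the first/last
#                      pixels, i.e. index i lies at 2*i/(n-1) - 1.
# align_corners=False: -1 and +1 map to the image EDGES, i.e. the center
#                      of index i lies at 2*(i + 0.5)/n - 1 (half-pixel offset).
i = torch.arange(n, dtype=torch.float32)
coords = 2 * (i + 0.5) / n - 1

# Build an identity sampling grid of shape (1, n, n, 2), last dim = (x, y).
xx, yy = torch.meshgrid(coords, coords, indexing='xy')
grid = torch.stack((xx, yy), dim=-1).unsqueeze(0)

out = F.grid_sample(img, grid, align_corners=False)
print(torch.allclose(out, img))  # True: sampling at pixel centers is exact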
In `facer.facer.transform.py`, lines 218 & 219:

```python
_yy = yy.unsqueeze(0).broadcast_to(batchsize, h, w).to(device)
_xx = xx.unsqueeze(0).broadcast_to(batchsize, h, w).to(device)
```
change it to:
```python
_yy = yy.unsqueeze(0).broadcast_to(batchsize, h, w).to(device) + 0.5
_xx = xx.unsqueeze(0).broadcast_to(batchsize, h, w).to(device) + 0.5
```
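To sanity-check the `+0.5`, here is a small self-contained experiment (not the repo's actual pipeline; it assumes the indices are later normalized to [-1, 1] as `2 * idx / n - 1`, and `identity_resample` is a hypothetical helper): under `align_corners=False`, a grid built from raw integer indices leaves a half-pixel shift, while adding 0.5 reproduces the input exactly.

```python
import torch
import torch.nn.functional as F

n = 8
img = torch.rand(1, 1, n, n)
idx = torch.arange(n, dtype=torch.float32)

def identity_resample(offset):
    # Normalize indices (plus optional half-pixel offset) to [-1, 1].
    c = 2 * (idx + offset) / n - 1
    yy, xx = torch.meshgrid(c, c, indexing='ij')
    grid = torch.stack((xx, yy), dim=-1).unsqueeze(0)  # (1, n, n, 2)
    return F.grid_sample(img, grid, align_corners=False,
                         padding_mode='border')

shifted = identity_resample(0.0)     # raw indices: samples at i - 0.5
exact = identity_resample(0.5)       # +0.5: samples exactly at pixel centers
print((shifted - img).abs().max())   # noticeably > 0 (half-pixel shift)
print((exact - img).abs().max())     # ~0 (true identity warp)
```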
Are the above changes reasonable?