PeterL1n / BackgroundMattingV2

Real-Time High-Resolution Background Matting
MIT License

Inaccurate operations of patch crop and replace #125

Closed Dinghow closed 3 years ago

Dinghow commented 3 years ago

Hi, thanks for your excellent work. I think there are some wrong or inaccurate implementations in your Refiner.

In the `crop_patch()` function (lines 207~209 in `./model/refiner.py`), the code calls `torch.Tensor.unfold` twice to turn the input feature map into a tensor of sliding-window patches. However, in both the paper and the code, the `idx` tuple is obtained from an error map of resolution [H/4, W/4], where H and W denote the original resolution, while the feature map being unfolded has size [H/2, W/2]. To match the error-map resolution for patch cropping, you set the unfold stride to 2, which acts as a downsampling operation.

Nevertheless, the result of `x.permute(0, 2, 3, 1).unfold(1, size + 2 * padding, size).unfold(2, size + 2 * padding, size)` is not [B, H/4, W/4, C, patch_size, patch_size] but [B, (H/2 - patch_size)/stride + 1, (W/2 - patch_size)/stride + 1, C, patch_size, patch_size] (see torch.Tensor.unfold), and H/4 - 3 is not equal to H/4. The discrepancy has little effect here because the padding is only 3, but with a larger padding value the error would not be negligible.
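For concreteness, here is a minimal sketch of the shape that `torch.Tensor.unfold` produces when no padding is applied beforehand. The numeric values (a 256x256 input, hence a 128x128 half-resolution feature map, with stride/size 2 and padding 3) are illustrative assumptions based on the description above, not values taken from the repository:

```python
import torch

# Toy shapes: assume a 256x256 input image, so the feature map fed to the
# crop routine is H/2 x W/2 = 128 x 128 and the error map is H/4 x W/4 = 64 x 64.
B, C = 1, 32
H_half = W_half = 128
size, padding = 2, 3  # step (stride) = size = 2, window = size + 2*padding = 8

x = torch.randn(B, C, H_half, W_half)

# Unfolding WITHOUT any preceding padding: each spatial dim becomes
# (L - window) / step + 1 = (128 - 8) / 2 + 1 = 61 = H/4 - 3, not H/4 = 64.
patches = (x.permute(0, 2, 3, 1)
            .unfold(1, size + 2 * padding, size)
            .unfold(2, size + 2 * padding, size))
print(patches.shape)  # torch.Size([1, 61, 61, 32, 8, 8])
```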

Dinghow commented 3 years ago

I didn't notice the padding operation applied before the unfold; it was my mistake.
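For completeness, a sketch of the same computation with padding applied first, which is what resolves the size mismatch. Plain zero padding via `F.pad` is assumed here purely for illustration; the actual `crop_patch()` may pad differently, but any padding of width 3 on each side brings the spatial dimensions back to exactly H/4 x W/4:

```python
import torch
import torch.nn.functional as F

B, C = 1, 32
H_half = W_half = 128  # half resolution of an assumed 256x256 input
size, padding = 2, 3

x = torch.randn(B, C, H_half, W_half)

# Pad the feature map before unfolding. Each spatial dim then becomes
# (128 + 2*3 - 8) / 2 + 1 = 64 = H/4, matching the error-map resolution.
x_padded = F.pad(x, (padding, padding, padding, padding))
patches = (x_padded.permute(0, 2, 3, 1)
                   .unfold(1, size + 2 * padding, size)
                   .unfold(2, size + 2 * padding, size))
print(patches.shape)  # torch.Size([1, 64, 64, 32, 8, 8])
```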