ZhengPeng7 / BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
https://www.birefnet.top
MIT License

Why does the predicted mask have the value "1" in the four corners of the image? #87

Closed luoshuiyue closed 1 month ago

luoshuiyue commented 1 month ago

The result is: 1a9bfdb756bb1d015bcf259443be86b, but the four corners of the image have the value “1”:

3eaa4aef37fc6c791d4cfe3e731f14a 9344069808f07e8033d85d3201030e6 10c7441a99050a6215a8537d25e8271 8282735c8fe98f3263fd29f81699084

How can I adjust the code so that these values are not generated?

ZhengPeng7 commented 1 month ago

Wow, that's really weird! Thanks for letting me know about this phenomenon. I guess it may be caused by the sampling/resizing? You can take a look at the output tensors without resizing to the size of original images. If you just simply want to remove these pixel values, you can easily set the pixel values of these four corners to 0 manually. Removing the contours with an area < 3 may also help, but is more time-consuming.

Also, I'll check if there is an issue of that with the labels of training data.
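The small-region cleanup suggested above can be sketched as follows. This is a minimal illustration, not code from the repository: the function name `remove_small_regions`, the `min_area` parameter, and the choice of 4-connectivity are assumptions, and the pixel count of a connected region stands in for the contour area. It expects a binarized mask given as a list of rows of 0/1 values.

```python
from collections import deque


def remove_small_regions(mask, min_area=3):
    """Return a copy of a binary mask with small foreground regions set to 0.

    `mask` is a list of rows holding 0/1 values. Regions are found with
    4-connectivity, and any region with fewer than `min_area` pixels is cleared.
    (Hypothetical helper for illustration only.)
    """
    height, width = len(mask), len(mask[0])
    cleaned = [list(row) for row in mask]
    visited = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if visited[y][x] or not mask[y][x]:
                continue
            # Breadth-first search collects one connected foreground region.
            region = []
            queue = deque([(y, x)])
            visited[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not visited[ny][nx]):
                        visited[ny][nx] = True
                        queue.append((ny, nx))
            # Clear the region only if it is smaller than the threshold.
            if len(region) < min_area:
                for ry, rx in region:
                    cleaned[ry][rx] = 0
    return cleaned
```

Unlike setting the corners to 0 unconditionally, this keeps a corner pixel whenever it belongs to a larger foreground region. On full-size masks a pure-Python loop will be slow; a library routine such as OpenCV's `cv2.connectedComponentsWithStats` is probably a better fit in practice.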

luoshuiyue commented 1 month ago

I tested the line pred_pil = pred_pil.resize(original_size), and the problem does not seem to be caused by the resize. I would still like to find the cause and fix it, rather than just setting the corners to 0 in certain cases, because the four corners of the image can also be foreground.
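One way to make this check explicit is to compare the corner values of the raw prediction with those of the resized mask. The sketch below assumes both have already been converted to nested lists of numbers in [0, 1] (for example with `tensor.tolist()`); the helper names and the 0.5 threshold are hypothetical.

```python
def corner_values(mask):
    """Return the values at the four corners of a 2D mask (rows of numbers)."""
    last_row, last_col = len(mask) - 1, len(mask[0]) - 1
    return {
        "top_left": mask[0][0],
        "top_right": mask[0][last_col],
        "bottom_left": mask[last_row][0],
        "bottom_right": mask[last_row][last_col],
    }


def corners_introduced_by_resize(raw_mask, resized_mask, threshold=0.5):
    """List the corners that are foreground after resizing but not before."""
    before = corner_values(raw_mask)
    after = corner_values(resized_mask)
    return [
        name for name in before
        if after[name] > threshold and before[name] <= threshold
    ]
```

If the returned list is empty while the resized mask still has bright corners, the values were already present in the raw output, which points away from the resize step.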

I would also like to ask: my images are about 3K in resolution and mainly show people (with large accessories such as backpacks) standing on a turntable, captured by three cameras placed at different heights. Which model is suitable for this scene?

ZhengPeng7 commented 1 month ago

Hi, @luoshuiyue, I forgot to tell you that I also tested some images, but their corners did not have any non-zero predictions.

I also just finished training BiRefNet_lite-2K, for which higher-resolution samples were selected and used for training at 2560x1440. You can try it in my demo, where I added it as an option today, to see whether it works better than the previous model (trained at 1024x1024, but not the lite version).

luoshuiyue commented 1 month ago

Hi, @ZhengPeng7. If I want to run the 2K model locally, what do I need to change in config.py to run BiRefNet_inference.ipynb in the tutorials directory? (Screenshots attached.)

ZhengPeng7 commented 1 month ago

Since it's "BiRefNet_lite-2K", you need to set the backbone to swin_v1_tiny. Also, change the task in config.py to General-2K, or manually set the size there to (2560, 1440).
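The three settings named above can be summarized in a small stand-in. This is only a sketch: `InferenceSettings` and its field names are hypothetical and do not mirror the actual attribute names in config.py, which should be checked in the repository. Only the values (swin_v1_tiny, General-2K, (2560, 1440)) come from this thread.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class InferenceSettings:
    """Hypothetical stand-in for the relevant config.py fields (names are assumptions)."""
    task: str = "General-2K"              # task name mentioned for the 2K model
    backbone: str = "swin_v1_tiny"        # backbone of the lite model
    size: Tuple[int, int] = (2560, 1440)  # training resolution of BiRefNet_lite-2K


settings = InferenceSettings()
print(settings)
```

The real config.py may derive the size from the task, in which case changing the task alone could be enough, as the reply suggests.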