plemeri / transparent-background

This is a background removing tool powered by InSPyReNet (ACCV 2022)
MIT License

threshold has negligible effect #43

Closed eafpres closed 9 months ago

eafpres commented 10 months ago

System: WSL 2 on Windows 11, running Ubuntu 20.04, Python 3.9, 3080 Ti 12 GB GPU, 32 GB RAM

Excellent package. This is the best OSS background removal I have found, thanks.

I have tested thresholds of 0.01 and 0.99 to see if I can optimize for my use case. Here are two pairs of images as examples; in each pair, the left is the original and the right is the output of remover.process().

A) Threshold 0.01: [image]

B) Threshold 0.99: [image]

Here we see trouble with the light-colored pavement in the background, and varying the threshold over its full range does not meaningfully change that result.

Is there anything else I could try with the code as is, or should I consider some form of fine-tuning? Do you support that?

plemeri commented 10 months ago

Hi @eafpres, thank you for choosing our work. Glad to be of help!

First of all, I tried it myself with the provided image, and here are the results.

| Threshold | 0.5 (default) | 0.01 | 0.99 |
| --- | --- | --- | --- |
| [image: test2] | [image: test2_rgba_threshold_050] | [image: test2_rgba_threshold_001] | [image: test2_rgba] |
| Saliency Map Only | [image: test2_map_theshold_050] | [image: test2_map_threshold_001] | [image: test2_map_threshold_099] |

These are different from your results. Could you share your script? I used the command-line tool rather than a Python script, and it looks like you used our Python API, so having your script would help us find the problem.

Moreover, we also recommend trying base-nightly mode which can be enabled by using --mode base-nightly argument.

Feel free to ask more questions. Thanks.

eafpres commented 10 months ago

> These are different from your results

The original image is a 3000 x 4000 pixel .jpg. I wonder if the large image size has some relation to how the threshold behaves? I did further testing, and I can see an impact if I set the threshold to 1e-4 or 1e-3 in these cases.
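One possible way to picture this, offered only as an illustration: if the predicted saliency scores are saturated (nearly all pixels very close to 0 or 1), any cut between 0.01 and 0.99 selects the same pixels, and only very small thresholds such as 1e-4 or 1e-3 start to change the mask. The sketch below uses made-up scores, not output from the real model, and `foreground_fraction` and `saturated_map` are hypothetical names introduced here.

```python
def foreground_fraction(saliency, threshold):
    """Fraction of pixels kept when soft scores in [0, 1] are cut at `threshold`."""
    kept = sum(1 for score in saliency if score > threshold)
    return kept / len(saliency)


# Synthetic, made-up scores (NOT output of the real model): most values sit
# very close to 0 or 1, and a small group sits just above 0.
saturated_map = [0.0005] * 60 + [0.005] * 10 + [0.9995] * 30

# Sweep the same thresholds discussed in this thread.
for threshold in (1e-4, 1e-3, 0.01, 0.5, 0.99):
    fraction = foreground_fraction(saturated_map, threshold)
    print(f"threshold={threshold:<6} kept={fraction:.2f}")
```

With these synthetic scores, thresholds of 0.01, 0.5, and 0.99 all keep 30% of the pixels, while 1e-3 keeps 40% and 1e-4 keeps all of them. Whether the real saliency map is saturated like this for a 3000 x 4000 input is an assumption that would need checking against the actual map output.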

> It seems like you used our python API, so it would be helpful for us to find a problem if you provide your script.

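from PIL import Image, ImageOps
from transparent_background import Remover

# data_dir, file, and prediction_threshold are defined elsewhere in my script
remover = Remover(mode='base-nightly', device='cuda:0')
img = Image.open(data_dir + '/' + file).convert('RGB')
img = ImageOps.exif_transpose(img)
img_no_bg = remover.process(img, threshold=prediction_threshold)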

Note that in the code, ImageOps.exif_transpose() just rotates the image according to the EXIF orientation metadata in the .jpg file.

> Moreover, we also recommend trying base-nightly mode which can be enabled by using --mode base-nightly argument

Yes, I have been using that from the start as I realize this is in active development!

eafpres commented 10 months ago

I have confirmed that reducing the image size does affect which values of the threshold parameter are appropriate. After reducing to 400 x 400, the value of 0.5 is closer to the middle of the range of effects.

plemeri commented 10 months ago

That is true. We trained our model at a fixed size of 1024 x 1024, so a fixed input size tends to produce a more stable result when the given image is a hard case for generating a saliency mask, such as your example. Salient object detection datasets are mostly center-biased, which means objects are mostly located at the center of the image and are mostly neither occluded nor truncated by the image frame. Otherwise, the network struggles to find the proper salient region. Moreover, if the given image is much larger or smaller than usual, it struggles even more.

Long story short, I think in your case you might need to resize the input image to a fixed size, such as the 400 x 400 you mentioned.
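A minimal sketch of that resizing step, assuming the goal is simply to cap the longer side before calling the remover. The helper names `target_size` and `remove_background_resized` and the default `long_side=1024` are illustrative choices, not part of the package; only `Remover(mode=...)` and `remover.process(img, threshold=...)` come from the script posted earlier in this thread.

```python
def target_size(width, height, long_side=1024):
    """Scale (width, height) down so the longer side is at most `long_side`."""
    scale = long_side / max(width, height)
    if scale >= 1:
        return width, height  # already small enough; do not upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


def remove_background_resized(path, threshold=0.5, mode="base", long_side=1024):
    """Hypothetical helper: downscale the image first, then run the remover."""
    # Imported lazily so target_size() can be used without these packages.
    from PIL import Image, ImageOps
    from transparent_background import Remover

    # Apply the EXIF orientation, then force three channels.
    img = ImageOps.exif_transpose(Image.open(path)).convert("RGB")
    # Shrink while keeping the aspect ratio (assumption: a cap on the long
    # side is enough; the thread itself used a fixed 400 x 400).
    img = img.resize(target_size(*img.size, long_side=long_side), Image.LANCZOS)
    remover = Remover(mode=mode)
    return remover.process(img, threshold=threshold)


print(target_size(3000, 4000))
```

For the 3000 x 4000 image discussed here, `target_size` gives (768, 1024). The returned cutout is at the reduced resolution; recovering a full-resolution result would need an extra step that is not sketched here.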

Also, if you do not need a high-quality result in terms of accurate prediction around the edges of the object, you might want to use the mode='fast' option, which automatically resizes the input image to 384 x 384 and was trained at the same size (384 x 384). It also consumes less GPU memory and has a lower computational cost.

Feel free to ask more questions. Thanks.

eafpres commented 10 months ago

I've found some interesting cases that seem to largely defeat the algorithm. Would you be interested in receiving those images for development?

plemeri commented 10 months ago

I am interested in seeing those images, but I have already graduated, so I no longer have access to any GPU machine to train for my own project. Thank you for your offer, btw.

eafpres commented 9 months ago

Here are some samples. I'd be grateful for any thoughts you have. In these cases, adjusting threshold makes a change but not enough to get a good result.

[sample images attached]