ZhengPeng7 / BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
https://www.birefnet.top
MIT License

Guiding what to segment #18

Closed · hbardak closed this 1 month ago

hbardak commented 5 months ago

Hi! Firstly, amazing work! This is not really an issue, more of a question.

After reading the paper, I am not sure whether you can choose what to segment rather than just the foreground (a bit like the Segment Anything Model). As I am just an artist, I don't understand everything. Could you confirm or deny this?

Best regards,

ZhengPeng7 commented 5 months ago

Hi, thanks for your interest :) For now, the target to segment cannot be specified: what the model segments is learned from the dataset (e.g., salient object detection). However, I think a small modification to the dataset could produce boxes of the targets, which could then be used as a prompt for choosing which object to segment. Do you have that need? If this would really be useful to people like you, I can spare some time to try it.
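For context, such box annotations can be derived directly from the existing ground-truth masks. A minimal sketch (an illustration, not code from this repo; mask_to_box is a hypothetical helper):

```python
import numpy as np

def mask_to_box(mask: np.ndarray):
    """Derive an (x1, y1, x2, y2) box from a binary HxW ground-truth mask."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # empty mask: no target in this image
    return [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]
```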

hbardak commented 5 months ago

I think that would be useful!

ZhengPeng7 commented 5 months ago

Alright, I'll spare some time to give it a try. Updates will be posted here (successful or not), so you can wait for them.

hbardak commented 5 months ago

Amazing! Thank you!

laxmaniron commented 4 months ago

Any update on using a bounding box as input, just like SAM?

ZhengPeng7 commented 4 months ago

Not yet, but it may come out this week.

hbardak commented 4 months ago

Amazing!

ZhengPeng7 commented 2 months ago

Hi there :) I made a Colab notebook with box guidance for BiRefNet inference. You can try it now. For now, though, the box info has to be entered manually into the variable box, which is not user-friendly. I'll make a GUI for obtaining the box info by drawing, and add support for multiple boxes.

laxmaniron commented 2 months ago

Thanks, will check this out.

rishabh063 commented 2 months ago

Very nice! Did you train the model again? And how were you able to get this dataset? @ZhengPeng7

Also, could you make some comparisons with SAM to see how the performance compares?

rishabh063 commented 2 months ago

Oh, I saw the Colab code. You are just passing in the cropped part. Nice hack!

ZhengPeng7 commented 2 months ago

> Very nice! Did you train the model again? And how were you able to get this dataset? @ZhengPeng7
>
> Also, could you make some comparisons with SAM to see how the performance compares?

Thanks for the suggestion! I'll make some comparisons between them.

Yeah, I used to want to train a new model with a prompt encoder, like what SAM does. But later, when I discussed it with some others, this simple but useful modification came to mind, so I immediately put together this simple demo.
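To make the trick concrete: a minimal sketch of this crop-based box guidance (simplified and illustrative, not the exact Colab code; it assumes birefnet is a loaded BiRefNet model and uses the usual 1024x1024 inference preprocessing):

```python
import torch
from PIL import Image
from torchvision import transforms

# Standard BiRefNet inference preprocessing (ImageNet normalization).
transform = transforms.Compose([
    transforms.Resize((1024, 1024)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def predict_with_box(birefnet, image: Image.Image, box):
    """Box-guided inference via cropping: segment only the boxed region."""
    x1, y1, x2, y2 = box
    crop = image.crop((x1, y1, x2, y2))       # keep only the chosen region
    inputs = transform(crop).unsqueeze(0)
    with torch.no_grad():
        # BiRefNet returns multi-scale outputs; the last is the final map.
        pred = birefnet(inputs)[-1].sigmoid().cpu()[0, 0]
    mask_crop = transforms.ToPILImage()(pred).resize(crop.size)
    full_mask = Image.new("L", image.size, 0)  # blank full-size canvas
    full_mask.paste(mask_crop, (x1, y1))       # paste the crop's mask back
    return full_mask
```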

hbardak commented 2 months ago

Thanks!

pred_pil = pred_and_show(box=[666, 250, 1100, 777])

Could you tell us how the coordinates of the box work? I mean, is the top-left pixel (0, 0)?

hbardak commented 2 months ago

SAM_Birefnet1.json

By the way, I have made a ComfyUI workflow that uses SAM to get the bounding box before passing it to BiRefNet.
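Outside ComfyUI, the same idea can be sketched in plain Python (illustrative only; the checkpoint path is whatever SAM weights you have, and predict_with_box is the hypothetical helper sketched above):

```python
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

# Load SAM (checkpoint path is illustrative) and set the image.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
image = Image.open("input.jpg").convert("RGB")
predictor.set_image(np.array(image))

# One positive click on the object of interest.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[800, 500]]),
    point_labels=np.array([1]),
)

# Turn SAM's best mask into an (x1, y1, x2, y2) box.
ys, xs = np.nonzero(masks[np.argmax(scores)])
box = [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]

# Feed the SAM-derived box to BiRefNet, e.g. via the crop trick above.
full_mask = predict_with_box(birefnet, image, box)  # birefnet: loaded model
```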

ZhengPeng7 commented 2 months ago

> Thanks!
>
> pred_pil = pred_and_show(box=[666, 250, 1100, 777])
>
> Could you tell us how the coordinates of the box work? I mean, is the top-left pixel (0, 0)?

It's (x1, y1, x2, y2), as I wrote in the Colab: [screenshot of the Colab cell, 2024-07-27]
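In other words, assuming the demo follows PIL's convention, the origin (0, 0) is the top-left pixel, x grows rightward and y grows downward, so the box reads as (left, top, right, bottom):

```python
from PIL import Image

image = Image.open("input.jpg")
x1, y1, x2, y2 = 666, 250, 1100, 777   # left, top, right, bottom, in pixels
# PIL's origin is the top-left corner; this crop is what the box selects.
region = image.crop((x1, y1, x2, y2))  # width = x2 - x1, height = y2 - y1
```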

ZhengPeng7 commented 2 months ago

And thanks for that workflow file; I'll check it today :)

SuyueLiu commented 1 month ago

> I used to want to train a new model with a prompt encoder, like what SAM does.

Thanks for your nice work. Have you tried to train BiRefNet with a prompt (box) encoder, and how did that go?

ZhengPeng7 commented 1 month ago

Unfortunately, I'm short of GPUs now. When I have time and enough GPUs in the coming days, I'll try it to see whether it brings additional improvement.