ZhengPeng7 / BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
https://www.birefnet.top
MIT License
991 stars, 77 forks

Related to fine-tuning #53

Closed. Kaustubh-cpu closed this issue 3 weeks ago

Kaustubh-cpu commented 1 month ago

@ZhengPeng7

Great work, @ZhengPeng7. I am facing two issues with the code:

  1. I trained the model with the swin_tiny backbone on a sample of the dataset. At inference I expected a black-and-white mask, but I am getting something like the image below.
  2. Could you suggest which pretrained model I should use to test or fine-tune on human images? Every pretrained checkpoint I have tried from the Google Drive fails with a shape mismatch, a key mismatch, or an "unable to load checkpoint" error. To test with the pretrained weights, which settings do I need to change in the config?

[image: 1#Accessories#1#Bag#3660693333_61004a731d_o]

Kaustubh-cpu commented 1 month ago

@ZhengPeng7
If there is a solution for this, please let me know.

ZhengPeng7 commented 1 month ago

Hi, @Kaustubh-cpu; I read your issue last week, but something interrupted me and I forgot about it... To be honest, though, could you check the spelling next time😂 No offense, but it would help: I might have understood your question and replied sooner. But relax, I'm always happy to help.

  1. Could you share your source image with me? That looks like a weird result.
  2. For human segmentation, I would suggest using BiRefNet-portrait, which was trained on many related datasets. You can find all my models on my Hugging Face page, or pick the model in my HF space demo to try it easily on your samples. As for the key mismatch, note that the default backbone is SwinL, not the SwinT you used before. If the mismatch is something else, please show me and I'll look deeper into it.

Kaustubh-cpu commented 1 month ago

[image: 1#Accessories#1#Bag#3660693333_61004a731d_o]

The image is also available in the DIS dataset, in the DIS-TR folder (1#Accessories#1#Bag#3660693333_61004a731d_o.png).

Kaustubh-cpu commented 1 month ago

@ZhengPeng7

RuntimeError: Error(s) in loading state_dict for BiRefNet: Unexpected key(s) in state_dict: "squeeze_module.0.dec_att.aspp1.bn.weight", "squeeze_module.0.dec_att.aspp1.bn.bias", "squeeze_module.0.dec_att.aspp1.bn.running_mean", "squeeze_module.0.dec_att.aspp1.bn.running_var", "squeeze_module.0.dec_att.aspp1.bn.num_batches_tracked", "squeeze_module.0.dec_att.aspp_deforms.0.bn.weight", "squeeze_module.0.dec_att.aspp_deforms.0.bn.bias", "squeeze_module.0.dec_att.aspp_deforms.0.bn.running_mean", "squeeze_module.0.dec_att.aspp_deforms.0.bn.running_var", "squeeze_module.0.dec_att.aspp_deforms.0.bn.num_batches_tracked", "squeeze_module.0.dec_att.aspp_deforms.1.bn.weight", "squeeze_module.0.dec_att.aspp_deforms.1.bn.bias", "squeeze_module.0.dec_att.aspp_deforms.1.bn.running_mean", "squeeze_module.0.dec_att.aspp_deforms.1.bn.running_var", "squeeze_module.0.dec_att.aspp_def

I get this error when I try to run inference with BiRefNet-portrait-epoch_150.pth.

ZhengPeng7 commented 1 month ago

Are you sure that the network architecture settings in your config.py are the same as my default ones?
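One way to check for an architecture mismatch is to diff the parameter names the model expects against the names stored in the checkpoint. The following is a minimal sketch, not code from the repository: `diff_state_dict_keys` is a hypothetical helper, and the key lists are toy stand-ins. With PyTorch, one would presumably pass `model.state_dict().keys()` and the keys of the dict returned by `torch.load(path, map_location="cpu")`.

```python
from collections import Counter


def diff_state_dict_keys(model_keys, ckpt_keys, depth=2):
    """Compare parameter names of a model and a checkpoint (hypothetical helper).

    Returns (missing, unexpected, unexpected_by_prefix):
      missing     - names the model has but the checkpoint lacks
      unexpected  - names the checkpoint has but the model lacks
      by_prefix   - unexpected names counted by their first `depth` name parts
    """
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)
    unexpected = sorted(ckpt_keys - model_keys)
    by_prefix = Counter(".".join(k.split(".")[:depth]) for k in unexpected)
    return missing, unexpected, dict(by_prefix)


# Toy stand-ins for real key lists (assumed names, for illustration only).
model_keys = ["bb.layers.0.weight", "decoder.conv.weight"]
ckpt_keys = [
    "bb.layers.0.weight",
    "decoder.conv.weight",
    "squeeze_module.0.dec_att.aspp1.bn.weight",
    "squeeze_module.0.dec_att.aspp1.bn.bias",
]

missing, unexpected, by_prefix = diff_state_dict_keys(model_keys, ckpt_keys)
print("missing:", missing)
print("unexpected by prefix:", by_prefix)
```

"Unexpected" keys are those present in the checkpoint but absent from the model being built, so a summary by prefix points at the module that the current config does not create.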

As for the sample above, I tested it with the weights for general use and obtained this result: [screenshot: 2024-08-12 21 31 32]

And also with the weights for portrait segmentation (though it's a bag): [screenshot: 2024-08-12 21 32 42]

These two results are surely different, but both are reasonable. How did you get your weird results above?
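For comparison, a black-and-white mask is usually produced by squashing the raw network output with a sigmoid and then thresholding it. The sketch below shows that post-processing step in plain Python. It rests on assumptions that are not confirmed in this thread: that the model output is a map of logits, and that 0.5 is a suitable threshold. `to_binary_mask` is a hypothetical name, not a function from the repository.

```python
import math


def to_binary_mask(logits, threshold=0.5):
    """Turn a 2D list of logits into a 0/255 mask (hypothetical helper).

    Assumes `logits` are raw, pre-sigmoid scores of moderate magnitude.
    """
    mask = []
    for row in logits:
        # The sigmoid maps each logit to a probability in (0, 1).
        probs = [1.0 / (1.0 + math.exp(-v)) for v in row]
        # Pixels at or above the threshold become white, the rest black.
        mask.append([255 if p >= threshold else 0 for p in probs])
    return mask


# Toy 2x2 "prediction" (assumed values, for illustration only).
logits = [[-4.0, 0.2], [3.0, -0.1]]
mask = to_binary_mask(logits)
print(mask)
```

If the saved output still shows gray levels or colors after a step like this, the cause is more likely in how the prediction is scaled or written to disk than in the weights.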