Closed wang21jun closed 4 months ago
MVANet is interesting and solid work; the results reported in their paper are even better than ours on DIS5K. Compared with it, BiRefNet:
BTW, have you run the code of MVANet?
Many thanks for your detailed reply. I am currently reimplementing the training of MVANet and expect to finish it the day after tomorrow. If you're interested, I would be more than happy to discuss it with you and share my progress once it's complete. I will also look more deeply into your work and try it out. Thanks again.
You are welcome :) Looking forward to your MVANet results. I also re-trained it, but I'd like to see the results you reproduce.
Hi @wang21jun , got any results?
The results I obtained are somewhat unsatisfactory and need further examination. I am also retraining your BiRefNet with the Swin-L backbone, which should finish by tomorrow.
Thx, that's a long process. Looking forward to hearing the results of your retrained BiRefNet, too!
Trained on 2 A100-80G GPUs with the script './train_test.sh DIS 0,1 0' (keeping self.batch_size=4, Swin-L as backbone):
Smeasure: 0.885
meanEm: 0.92
wFmeasure: 0.838
maximal Fmeasure: 0.877
MAE: 0.041
Although I was unable to reproduce the exact results of the paper, this is the best outcome I achieved among the works I tried: IS-Net, SegRefiner, MVANet, BiRefNet, and so on. Taking into account both training and inference costs, I will explore several strategies to optimize the process: for instance, whether training for 600 epochs is truly necessary or a smaller number already gives satisfactory results, and whether switching the backbone to Swin-B (among other modifications) can further improve efficiency and performance.
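As a side note, the metrics listed above can be computed from a predicted map and a ground-truth mask. Below is a minimal NumPy sketch (function names are illustrative, not from the BiRefNet or DIS evaluation code, which is more involved):

```python
import numpy as np

def mae(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Absolute Error between a predicted map and a GT mask.
    Both arrays are expected to hold values in [0, 1]."""
    return float(np.mean(np.abs(pred.astype(np.float64) - gt.astype(np.float64))))

def max_f_measure(pred: np.ndarray, gt: np.ndarray,
                  beta2: float = 0.3, steps: int = 255) -> float:
    """Maximal F-measure: binarize the prediction at many thresholds
    and keep the best F-beta score (beta^2 = 0.3 is the common choice)."""
    gt_bin = gt > 0.5
    best = 0.0
    for t in np.linspace(0.0, 1.0, steps):
        pred_bin = pred >= t
        tp = np.logical_and(pred_bin, gt_bin).sum()
        precision = tp / (pred_bin.sum() + 1e-8)
        recall = tp / (gt_bin.sum() + 1e-8)
        f = (1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8)
        best = max(best, float(f))
    return best
```

The weighted F-measure, S-measure, E-measure, and HCE used in the thread follow the same pattern but with more elaborate definitions.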
Glad to see your results!
Did you run `python gen_best_ep.py` to select the best ckpt? The default setting is for training on 8×A100-80G, especially the learning rate (halve it for 2×A100-80G). In the coming days, I'll also try some tricks, like half-precision training.
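Adapting the learning rate to the GPU count is usually done by scaling it with the effective batch size. A minimal sketch of such a helper (the function and its defaults are illustrative; the thread simply halves the LR when going from 8 to 2 GPUs, an empirically tuned factor rather than strict linear scaling):

```python
def scale_lr(base_lr: float, base_gpus: int, target_gpus: int,
             power: float = 1.0) -> float:
    """Scale a learning rate tuned for `base_gpus` to `target_gpus`,
    assuming a fixed per-GPU batch size.
    power=1.0 gives the linear scaling rule; power=0.5 the sqrt rule.
    In practice the factor is often tuned empirically."""
    return base_lr * (target_gpus / base_gpus) ** power
```

For example, `scale_lr(1e-4, 8, 2)` gives 2.5e-5 under linear scaling, while the sqrt rule (`power=0.5`) gives 5e-5.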
BTW, if you want to save some time in evaluation, you can turn off the calculation of some metrics, as shown in this line. Looking forward to hearing good results from the experiments you want to run.
Sure.
Trained on 2×A100-80G with lr=3e-5, validated on DIS-VD, new results: maxFm=0.902, wFmeasure=0.861, MAE=0.035, Smeasure=0.906, meanEm=0.935, HCE=1057
Wow, that's great! Even a bit better than my training on 8×A100-80G. There still seems to be some room for improvement by adapting the hyper-parameters. Thanks!
Also, a result trained with Swin-B, lr=3e-5, bs=6: maxFm=0.897, wFmeasure=0.857, MAE=0.037, Smeasure=0.903, meanEm=0.944, HCE=1060
Thanks for your updates! I've also spared time and GPUs to train BiRefNet with backbones of almost all sizes. The results and weights have been uploaded to the Google Drive folder. Your results are similar to mine.
What are the advantages of your work compared with [Multi-view Aggregation Network for Dichotomous Image Segmentation]?