ziqi-jin / finetune-anything

Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, detection ... in specific scenarios

MIT License

766 stars 55 forks

Semantic Segmentation Results on VOC2012 is not good. #9

Closed TyroneLi closed 1 year ago

TyroneLi commented 1 year ago

Hi, I used your codebase to train on VOC2012, but the mIoU and visualization results are really poor. Could you provide more details or pretrained checkpoints?
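For readers comparing their own numbers, here is a minimal sketch of how mIoU is commonly computed for VOC2012. It is not taken from this repository's code. It assumes flattened lists of integer class labels, the 21 VOC classes (20 objects plus background), and label 255 marking void pixels to ignore. The function names `confusion_matrix` and `mean_iou` are hypothetical.

```python
def confusion_matrix(pred, target, num_classes=21, ignore_index=255):
    """Count pixels per (ground truth, prediction) pair.

    Rows index the ground-truth class, columns the predicted class.
    Pixels whose ground truth equals ignore_index are skipped.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average IoU = TP / (TP + FP + FN) over classes that appear at all."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # missed pixels of class c
        fp = sum(matrix[r][c] for r in range(n)) - tp  # pixels wrongly labeled c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes; the pixel labeled 255 is ignored.
pred = [0, 0, 1, 1, 2]
target = [0, 1, 1, 255, 2]
toy_matrix = confusion_matrix(pred, target, num_classes=3)
toy_miou = mean_iou(toy_matrix)
```

The usual convention is to accumulate one confusion matrix over the whole validation set and compute mIoU once at the end. Averaging per-image IoU instead can give noticeably different numbers.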

Kalfredwv commented 1 year ago

Excuse me, regarding issue #10, I wonder what kind of GPU you used. Since I only have 4 RTX 3090s, I am afraid the GPU memory will be insufficient.

ziqi-jin commented 1 year ago

> Excuse me, regarding issue #10, I wonder what kind of GPU you used. Since I only have 4 RTX 3090s, I am afraid the GPU memory will be insufficient.

I think 4 RTX 3090s can run the code. I haven't tested multi-GPU parallel training yet, so there may be some problems in my code; I will fix them later. For the current single-GPU test, please refer to https://github.com/ziqi-jin/finetune-anything/issues/10#issuecomment-1579885258

ziqi-jin commented 1 year ago

> Hi, I used your codebase to train on VOC2012, but the mIoU and visualization results are really poor. Could you provide more details or pretrained checkpoints?

Yeah, I found the same problem, haha. I am still looking for the cause. Possible reasons: (1) the code may have some bugs; (2) the SemanticSAM model may not be suitable for fine-tuning; (3) there are many parameters still to be tuned, e.g., the learning rate, the modules, and the schedulers. I have not found the right setup yet.
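As one illustration of the scheduler side of point (3), here is a minimal sketch of a polynomial ("poly") learning-rate decay with linear warmup, a schedule often used for semantic segmentation. This is an assumption for illustration only, not the scheduler this repository uses. The function name `poly_lr` and all default values are hypothetical.

```python
def poly_lr(step, total_steps, base_lr=1e-3, warmup_steps=0, power=0.9, min_lr=0.0):
    """Return the learning rate for a given training step.

    Linear warmup up to base_lr over warmup_steps, then polynomial decay
    from base_lr down to min_lr over the remaining steps.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return min_lr + (base_lr - min_lr) * (1.0 - progress) ** power


# Learning rates over a short hypothetical run of 100 steps with 10 warmup steps.
schedule = [poly_lr(s, total_steps=100, base_lr=0.1, warmup_steps=10) for s in range(101)]
```

When only some modules are fine-tuned, a lower learning rate for the pretrained parts than for newly initialized heads is a common starting point, but the right values here are still an open question.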

By the way, I think getting good results from this research may take a long time. We can explore the use of SAM together. At present, I hope to find the bugs in my tool and make it easier to use, so that users can customize their own models, training processes, etc. So if you make any new discoveries during training, you are very welcome to submit a PR to this repository.

ziqi-jin commented 1 year ago

> Hi, I used your codebase to train on VOC2012, but the mIoU and visualization results are really poor. Could you provide more details or pretrained checkpoints?

The original SAM takes point prompts as input. I will try to fix this later, and the performance may improve.

ziqi-jin commented 1 year ago

> Excuse me, regarding issue #10, I wonder what kind of GPU you used. Since I only have 4 RTX 3090s, I am afraid the GPU memory will be insufficient.

I have updated the parallel code, so you can now try running it.

ziqi-jin commented 1 year ago

I am trying to fix the performance problem. If the performance is good, I will release the model weights. If you get a good model with Finetune Anything, please submit a PR. I will close this issue for now; if you have other questions, please open a new issue.

summelon commented 11 months ago

Hi @ziqi-jin, sorry for reopening this thread. I plan to train on my custom dataset based on your great work. I just want to double-check whether the fine-tuned performance is reasonable now. You said you would release the model weights if things went well, but I do not see them in the README.