Qsingle / LearnablePromptSAM

Using the SAM-ViT as the backbone to create learnable prompts for semantic segmentation
Apache License 2.0

Is xFormers a must? #14

Open Heirudy opened 2 months ago

Heirudy commented 2 months ago

Thanks for your great work! The loss was very large at the beginning of training and the Dice metric is approaching 0. I am using the same dataset as in your paper. Could it be because I did not install xFormers?

Qsingle commented 2 months ago

I don't think the reason is that you did not install xFormers. Could you provide more information?

Heirudy commented 2 months ago

Thanks for your answer. I trained with only the first five images of the FIVES dataset and only changed the num_classes in your code. The two images below are the output of my training. [Screenshot 2024-06-24 104314] [Screenshot 2024-06-24 104426]

Qsingle commented 2 months ago

What did you change num_classes to? Did you divide the mask by 255 to scale the label values to 0 or 1?

Heirudy commented 2 months ago

6. Since reading the previous discussion, I changed num_classes to 255 and still got the same error, so I changed it to 256, and it ran successfully after that.

Qsingle commented 2 months ago

> 6. Since reading the previous discussion, I changed num_classes to 255 and still got the same error, so I changed it to 256, and it ran successfully after that.

Now I know the reason. I suggest dividing the label by 255 to scale the values to 0 and 1, and setting num_classes to 2 or 1. You can use the `divide` option provided by the training script.
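For reference, a minimal sketch of that preprocessing step on a toy mask (presumably what the `divide` option automates; NumPy is used here for illustration):

```python
import numpy as np

# A toy binary mask as stored on disk: foreground pixels are 255, background 0.
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)

# Dividing by 255 maps the values to 0 (background) and 1 (foreground),
# which matches num_classes=2 (or 1 for a single-logit setup).
label = (mask // 255).astype(np.int64)

print(label.tolist())  # [[0, 1], [1, 0]]
```

Without this scaling, a foreground pixel is read as class index 255, which is why num_classes had to be pushed up to 256 to avoid the error.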

Heirudy commented 2 months ago

My question is solved, thank you very much for your patience.

Heirudy commented 2 months ago

Following your suggestion, I changed num_classes to 2 and set the divide parameter to True. Now the model's prediction pred has 2 color channels, but this causes information to be lost when the prediction is converted to an image for output. What should I do? [Screenshot 2024-06-27 161836]

Qsingle commented 2 months ago

You can use the following code to get the final mask:

```python
import torch

pred = model(x)
pred = torch.softmax(pred, dim=1)
# torch.max returns (values, indices); we only need the indices,
# which give the predicted class for each pixel.
pred = torch.max(pred, dim=1)[1]
```

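
If the remaining question is how to write that index mask out as an image, one common approach (a sketch, not code from this repo) is to scale the class indices back to the 0–255 range:

```python
import numpy as np

# `pred` here stands in for the (H, W) index mask produced by torch.max above;
# for a two-class problem its values are 0 or 1. A toy example:
pred = np.array([[0, 1], [1, 0]], dtype=np.int64)

# Scale the class indices to 0/255 so the mask is visible as a grayscale image.
# It can then be saved with e.g. PIL: Image.fromarray(img).save("mask.png")
img = (pred * 255).astype(np.uint8)

print(img.tolist())  # [[0, 255], [255, 0]]
```
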
Heirudy commented 2 months ago

My problem was successfully solved. This is the first deep learning program I have completed, and your patience and help have been invaluable. Words cannot fully express my gratitude and excitement. Thank you once again!