WZH0120 / SAM2-UNet

SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation
Apache License 2.0
86 stars · 9 forks

I don't get how we can run predictions? #9

Open GXcells opened 2 weeks ago

GXcells commented 2 weeks ago

I don't understand how to run segmentation with your scripts. How can the model know what to segment if we don't provide any image/segmentation pairs as examples? Do I first need to fine-tune/train on my dataset for each segmentation task (for example, a specific cell type on histology images, a specific immunohistochemistry staining, etc.) and then run "test"?

xiongxyowo commented 2 weeks ago

Hi, you need to train on your own datasets first because SAM2-UNet does not have zero-shot capability (the original prompt encoder and decoder of SAM2 are removed).

GXcells commented 2 weeks ago

Ok, I just trained a model on a dataset, but now if I want to run predictions, the only script available is test.py, and it has a required ground-truth argument. How can I run a prediction on images that are not yet segmented? Thanks in advance.

xiongxyowo commented 2 weeks ago

Hi, our test dataset loads ground truths to make it easier to align the predictions with the shapes of GTs (see here). You can make simple modifications to the code to make the testing process independent of ground truths.
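For anyone making the same modification, a GT-free prediction step could look roughly like the sketch below. The function name, the assumption that `model(image)` may return several side outputs, and the 0.5 threshold are illustrative guesses based on typical SAM2-UNet-style test scripts, not the repository's exact API — check test.py for which output is the final prediction.

```python
import torch
import torch.nn.functional as F

def predict_mask(model, image, orig_size, threshold=0.5):
    """Predict a binary mask without any ground truth.

    image:     (1, 3, H, W) tensor, already resized/normalized for the net
    orig_size: (height, width) of the original image file, used in place
               of the GT shape when up-sampling the prediction
    """
    model.eval()
    with torch.no_grad():
        out = model(image)
        # SAM2-UNet-style models may return several side outputs during
        # training; keep the main one (which index is "main" depends on
        # the repository's code).
        if isinstance(out, (tuple, list)):
            out = out[0]
        prob = torch.sigmoid(out)
        # Up-sample to the original image size instead of the GT size.
        prob = F.interpolate(prob, size=orig_size, mode="bilinear",
                             align_corners=False)
    return (prob > threshold).float()
```

The resulting mask can then be saved the same way the original test script saves its predictions (e.g. converting `pred.squeeze().cpu().numpy()` to an 8-bit image).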

GXcells commented 2 weeks ago

Ok, thanks a lot.

I modified it and it is working without ground truth.

But I am convinced it would be important for you to provide inference code that is independent of ground truth, and to re-run your benchmarks with it. In real use cases we generally never have ground truth at inference time (if we already had ground truth for the images we want to segment, why segment them?).

xiongxyowo commented 2 weeks ago

Hi, thank you for the suggestion. We follow the common practice of up-sampling the prediction results to the GTs' size (see test codes in PraNet and FEDER). Since metrics computed at low resolutions can differ from those at the original resolutions, we perform up-sampling to ensure a fair comparison with existing methods. For users who wish to eliminate this logical flaw, we recommend up-sampling the predictions to the input image resolution instead.
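Concretely, the change is just the target size passed to the up-sampling call. A minimal illustration of the two variants (the shapes and the 352×352 network input size are example values, not the repository's fixed settings):

```python
import torch
import torch.nn.functional as F

# Size of the original image, recorded before resizing for the network.
orig_h, orig_w = 375, 500

# Stand-in for the network's raw single-channel output at input resolution.
logits = torch.rand(1, 1, 352, 352)

# GT-based (benchmark setting): size taken from the GT array's shape.
# GT-free alternative: up-sample to the original image resolution instead.
pred = F.interpolate(logits, size=(orig_h, orig_w),
                     mode="bilinear", align_corners=False)
print(pred.shape)  # torch.Size([1, 1, 375, 500])
```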

xiongxyowo commented 2 weeks ago

Note: The resolution of some test images in public datasets may differ from that of the corresponding GTs, so this modification may produce anomalous scores in some cases.

GXcells commented 2 weeks ago

> Hi, thank you for the suggestion. We follow the common practice of up-sampling the prediction results to the GTs' size (see test codes in PraNet and FEDER). Since metrics computed at low resolutions can differ from those at the original resolutions, we perform up-sampling to ensure a fair comparison with existing methods. For users who wish to eliminate this logical flaw, we recommend up-sampling the predictions to the input image resolution instead.

Thanks for the explanations. I'm a wet-lab scientist and don't have a deep understanding of how UNet and other machine-learning models work, which is why my question was more about directly applying your training code to the data we have in the lab.