dimension limit - Githubissues

Thanks for reaching out!

The model was trained on images resized to 128x128, so it performs best at that resolution. I haven't tried 512x512, but in the paper we evaluated one of the manual scribble datasets (ACDC) and ran the user study with images at 256x256 resolution, and the models still performed well.

I have noticed the CNN version of our model (ScribblePrompt-UNet) is more robust to changes in resolution than the SAM architecture version (ScribblePrompt-SAM). With 256x256 images, ScribblePrompt-UNet works best if you downsize the image to 128x128 for inference and then upsample the prediction to 256x256, as opposed to running inference on the 256x256 image directly.
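As a rough sketch of the downsize-then-upsample approach for ScribblePrompt-UNet: the code below is not taken from the ScribblePrompt repository. `predict_fn` is a hypothetical stand-in for whatever call runs the model on a 128x128 image, and the nearest-neighbor resize is written in plain NumPy only to keep the example self-contained. In practice a bilinear resize (for example `torch.nn.functional.interpolate`) would be the more usual choice for the image.

```python
import numpy as np


def resize_nearest(arr, out_h, out_w):
    """Nearest-neighbor resize of a 2D array to (out_h, out_w)."""
    h, w = arr.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return arr[np.ix_(rows, cols)]


def predict_via_downsize(image, predict_fn, model_size=128):
    """Run predict_fn at model_size x model_size, then upsample the prediction.

    predict_fn is a hypothetical callable (assumption, not the real API) that
    maps a 2D array of shape (model_size, model_size) to a 2D prediction of
    the same shape.
    """
    h, w = image.shape
    small = resize_nearest(image, model_size, model_size)  # e.g. 256x256 -> 128x128
    small_pred = predict_fn(small)                          # inference at 128x128
    return resize_nearest(small_pred, h, w)                 # back to the input size


# Usage with a dummy "model" that just thresholds the image.
image = np.random.rand(256, 256).astype(np.float32)
prediction = predict_via_downsize(image, lambda x: (x > 0.5).astype(np.float32))
print(prediction.shape)
```

The key point is only the order of operations: resize the image down, predict, then resize the prediction up, rather than feeding the 256x256 image to the model directly.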

For ScribblePrompt-SAM, it's better to input the 256x256 image without downsizing, because the inference code upsamples the input image to 1024x1024 for the SAM encoder. The SAM decoder outputs predictions at 256x256 resolution, and these are then resized to the input image size.
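To make the ScribblePrompt-SAM resolution flow concrete, here is a small illustrative helper that only traces the shapes described above. It does not call the model, and the function name and stage labels are made up for this example.

```python
def sam_resolution_trace(input_hw, encoder_size=1024, decoder_size=256):
    """List the (height, width) at each stage of the flow described above."""
    input_hw = tuple(input_hw)
    return [
        ("input image", input_hw),
        ("encoder input (upsampled)", (encoder_size, encoder_size)),
        ("decoder output", (decoder_size, decoder_size)),
        ("final prediction (resized to input)", input_hw),
    ]


for stage, shape in sam_resolution_trace((256, 256)):
    print(f"{stage}: {shape[0]}x{shape[1]}")
```

With a 256x256 input, the decoder output already matches the input size, so the last resize does not change the resolution.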

If you have some original-resolution CT/MR images handy, you could try them in our Hugging Face demo. The demo app automatically resizes the image to 128x128 for inference. There's also code for the app here if you would prefer to run the demo locally.

halleewong / ScribblePrompt

dimension limit #4