shaoyanpan / 2D-Medical-Denoising-Diffusion-Probabilistic-Model-

This is the repository for the paper "2D Medical Image Synthesis Using Transformer-based Denoising Diffusion Probabilistic Model".
https://iopscience.iop.org/article/10.1088/1361-6560/acca5c
MIT License

Image Size #5

Open josefapv opened 9 months ago

josefapv commented 9 months ago

Hi,

I am feeding images (216, 256) into the model, and I want it to provide me with synthetic images of the same size. How can I achieve this?

shaoyanpan commented 9 months ago

Hi:

There is a line of code "ResizeWithPadOrCropd( keys=["image"], spatial_size=(256,256), constant_values = -1, )" already included in the notebook. It will automatically pad or crop the boundary of the input image so the size becomes (256, 256).
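
In case it helps, here is a minimal sketch of that transform in a small MONAI pipeline (the `Compose` wrapper and the array shape are only illustrative, not copied from the notebook, and it assumes a MONAI version compatible with the one used there):

```python
import numpy as np
from monai.transforms import Compose, ScaleIntensityd, ResizeWithPadOrCropd

# Pad (or crop) the spatial dimensions to (256, 256); the border is filled with -1,
# matching the intensity range produced by ScaleIntensityd(minv=-1, maxv=1).
transforms = Compose([
    ScaleIntensityd(keys=["image"], minv=-1.0, maxv=1.0),
    ResizeWithPadOrCropd(keys=["image"], spatial_size=(256, 256), constant_values=-1),
])

sample = {"image": np.random.rand(1, 216, 256).astype(np.float32)}  # channel-first (C, H, W)
out = transforms(sample)
print(out["image"].shape)  # (1, 256, 256): the 216 axis is padded up to 256, the 256 axis is untouched
```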

I haven't had a chance to take a look at the MRI yet. Sorry about that.

josefapv commented 9 months ago

Don’t worry about that, it’s not necessary anymore. Regarding the image size, I think you misunderstood me: the input image is (216, 256), and I want it to stay that size, with the model outputting an image of the same dimensions.

shaoyanpan commented 9 months ago

Well, because this code pads your input images symmetrically, personally I would also crop the output image symmetrically, so the final image remains (216, 256).
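
For example, a minimal sketch (my own, not code from the repository) of cropping the sampled (256, 256) output symmetrically back to (216, 256):

```python
import torch

def center_crop(x: torch.Tensor, target_h: int, target_w: int) -> torch.Tensor:
    """Symmetrically crop the last two dims of an (N, C, H, W) tensor."""
    _, _, h, w = x.shape
    top = (h - target_h) // 2
    left = (w - target_w) // 2
    return x[..., top:top + target_h, left:left + target_w]

samples = torch.randn(4, 1, 256, 256)   # stand-in for the output of diffusion.p_sample_loop
cropped = center_crop(samples, 216, 256)
print(cropped.shape)                     # torch.Size([4, 1, 216, 256])
```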

josefapv commented 9 months ago

In addition to the line you suggested, I changed all the image size parameters to (216, 256) throughout the code, as shown below.

model = SwinVITModel( image_size=(216,image_size),

with torch.no_grad():
    x_clean = diffusion.p_sample_loop(model, (num_sample, 1, 216, image_size), clip_denoised=True)

image_size = 256
img_size = (216, image_size)

ScaleIntensityd(keys=["image"], minv=-1, maxv=1.0),
ResizeWithPadOrCropd( keys=["image"], spatial_size=(216,256),

However, now I am encountering the following error; can you please help me solve it? [error screenshot attached]

shaoyanpan commented 9 months ago

I guess it is because 216 cannot be divided by the required power of 2? We downsample by 2 at each layer, and I remember there are 4 layers, so the spatial size needs to be divisible by 2**4 = 16.
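
If that is the cause, here is a quick way to check and work around it (my own sketch, assuming 4 downsampling stages, i.e. spatial sizes must be divisible by 2**4 = 16):

```python
def nearest_valid(size: int, levels: int = 4) -> int:
    """Round size up to the nearest multiple of 2**levels."""
    factor = 2 ** levels
    return ((size + factor - 1) // factor) * factor

print(216 % 16)            # 8   -> 216 is not divisible by 16, hence the shape mismatch
print(nearest_valid(216))  # 224 -> pad/crop to (224, 256) for the model, then crop back to 216 afterwards
```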