shaoyanpan / Synthetic-CT-generation-from-MRI-using-3D-transformer-based-denoising-diffusion-model

This is the repository for the paper published in Medical Physics: "Synthetic CT generation from MRI using 3D transformer-based denoising diffusion model".
https://aapm.onlinelibrary.wiley.com/doi/abs/10.1002/mp.16847
MIT License

Grid-like pattern #4

Open ChristopherSalomons opened 6 days ago

ChristopherSalomons commented 6 days ago

Hi Shaoyan, really appreciate your work here. When trying out the model, all of my predicted images exhibit a grid-like pattern matching the inference sliding window. I'm curious if you also saw this during your own training.

[image: predicted volume showing a grid-like pattern aligned with the inference sliding window]

I've tested using 20 training images for 500 epochs, as well as 144 training images for 180 epochs, and the pattern appeared for both.

shaoyanpan commented 6 days ago

Here are three options for improving the grids:

  1. In the Jupyter notebook, under the "Build the testing function" section, the inference is currently set up as:

     from monai.inferers import SlidingWindowInferer
     img_num = 12
     overlap = 0.5
     inferer = SlidingWindowInferer(img_size, img_num, overlap=overlap, mode='constant')

     Change the last line to:

     inferer = SlidingWindowInferer(img_size, img_num, overlap=0.75, mode='gaussian')

  2. The network might not be fully optimized. Try training for more epochs. In my experience, these patterns diminish significantly with longer training.

  3. The grids appear only in the x-y plane, not in the z-plane, for the MRI-CT task. You can increase the patch size in the x and y dimensions and reduce it in the z dimension.
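To build intuition for why switching to mode='gaussian' helps, here is a toy 1D numpy sketch (not the repository's code; the `blend` helper and the per-patch bias are illustrative assumptions). Each simulated window prediction carries a small per-window bias, mimicking the slight shifts a model produces between patches. Constant weighting leaves hard steps at window seams, while Gaussian weighting down-weights patch borders and spreads each seam across the overlap:

```python
import numpy as np

def blend(signal_len=32, window=8, stride=2, mode="constant"):
    """Toy 1D sliding-window blending with per-position weight normalization."""
    if mode == "gaussian":
        # Bell-shaped window weights, sigma ~ window/8 (MONAI uses a similar default)
        x = np.linspace(-1.0, 1.0, window)
        weight = np.exp(-0.5 * (x / 0.25) ** 2)
    else:
        weight = np.ones(window)
    out = np.zeros(signal_len)
    norm = np.zeros(signal_len)
    for k, start in enumerate(range(0, signal_len - window + 1, stride)):
        patch_pred = np.full(window, float(k))  # per-patch bias = the seam source
        out[start:start + window] += patch_pred * weight
        norm[start:start + window] += weight
    return out / norm

# Largest step between neighbouring samples = visibility of the seam artifact
const_max_jump = np.abs(np.diff(blend(mode="constant"))).max()
gauss_max_jump = np.abs(np.diff(blend(mode="gaussian"))).max()
print(const_max_jump, gauss_max_jump)  # the Gaussian seams are clearly smaller
```

The same mechanism operates per-voxel in 3D: combined with a larger overlap, Gaussian blending turns the hard window boundaries into gradual transitions.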

ChristopherSalomons commented 3 days ago

Thanks, really appreciate the help. I am curious about the reasoning behind the patch-based approach, does using smaller patches/windows provide benefits besides faster training?

shaoyanpan commented 3 days ago

Actually, the patch-based approach is not desirable: it makes performance worse and slows both training and generation. The best option is to fit the whole volume (e.g., 256x256x128) for training.

However, our GPU does not have enough memory to accept the whole volume, so the patch-based approach is necessary: we crop the data into patches (64x64x4) the GPU can hold, and then train. Depending on your GPU, make the patches as large as possible while maintaining a reasonable batch size. For example, if your GPU is large enough to accept a 16x128x128x4 batch, you can increase the patch size to 256x256x4 and reduce the batch size to 4. That is still good, and then you can start training.
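A back-of-the-envelope check of the trade-off described above, assuming activation memory scales roughly with the number of voxels processed per step (`voxels_per_step` is a hypothetical helper, not part of the repository):

```python
def voxels_per_step(batch, h, w, d):
    # Rough proxy for activation memory: voxels processed in one training step.
    return batch * h * w * d

small_patches = voxels_per_step(16, 128, 128, 4)  # batch 16 of 128x128x4 patches
large_patches = voxels_per_step(4, 256, 256, 4)   # batch 4 of 256x256x4 patches
print(small_patches, large_patches)  # both 1048576: the same rough memory budget
```

Quadrupling the in-plane patch area while quartering the batch keeps the voxel count per step constant, which is why the two configurations fit in roughly the same GPU memory.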

ChristopherSalomons commented 2 days ago

I see, thanks for the insight.

shsargordi commented 1 day ago

Hi Christopher,

May I ask what your dataset is? Is it a public dataset? I couldn't get good results with my dataset, and I am wondering about your results on your data.

ChristopherSalomons commented 20 hours ago

I am using the SynthRAD2023 dataset. My most recent results are not as good as those reported in Shaoyan's paper, but they are reasonable (~82 MAE).