shaoyanpan / Synthetic-CT-generation-from-MRI-using-3D-transformer-based-denoising-diffusion-model

This is the repository for the paper published in Medical Physics: "Synthetic CT generation from MRI using 3D transformer-based denoising diffusion model".
https://aapm.onlinelibrary.wiley.com/doi/abs/10.1002/mp.16847

Weird output #2

Open · GritYu opened this issue 2 months ago

GritYu commented 2 months ago

Hi Shaoyan, this is really amazing work. When I try your code for non-conditional tasks, my output looks super weird. Have you ever seen this before? I would appreciate any suggestion you could give!

[image: sample of the weird output]

shaoyanpan commented 2 months ago

To be honest, I don't know whether the diffusion model can perform non-conditional image generation. I think 3D generation with diffusion models is still a big challenge (even though there is one Scientific Reports paper that claims a 3D model for this task: https://www.nature.com/articles/s41598-023-34341-2). For example, to the best of my knowledge, really high-quality 3D video generation has only been achieved by OpenAI's Sora.
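For anyone attempting the same experiment, here is a minimal sketch of what unconditional 3D sampling could look like. It assumes an improved-diffusion-style API (`p_sample_loop`, `model_kwargs`), which is an assumption based on the `GaussianDiffusion.py` naming, not this repository's confirmed interface:

```python
# Hedged sketch: unconditional 3D sampling with an improved-diffusion-style
# diffusion object. `p_sample_loop` and `model_kwargs` are assumed names,
# not confirmed parts of this repository's API.
import torch

@torch.no_grad()
def sample_unconditional(diffusion, model, batch=1, depth=32, size=64):
    # With no MRI conditioning volume concatenated to the input, the network
    # must have been built with in_channels=1 (noise only) and trained that way.
    shape = (batch, 1, depth, size, size)
    return diffusion.p_sample_loop(model, shape, model_kwargs={})
```

The key point is that an unconditional run is not just a sampling-time switch: if the network was trained with a concatenated condition channel, it cannot simply be sampled without one.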

GritYu commented 2 months ago

Thank you so much for your reply!

GritYu commented 2 months ago

@shaoyanpan Hi Shaoyan, I just tried changing it to a conditional task like yours, but I'm working on 256×256 images. However, I still get the same kind of weird output (I tried both the CNN version and the Swin version; both produce it). I changed some hyper-parameters to:
num_channels = 64
channel_mult = (1, 1, 2, 2, 4, 4)
attention_resolutions = "64,32,16,8"
num_heads = [16, 16, 32, 32, 32, 32]
window_size = [[4,4,4], [4,4,4], [4,4,4], [4,4,4], [4,4,4], [4,4,4]]
num_res_blocks = [2, 2, 1, 1, 1, 1]
sample_kernel = ([2,2,2], [2,2,2], [2,2,2], [2,2,2], [2,2,2], [2,2,2])

Could you give me any advice about this? Thank you so much! [image: weird output at 256×256]
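As a quick arithmetic check on these settings (a sketch, assuming each `sample_kernel` entry of `[2,2,2]` halves the resolution between consecutive `channel_mult` levels), the in-plane feature-map sizes for a 256×256 input would be 256 → 128 → 64 → 32 → 16 → 8:

```python
# Assumption: one factor-2 downsample between consecutive channel_mult levels.
size = 256
for level, mult in enumerate((1, 1, 2, 2, 4, 4)):
    window = 4  # in-plane window size from window_size above
    print(f"level {level}: {size}x{size} feature map, "
          f"{64 * mult} channels, window divides evenly: {size % window == 0}")
    size //= 2  # halved by the [2,2,2] sample_kernel before the next level
```

Under these assumptions the `[4,4,4]` windows divide every level evenly, and if `attention_resolutions` is interpreted the usual improved-diffusion way, attention at 64, 32, 16, and 8 covers the four deepest levels, so the configuration at least appears internally consistent.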

shaoyanpan commented 2 months ago

Hi:

First, I want to make sure: is this 2D non-conditional image generation? Second, could you show me your training loss curve? For example, what is the loss at your current epoch, and how many epochs have you trained?
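For reference, the kind of curve being asked for can be produced with a few lines of matplotlib (a sketch; `step_losses` is a hypothetical list you would populate during training):

```python
import matplotlib.pyplot as plt

step_losses = []  # append float(loss.item()) after every optimizer step

plt.plot(step_losses)
plt.xlabel("training step")
plt.ylabel("training loss")
plt.yscale("log")  # diffusion losses fall fast early on; log scale shows the tail
plt.savefig("loss_curve.png")
```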

GritYu commented 2 months ago


This is 3D conditional image generation. Batch size = 2, epochs = 30, steps = 40,000, and the loss is 0.0194841.

GritYu commented 2 months ago

I plotted the output from training_losses() in GaussianDiffusion.py, and it shows the same weird image. I plotted the same output from your 2D version of the model, and it looks normal and pretty. [image: 3D vs. 2D training output comparison]

shaoyanpan commented 2 months ago

One potential reason for the issue you're experiencing could be an insufficient number of epochs in your training regimen. For instance, if your dataset contains 1000 3D volumes, the network needs to see each volume enough times to cover the diffusion process across all 1000 timesteps. To achieve comprehensive training, you might need on the order of 1000 × 1000 = 1,000,000 iterations (1000 epochs), purely as an example. While you typically won't need that many, I believe a minimum of 200 epochs is essential. Additionally, I recommend starting with simpler models to establish a baseline (in any case, this model was not designed for non-conditional generation). Begin by employing a CNN to generate 2D slices. If that succeeds, it confirms that the diffusion process is functioning correctly. Subsequently, you can progress to a ViT for the next phase, generating 3D volumes. This methodical approach will help streamline the development of your model.
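To make the suggested 2D sanity check concrete, here is a minimal sketch in plain PyTorch (not this repository's code): a toy noise-prediction network trained with the standard DDPM epsilon-prediction objective. `TinyEpsNet`, `train_step`, and the schedule constants are all illustrative. If the loss drops steadily and sampled 2D slices look plausible, the diffusion process itself is working:

```python
# Minimal 2D sanity-check sketch: train a small CNN to predict the noise
# added at a random timestep (the standard DDPM epsilon-prediction loss).
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

class TinyEpsNet(nn.Module):
    """Toy noise predictor; replace with a real 2D U-Net for actual experiments."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, x, t):
        return self.net(x)  # a real model would also embed the timestep t

model = TinyEpsNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(x0):
    t = torch.randint(0, T, (x0.shape[0],))
    noise = torch.randn_like(x0)
    a = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise  # forward diffusion q(x_t | x_0)
    loss = F.mse_loss(model(x_t, t), noise)       # epsilon-prediction objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage: feed batches of normalized 2D slices, e.g.
# train_step(torch.randn(2, 1, 64, 64))
```

Once this baseline trains cleanly on 2D slices, the same loop structure carries over to the 3D conditional setting, which isolates whether the remaining problem lies in the 3D architecture or the conditioning.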

GritYu commented 2 months ago


Thank you so much for your advice, Shaoyan. I'll try more experiments, as you suggested. Have a great weekend.