cwmok / Fast-Symmetric-Diffeomorphic-Image-Registration-with-Convolutional-Neural-Networks

Fast Symmetric Diffeomorphic Image Registration with Convolutional Neural Networks
MIT License

Questions about the loss #7

Closed · 1164094277 closed this issue 3 years ago

1164094277 commented 3 years ago

After consulting you last time, I made some attempts. First I resampled the data, changing the image size to (80, 96, 112), and used it for training. The loss was normal at the beginning of training, but after 20,000-30,000 iterations the last three terms were all zero. I also tried adjusting λ1, λ2, and λ3 to (1000, 3, 0.1) or (100, 3, 0.001), but the results were not good either, and after more than 50,000 iterations the loss became 20. Later I tried using pre-normalized data instead of the normalization in the code, but the results were the same. In addition, I also tried the image_A and image_B you provided, but loss4 was still 0 most of the time. I have sent some of the data to you by email.

1164094277 commented 3 years ago

Another question: in the paper you give both Lmag and Lreg using the L2 norm, but Lmag uses the L1 norm in your code?

cwmok commented 3 years ago

Hi @1164094277,

> After consulting you last time, I made some attempts. First I resampled the data, changing the image size to (80, 96, 112), and used it for training. The loss was normal at the beginning of training, but after 20,000-30,000 iterations the last three terms were all zero. I also tried adjusting λ1, λ2, and λ3 to (1000, 3, 0.1) or (100, 3, 0.001), but the results were not good either, and after more than 50,000 iterations the loss became 20. Later I tried using pre-normalized data instead of the normalization in the code, but the results were the same. In addition, I also tried the image_A and image_B you provided, but loss4 was still 0 most of the time. I have sent some of the data to you by email.

This is weird. I managed to train on your data last time. Could you please zip the exact code you are using (with the built-in normalization enabled and without the dataset) and send it to my email (cwmokab 'at' connect.ust.hk)? Also, could you specify which files you included in training (e.g., orig.nii.gz / norm.nii.gz / aligned_orig.nii.gz / aligned_norm.nii.gz)? Let's break it down into details so that we can figure out where the problem is.

> Another question: in the paper you give both Lmag and Lreg using the L2 norm, but Lmag uses the L1 norm in your code?

You are correct. I have updated the manuscript on arXiv; L_mag should use the L1 norm. Thanks for pointing out the error.
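For readers following along, here is a minimal numpy sketch of the L1-vs-L2 distinction being discussed, applied to a generic field `v`. This is illustrative only and is not the exact L_mag from the repository; see the loss implementation in the code for the real definition.

```python
import numpy as np

# Illustrative only: a random field standing in for a network output.
v = np.random.randn(3, 80, 96, 112).astype(np.float32)

l1_penalty = np.mean(np.abs(v))  # L1-style penalty (mean absolute value)
l2_penalty = np.mean(v ** 2)     # L2-style penalty (mean squared value)
print(l1_penalty, l2_penalty)
```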

cwmok commented 3 years ago

Hi @1164094277,

I think you might have fallen into a common pitfall which makes the training fail. When you downsample the data, you must guarantee that the background intensity of your input scan is 0 both before and after the downsampling. One possible situation is that img_A has a background intensity equal to zero before downsampling. However, if you downsample img_A with spline interpolation, e.g. scipy.ndimage.zoom(img_A, zoom_factor) with the default order=3, the background intensity will no longer be exactly zero.

Here is an example of img_A = zoom(img_A, (1, 0.5, 0.5, 0.5)): [Screenshot from 2021-08-05 19-53-37] The background intensity, shown in the top-right corner, is equal to -1.21e-17.

Here is a correct example of img_A = zoom(img_A, (1, 0.5, 0.5, 0.5), order=0): [Screenshot from 2021-08-05 19-54-23] Again, the background intensity, shown in the top-right corner, is equal to 0.
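A minimal, self-contained sketch of this check (the array shape and zoom factors are illustrative):

```python
import numpy as np
from scipy.ndimage import zoom

# Toy volume: exact-zero background with a bright cube inside,
# mimicking a skull-stripped brain scan (illustrative only).
img_A = np.zeros((1, 160, 192, 224), dtype=np.float32)
img_A[0, 60:100, 70:120, 80:140] = 1.0

# Default cubic spline interpolation (order=3) can overshoot near
# sharp edges, so the downsampled background is no longer exactly 0.
down_spline = zoom(img_A, (1, 0.5, 0.5, 0.5))
print(down_spline.min())  # typically a tiny non-zero value

# Nearest-neighbour interpolation (order=0) preserves the exact zeros.
down_nn = zoom(img_A, (1, 0.5, 0.5, 0.5), order=0)
print(down_nn.min())      # exactly 0.0
```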

Please let me know whether it solves your problem. If the problem persists, feel free to zip the problematic code and send it to my email. No problem can withstand the assault of sustained thinking.

1164094277 commented 3 years ago

@cwmok First I unzipped aligned_norm.nii.gz in each folder, then used numpy to sample and normalize the volumes. When I found the loss had become 20, I paused the training and checked the data. The data you provided were in float64 format, so I changed my data to float64 as well (the original data type was float32); after that the loss no longer became 20, but the last three terms still became 0. You can download 1.zip to view my data, the preprocessing code, and the training loss in both data formats. 1.zip [Screenshot: training loss]
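For reference, a hedged sketch of the dtype cast described above, using nibabel (the file name follows the OASIS naming; the variable names are illustrative):

```python
import numpy as np
import nibabel as nib

# Load one OASIS volume and cast it to float64 before training,
# matching the dtype of the provided example data.
img = nib.load("aligned_norm.nii.gz")
data = np.asarray(img.dataobj, dtype=np.float64)
print(data.dtype, data.shape)
```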

1164094277 commented 3 years ago

@cwmok I have just seen your latest reply, and I will try resampling my data your way.

cwmok commented 3 years ago

@1164094277 I don't see any problem in your preprocessing. To make things straightforward, here is the SYMNet code (SYMNet_OASIS.zip) tailor-made for the preprocessed OASIS data from https://github.com/adalca/medical-datasets/blob/master/neurite-oasis.md.

Follow these steps to reproduce the toy experiment I ran on my computer.

  1. Unzip SYMNet_OASIS.zip and download a copy of the OASIS dataset from https://github.com/adalca/medical-datasets/blob/master/neurite-oasis.md (and unzip it as well).
  2. Go into SYMNet_OASIS/Data and create a folder "OASIS_Adalca".
  3. Put the unzipped dataset (neurite-oasis.v.1.0) inside "OASIS_Adalca".
  4. Run the training script: python Train_sym_onepass.py

You will observe that the total loss drops toward -1 and that -smo does not equal 0.

[Screenshot: training log]

Hopefully, you can get the same training results as mine.

Edit (more results): [Screenshot: additional training log]

1164094277 commented 3 years ago

@cwmok My problem occurs after 20,000-30,000 training iterations; the loss is normal during the first hundreds or thousands of iterations. I will try your method. Thank you very much for your help!

cwmok commented 3 years ago

Hi @1164094277,

Here is the screenshot at the end of the training, using the exact code I gave you. You can see that the model does not collapse even at 160,000 iterations. [Screenshot 2021-08-06 145055]

1164094277 commented 3 years ago

@cwmok Thank you very much. I've already started training the model, and the loss looks better than before. Thank you again for your help! [Screenshot: training log] I remember another question, about the Jacobian determinant: some implementations add +1, and I want to know which one is right. [Screenshot: code] This is the code from the paper "FAIM – A ConvNet Method for Unsupervised 3D Medical Image Registration" (https://github.com/dykuang/Medical-image-registration, loss.py). I also found a rationale for the +1 (https://itk.org/Doxygen/html/classitk_1_1DisplacementFieldJacobianDeterminantFilter.html).

cwmok commented 3 years ago

@1164094277 In short, both implementations are correct. It is all about the addition (or not) of the identity transformation. In our implementation, you can see that we added the identity transformation (J = y_pred + sample_grid) to the output displacement before calculating the Jacobian determinant. While in the +1 version, instead of adding the identity transformation to the output, they add +1 to the diagonal elements of the Jacobian, which is equivalent.
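To make the equivalence concrete, here is a minimal numpy sketch on a 2D toy field, using finite differences via np.gradient. The variable names are illustrative and this is not the repository's API.

```python
import numpy as np

# Toy 2D displacement field u(y, x) on a small grid (illustrative only).
H, W = 8, 8
ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
u = np.stack([0.05 * np.sin(xs / 3.0), 0.05 * np.cos(ys / 3.0)], axis=-1)

# Identity grid, playing the role of sample_grid in the explanation above.
grid = np.stack([ys, xs], axis=-1).astype(np.float64)

def jac_det(field):
    """Jacobian determinant of a 2D vector field via finite differences."""
    dy = np.gradient(field, axis=0)  # derivatives of both components w.r.t. y
    dx = np.gradient(field, axis=1)  # derivatives of both components w.r.t. x
    return dy[..., 0] * dx[..., 1] - dy[..., 1] * dx[..., 0]

# Version 1: add the identity transform first, then differentiate
# (the J = y_pred + sample_grid approach).
det_a = jac_det(grid + u)

# Version 2: differentiate the displacement alone and add 1 to the
# diagonal elements of the Jacobian afterwards (the +1 approach).
du_dy = np.gradient(u, axis=0)
du_dx = np.gradient(u, axis=1)
det_b = (1 + du_dy[..., 0]) * (1 + du_dx[..., 1]) - du_dy[..., 1] * du_dx[..., 0]

print(np.allclose(det_a, det_b))  # True: both formulations agree
```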

1164094277 commented 3 years ago

@cwmok I see, thank you.