Ryo-Ito / brain_segmentation

Implementation of VoxResNet for 3D brain segmentation
MIT License

Do we need pre-processing (such as skull removal) before training? #2

Closed John1231983 closed 7 years ago

John1231983 commented 7 years ago

Hello Ryo-Ito,

I have downloaded the MRBrainS13DataNii dataset. It has two folders, Testing and Training, of which only Training has ground truth. The Training folder contains 5 subjects, named by id from 1 to 5. Taking the folder with id 1 as an example, it contains T1.nii, T1_1mm.nii, T1_IR.nii, T2_FLAIR.nii, LabelsForTesting.nii, and LabelsForTraining.nii, where the T*.nii files are raw images and the Labels*.nii files are label images. I have some questions about it:

  1. The raw images contain the skull region, while the label images do not. Do we need to remove the skull before training?
  2. Could you show the details of dataset_train.csv? I guess it contains path_raw_file path_ground_truth_file.
  3. Which label do you use as ground truth for training: LabelsForTesting.nii or LabelsForTraining.nii?

Thank you so much

Ryo-Ito commented 7 years ago
  1. In fact, I did not test my implementation on the MICCAI MRBrainS challenge data used in the original paper. I tested it on IBSR (Internet Brain Segmentation Repository), which contains 18 skull-stripped T1 brain images and their corresponding label images. Since it doesn't have other modalities, I did not test Auto-context VoxResNet.

  2. dataset_train.csv contains the following columns:

    • "preprocessed": paths to preprocessed brain images.
    • "segTRI": paths to label images.
    • "mask": paths to mask images indicating brain regions.
  3. Following the original paper, I used 5 brain images for training and randomly sampled training patches of size 80x80x80 (a sketch of this patch sampling is below). Though I used a different dataset, I obtained Dice coefficients similar to those shown in the first row of Table 1 in the paper.
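
For reference, random patch sampling along these lines can be written in a few lines of numpy. This is a minimal sketch, not the repository's load.sample; the function name and signature are hypothetical.

import numpy as np

def sample_patch(scalar_img, label_img, shape=(80, 80, 80)):
    # scalar_img: (xlen, ylen, zlen, n_channels), label_img: (xlen, ylen, zlen)
    xlen, ylen, zlen = shape
    # pick a random corner so the whole patch fits inside the volume
    x = np.random.randint(0, scalar_img.shape[0] - xlen + 1)
    y = np.random.randint(0, scalar_img.shape[1] - ylen + 1)
    z = np.random.randint(0, scalar_img.shape[2] - zlen + 1)
    scalar_patch = scalar_img[x:x + xlen, y:y + ylen, z:z + zlen]
    label_patch = label_img[x:x + xlen, y:y + ylen, z:z + zlen]
    return scalar_patch, label_patch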

John1231983 commented 7 years ago

Thank you for the information. I downloaded the IBSR_V2.0 skull-stripped NIfTI dataset and placed it in /media/john/Study/databaseSeg/IBSR_nifti_stripped. Could you share your dataset_train.csv? I will modify my paths based on it.

Ryo-Ito commented 7 years ago

my dataset_train.csv looks like the table below:

index  segTRI                           seg                           mask                           preprocessed
01     IBSR_01_segTRI_isotropic.nii.gz  IBSR_01_seg_isotropic.nii.gz  IBSR_01_mask_isotropic.nii.gz  IBSR_01_preprocessed.nii.gz

John1231983 commented 7 years ago

Thanks. I downloaded the dataset named IBSR_V2.0 skull-stripped NIfTI from https://www.nitrc.org/frs/?group_id=48. It does not have isotropic files. Did you use some software to pre-process it? The files in one folder of the dataset are:

IBSR_10_ana_brainmask.nii.gz  IBSR_10_seg_ana.nii.gz
IBSR_10_ana.nii.gz            IBSR_10_segTRI_ana.nii.gz
IBSR_10_ana_strip.nii.gz      IBSR_10_segTRI_fill_ana.nii.gz


Ryo-Ito commented 7 years ago

Yes, I resliced the brain images by myself to make them isotropic.

John1231983 commented 7 years ago

I am using the MIPAV tool and I did not reslice the data; I just used the original images. When I ran training, I got this error:

Namespace(display_step=1000, gpu=-1, input_file='dataset_train.csv', iteration=100000, learning_rate=0.001, n_batch=1, out='vrn.npz', shape=[80, 80, 80], weight_decay=0.0005)
Traceback (most recent call last):
  File "train.py", line 82, in <module>
    main()
  File "train.py", line 62, in main
    scalar_img, label_img = load.sample(train_df, args.n_batch, args.shape)
  File "/home/john/brain_segmentation/load.py", line 66, in sample
    scalar_patch = scalar_patch.transpose(3, 0, 1, 2)
ValueError: axes don't match array

I printed the shape of scalar_patch and it is (80, 80, 80). My Python is 2.7.6. Do you think the issue is because I did not reslice? If possible, could you share the script or tool you used to reslice the IBSR images?

Ryo-Ito commented 7 years ago

It has nothing to do with reslicing. The program expects an input image with shape (xlen, ylen, zlen, n_channels), e.g. (80, 80, 80, 2). The original paper concatenated the original brain image and the CLAHE-processed brain image, so the input shape becomes (xlen, ylen, zlen, 2).
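
For example, stacking the two volumes along a trailing channel axis could look like the sketch below (a minimal example assuming nibabel; the file names, including the "_ahe" suffix for the enhanced image, are illustrative):

import nibabel as nib
import numpy as np

raw = nib.load("IBSR_01_ana_strip.nii.gz").get_data().astype(np.float32)
enhanced = nib.load("IBSR_01_ana_strip_ahe.nii.gz").get_data().astype(np.float32)  # CLAHE/AHE output
scalar_img = np.stack([raw, enhanced], axis=-1)  # shape (xlen, ylen, zlen, 2)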

John1231983 commented 7 years ago

Is it 6 or 2? The paper mentions m=6. Currently, I am using MIPAV to concatenate the processed data and the original data into a new raw volume, called IBSR_*_ana_strip_preprocessed.nii. I also printed the shapes in load.py as follows:

('scalar_img', (256, 128, 256, 2))
('label_img', (256, 128, 256))
('mask_img', (256, 128, 256))
('scalar_patch', (80, 80, 80, 2))
('label_patch', (80, 80, 80))
...
('x_train',(1, 2, 80, 80, 80))
('y_train',(1, 80, 80, 80))

Note that I only concatenated the raw data (not the labels). I ran it and got the issue below. Have you seen it before? I installed chainer with sudo CUDA_PATH=/usr/local/cuda pip install chainer. Sorry for the inconvenience.

Traceback (most recent call last):
  File "/home/j/PycharmProjects/brain_segmentation/train.py", line 82, in <module>
    main()
  File "/home/j/PycharmProjects/brain_segmentation/train.py", line 66, in main
    outputs = vrn(x_train, train=True)
  File "/home/j/PycharmProjects/brain_segmentation/model.py", line 76, in __call__
    h = self.conv1a(x)
  File "/usr/local/lib/python2.7/dist-packages/chainer/links/connection/convolution_nd.py", line 83, in __call__
    use_cudnn=self.use_cudnn, cover_all=self.cover_all)
  File "/usr/local/lib/python2.7/dist-packages/chainer/functions/connection/convolution_nd.py", line 350, in convolution_nd
    return func(x, W)
  File "/usr/local/lib/python2.7/dist-packages/chainer/function.py", line 189, in __call__
    self._check_data_type_forward(in_data)
  File "/usr/local/lib/python2.7/dist-packages/chainer/function.py", line 280, in _check_data_type_forward
    type_check.InvalidType(e.expect, e.actual, msg=msg), None)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 718, in raise_from
    raise value
chainer.utils.type_check.InvalidType: 
Invalid operation is performed in: ConvolutionND (Forward)

Expect: in_types[0].dtype.kind == f
Actual: i != f
Ryo-Ito commented 7 years ago

Though I've never seen that error message before, it seems your image data type is not float. Have you confirmed it? Chainer expects the input data dtype to be float32.
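
A quick way to check and cast the dtype with nibabel (the file names here are just examples):

import nibabel as nib
import numpy as np

img = nib.load("IBSR_01_ana_strip_preprocessed.nii.gz")
data = img.get_data()
print(data.dtype)  # e.g. int16 ("short") would trigger the error above
nib.save(nib.Nifti1Image(data.astype(np.float32), img.affine), "IBSR_01_float32.nii.gz")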

John1231983 commented 7 years ago

Thank you so much. It was a data type issue. When I used MIPAV to concatenate the processed image with the original image, the output type defaulted to Short. I converted it to Float and it worked well. The only remaining issue is CPU vs. GPU mode: I ran in CPU mode but did not have enough memory, and when I switched to GPU I got an error about the CUDA path. I will fix it and let you know.

RuntimeError: CUDA environment is not correctly set up
(see https://github.com/pfnet/chainer#installation). CuPy is not correctly installed. Please check your environment.
John1231983 commented 7 years ago

Finally, following your advice, I can run your code successfully on GPU, with output as below:

Namespace(display_step=1000, gpu=0, in_channels=2, input_file='dataset_train.csv', iteration=100000, learning_rate=0.001, n_batch=1, n_classes=4, out='vrn.npz', shape=[80, 80, 80], weight_decay=0.0005)
step     0, accuracy_c1 0.20, accuracy 0.20, cost 6.93154
step  1000, accuracy_c1 0.16, accuracy 0.16, cost 6.93154
step  2000, accuracy_c1 0.52, accuracy 0.52, cost 6.93154
step  3000, accuracy_c1 0.51, accuracy 0.51, cost 6.93154

Based on this, I have three last questions:

  1. In your default setting, in_channels=2 corresponds to the concatenation of the original image and the pre-processed image. Is it possible to use in_channels=1 (no concatenation, just the original image) or in_channels=6 (as in the paper)?
  2. I am using MIPAV for histogram equalization. It supports the .nii type but does not have the CLAHE method. Which tool did you use to process .nii files with CLAHE?
  3. The cost does not change even though the step count is over 3000. Is it running as expected?

Thanks for your valuable comments.

Ryo-Ito commented 7 years ago
  1. Yes, it is possible to use in_channels=1 simply via python train.py --in_channels 1. I once tried that setting (only the normalized original image) and got about 90% accuracy on the training images.
  2. In fact, since I could not find a CLAHE implementation for 3D images, I applied Adaptive Histogram Equalization, which is implemented in a package called SimpleITK (a sketch is below this list).
  3. I've never seen the accuracy getting stuck around 50%. I always set the iteration count to 10000 and reached more than 90% accuracy after about 1000 steps. Did you normalize the original image as specified in the paper?
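
For reference, applying SimpleITK's Adaptive Histogram Equalization to a NIfTI volume could look like the sketch below; the parameter values and file names are illustrative, not the exact ones used here.

import SimpleITK as sitk

img = sitk.ReadImage("IBSR_01_ana_strip.nii.gz")
ahe = sitk.AdaptiveHistogramEqualizationImageFilter()
ahe.SetAlpha(0.3)  # 0 = classical histogram equalization, 1 = unsharp mask
ahe.SetBeta(0.3)   # 0 = unsharp mask, 1 = pass-through (when alpha = 1)
ahe.SetRadius(5)   # neighborhood radius in voxels
sitk.WriteImage(ahe.Execute(img), "IBSR_01_ana_strip_ahe.nii.gz")
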
John1231983 commented 7 years ago
  1. I chose in_channels=1 because I want to ignore the effect of pre-processing steps such as CLAHE. When I ran with --in_channels 1, I changed the "preprocessed" path in the dataset_train.csv file to the original data path, IBSR_02_ana_strip.nii (normalized to [0, 1], one channel). However, I got this error in load.py:

    scalar_patch = scalar_patch.transpose(3, 0, 1, 2)
    ValueError: axes don't match array

    (Note that scalar_patch's shape is now (80, 80, 80), since I am using the original image.) You can check one example of the processed raw file here.

  2. I tested with the normalized image over 90000 iterations but the result did not change. I think it comes from the concatenated images, so I will test with in_channels=1 to remove that dependence (after fixing the above error).
Ryo-Ito commented 7 years ago

The normalization you applied is different from mine. I meant zero mean and unit variance, whereas your normalization rescales the intensity range to [0, 1]. Also, your preprocessed image has a shape of (256, 128, 256), but my program expects the shape (256, 128, 256, 1).

John1231983 commented 7 years ago

I see. I had assumed that in_channels=1 means it loads the original image with size (256, 128, 256). Based on your comment, I added the code below after line #56 of load.py to normalize the original .nii image and expand its size to (256, 128, 256, 1) as you expect. However, the accuracy is still not the same as your result and the loss does not change. Is my implementation correct?

# Pre-processing: normalize each slice
for index_slice in range(0, 256):
    image_index = scalar_img[:, :, index_slice]
    scalar_img[:, :, index_slice] = (image_index - np.mean(image_index)) / (np.std(image_index) + 1e-10)  # avoid division by zero
# Expand shape from (256, 128, 256) to (256, 128, 256, 1)
scalar_img = [scalar_img for i in range(1)]
scalar_img = np.stack(scalar_img, axis=3)
# end pre-processing

Ryo-Ito commented 7 years ago

I normalized the intensities of the whole 3D image, not slice by slice.

# normalization over the whole volume
scalar_img = (scalar_img - np.mean(scalar_img)) / np.std(scalar_img)

# expand axis from (256, 128, 256) to (256, 128, 256, 1)
scalar_img = scalar_img[:, :, :, None]

John1231983 commented 7 years ago

Thank you. I am running your modification, but the output looks the same as above. Do you think the isotropic step is important? I did not apply it. These are the results of the first 1000 iterations:

step     0, accuracy_c1 0.06, accuracy 0.06, cost 6.93154
step   100, accuracy_c1 0.53, accuracy 0.53, cost 6.93154
step   200, accuracy_c1 0.14, accuracy 0.14, cost 6.93154
step   300, accuracy_c1 0.35, accuracy 0.35, cost 6.93154
step   400, accuracy_c1 0.24, accuracy 0.24, cost 6.93154
step   500, accuracy_c1 0.56, accuracy 0.56, cost 6.93154
step   600, accuracy_c1 0.00, accuracy 0.00, cost 6.93154
step   700, accuracy_c1 0.03, accuracy 0.03, cost 6.93154
step   800, accuracy_c1 0.24, accuracy 0.24, cost 6.93154
step   900, accuracy_c1 0.04, accuracy 0.04, cost 6.93154
Ryo-Ito commented 7 years ago

Honestly, I don't think the isotropic step is important; I believe I got decent results with anisotropic images in the past. What looks suspicious to me is that your output always has the same value for "accuracy_c1" and "accuracy"; something around there seems broken. In fact, testing the code shouldn't take long, since accuracy reaches around 80% after 20 steps. Here is my result with almost the same setting as yours; the only difference is that my images are isotropic.

$ cat test.csv
index,segTRI,mask,preprocessed
01,/path/to/dataset/IBSR_01_segTRI_iso.nii.gz,/path/to/dataset/IBSR_01_mask_iso.nii.gz,/path/to/dataset/IBSR_01_ana_normalized.nii.gz
$ python train.py -i 100 -s 1 -g 0 -f test.csv --in_channels 1
Namespace(display_step=1, gpu=0, in_channels=1, input_file='test.csv', iteration=100, learning_rate=0.001, n_batch=1, n_classes=4, out='vrn.npz', shape=[80, 80, 80], weight_decay=0.0005)
step     0, accuracy_c1 0.04, accuracy 0.04, cost 6.93154
step     1, accuracy_c1 0.06, accuracy 0.37, cost 6.9081
step     2, accuracy_c1 0.03, accuracy 0.05, cost 6.79388
step     3, accuracy_c1 0.65, accuracy 0.34, cost 6.84921
step     4, accuracy_c1 0.61, accuracy 0.56, cost 6.13297
step     5, accuracy_c1 0.73, accuracy 0.69, cost 5.27807
step     6, accuracy_c1 0.51, accuracy 0.24, cost 6.10448
step     7, accuracy_c1 0.58, accuracy 0.65, cost 4.99372
step     8, accuracy_c1 0.41, accuracy 0.42, cost 8.33803
step     9, accuracy_c1 0.43, accuracy 0.71, cost 5.35373
step    10, accuracy_c1 0.46, accuracy 0.77, cost 5.17793
step    11, accuracy_c1 0.52, accuracy 0.70, cost 5.14624
step    12, accuracy_c1 0.58, accuracy 0.73, cost 4.54902
step    13, accuracy_c1 0.45, accuracy 0.76, cost 4.35474
step    14, accuracy_c1 0.32, accuracy 0.80, cost 4.7549
step    15, accuracy_c1 0.82, accuracy 0.82, cost 5.00536
step    16, accuracy_c1 0.75, accuracy 0.79, cost 4.53228
step    17, accuracy_c1 0.60, accuracy 0.72, cost 5.83573
step    18, accuracy_c1 0.72, accuracy 0.87, cost 3.60838
step    19, accuracy_c1 0.77, accuracy 0.85, cost 4.90098
step    20, accuracy_c1 0.75, accuracy 0.86, cost 3.57642
step    21, accuracy_c1 0.64, accuracy 0.73, cost 3.85426
John1231983 commented 7 years ago

Thank you for sharing your result. I am using the same setting as you (the only difference is that I used 5 training images):

index,segTRI,mask,preprocessed
01,/home/IBSR_02/IBSR_02_segTRI_ana.nii.gz,/home/IBSR_02/IBSR_02_ana_brainmask.nii.gz,/home/IBSR_02/IBSR_02_ana_strip.nii.gz
02,/home/IBSR_03/IBSR_03_segTRI_ana.nii.gz,/home/IBSR_03/IBSR_03_ana_brainmask.nii.gz,/home/IBSR_03/IBSR_03_ana_strip.nii.gz
03,/home/IBSR_04/IBSR_04_segTRI_ana.nii.gz,/home/IBSR_04/IBSR_04_ana_brainmask.nii.gz,/home/IBSR_04/IBSR_04_ana_strip.nii.gz
04,/home/IBSR_05/IBSR_05_segTRI_ana.nii.gz,/home/IBSR_05/IBSR_05_ana_brainmask.nii.gz,/home/IBSR_05/IBSR_05_ana_strip.nii.gz
05,/home/IBSR_07/IBSR_07_segTRI_ana.nii.gz,/home/IBSR_07/IBSR_07_ana_brainmask.nii.gz,/home/IBSR_07/IBSR_07_ana_strip.nii.gz

These images are taken from the original dataset; only the raw images were processed by normalization. Could you please share your script to make isotropic images from the .nii files? I want to test it to find out what is wrong with my data.

python train.py -i 100 -s 1 -g 0 -f dataset_train.csv --in_channels 1
Namespace(display_step=1, gpu=0, in_channels=1, input_file='dataset_train.csv', iteration=100, learning_rate=0.001, n_batch=1, n_classes=4, out='vrn.npz', shape=[80, 80, 80], weight_decay=0.0005)
step     0, accuracy_c1 0.26, accuracy 0.26, cost 6.93154
step     1, accuracy_c1 0.52, accuracy 0.52, cost 6.93154
step     2, accuracy_c1 0.02, accuracy 0.02, cost 6.93154
step     3, accuracy_c1 0.01, accuracy 0.01, cost 6.93155
step     4, accuracy_c1 0.00, accuracy 0.00, cost 6.93153
step     5, accuracy_c1 0.01, accuracy 0.01, cost 6.93154
step     6, accuracy_c1 0.02, accuracy 0.02, cost 6.93154
step     7, accuracy_c1 0.55, accuracy 0.55, cost 6.93147
step     8, accuracy_c1 0.24, accuracy 0.24, cost 6.93151
step     9, accuracy_c1 0.60, accuracy 0.60, cost 6.9315
step    10, accuracy_c1 0.56, accuracy 0.56, cost 6.93148
step    11, accuracy_c1 0.31, accuracy 0.31, cost 6.93153
step    12, accuracy_c1 0.01, accuracy 0.01, cost 6.93154
step    13, accuracy_c1 0.00, accuracy 0.00, cost 6.93154
step    14, accuracy_c1 0.19, accuracy 0.19, cost 6.93152
step    15, accuracy_c1 0.21, accuracy 0.21, cost 6.93154
step    16, accuracy_c1 0.19, accuracy 0.19, cost 6.93153
step    17, accuracy_c1 0.31, accuracy 0.31, cost 6.93152
step    18, accuracy_c1 0.37, accuracy 0.37, cost 6.93154
step    19, accuracy_c1 0.01, accuracy 0.01, cost 6.93154
step    20, accuracy_c1 0.01, accuracy 0.01, cost 6.93154
Ryo-Ito commented 7 years ago

Here is what my reslicing code looks like. You have to install ANTs (http://stnava.github.io/ANTs/) to run it.

import os
import numpy as np
import nibabel as nib

scalar_file = "/path/to/dataset/IBSR_02_ana_strip.nii.gz"
i = 2

# build a 256x256x256 reference image with an identity direction matrix,
# i.e. isotropic 1mm voxel spacing
scalar_img = nib.load(scalar_file)
empty_data = np.empty((256,) * 3, dtype=np.float32)
affine = np.copy(scalar_img.affine)
affine[0:3, 0:3] = np.eye(3)
empty_img = nib.Nifti1Image(empty_data, affine)
nib.save(empty_img, "ref.nii.gz")

# resample the input onto the reference grid with ANTs
cmd = "antsApplyTransforms -d 3 -i {0} -r ref.nii.gz -o /path/to/dataset/IBSR_{1:02d}_iso.nii.gz".format(scalar_file, i)
os.system(cmd)

# cast the resliced image to float32 and overwrite it
img = nib.load("/path/to/dataset/IBSR_{0:02d}_iso.nii.gz".format(i))
data = img.get_data().astype(np.float32)
nib.save(nib.Nifti1Image(data, img.affine), "/path/to/dataset/IBSR_{0:02d}_iso.nii.gz".format(i))

But I tried with the raw images, which are anisotropic, and still got a decent result. I think the problem lies somewhere else.

$ cat test.csv
index,segTRI,mask,preprocessed
01,/IBSR/IBSR_01/IBSR_01_segTRI_ana.nii.gz,/IBSR/IBSR_01/IBSR_01_ana_brainmask.nii.gz,/IBSR/IBSR_01/IBSR_01_ana_strip.nii.gz
$ python train.py -i 100 -s 1 -g 0 -f test.csv --in_channels 1
Namespace(display_step=1, gpu=0, in_channels=1, input_file='test.csv', iteration=100, learning_rate=0.001, n_batch=1, n_classes=4, out='vrn.npz', shape=[80, 80, 80], weight_decay=0.0005)
step     0, accuracy_c1 0.32, accuracy 0.31, cost 6.93154
step     1, accuracy_c1 0.75, accuracy 0.48, cost 6.89648
step     2, accuracy_c1 0.83, accuracy 0.61, cost 5.99953
step     3, accuracy_c1 0.86, accuracy 0.69, cost 4.49443
step     4, accuracy_c1 0.74, accuracy 0.47, cost 4.83167
step     5, accuracy_c1 0.85, accuracy 0.84, cost 3.16566
step     6, accuracy_c1 0.89, accuracy 0.88, cost 2.76989
step     7, accuracy_c1 0.58, accuracy 0.55, cost 6.40945
step     8, accuracy_c1 0.72, accuracy 0.71, cost 4.14536
step     9, accuracy_c1 0.74, accuracy 0.76, cost 3.10135
step    10, accuracy_c1 0.58, accuracy 0.60, cost 4.79431
step    11, accuracy_c1 0.48, accuracy 0.59, cost 9.03028
step    12, accuracy_c1 0.73, accuracy 0.76, cost 2.95961
step    13, accuracy_c1 0.54, accuracy 0.59, cost 5.28422
step    14, accuracy_c1 0.32, accuracy 0.69, cost 4.5657
step    15, accuracy_c1 0.14, accuracy 0.34, cost 5.98941
step    16, accuracy_c1 0.64, accuracy 0.84, cost 3.81229
step    17, accuracy_c1 0.61, accuracy 0.71, cost 4.24793
step    18, accuracy_c1 0.75, accuracy 0.75, cost 2.95071
step    19, accuracy_c1 0.62, accuracy 0.79, cost 3.46504
step    20, accuracy_c1 0.74, accuracy 0.72, cost 3.47858
step    21, accuracy_c1 0.71, accuracy 0.71, cost 3.30771
John1231983 commented 7 years ago

This is good news: I found the problem, and it comes from Chainer. I was using CUDA 8.0 with cuDNN 4.0, and that combination produced the wrong results above. To solve it, I installed cuDNN 5.0, which is the version matching CUDA 8.0 (as suggested by Caffe), even though cuDNN 4.0 did not show any error or warning. After installing cuDNN 5.0 and reinstalling Chainer, I got the expected result (a quick sanity check for the setup is sketched after the log below). I will run the testing phase after training finishes. Currently, the segmentation result in the testing phase after 100 iterations is not good, so I will train for more iterations, such as 10000. I think this pitfall could be noted in your README, as it may be useful to other researchers. Thank you again for your support.

step     0, accuracy_c1 0.32, accuracy 0.23, cost 6.93154
step    10, accuracy_c1 0.59, accuracy 0.73, cost 4.33033
step    20, accuracy_c1 0.70, accuracy 0.78, cost 2.99197
step    30, accuracy_c1 0.58, accuracy 0.76, cost 3.59981
step    40, accuracy_c1 0.73, accuracy 0.83, cost 2.78352
step    50, accuracy_c1 0.63, accuracy 0.83, cost 2.63419
step    60, accuracy_c1 0.53, accuracy 0.75, cost 3.87954
step    70, accuracy_c1 0.87, accuracy 0.92, cost 1.54215
step    80, accuracy_c1 0.77, accuracy 0.86, cost 2.49131
step    90, accuracy_c1 0.69, accuracy 0.86, cost 2.38393
step   100, accuracy_c1 0.79, accuracy 0.91, cost 1.56825
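
A quick way to confirm that Chainer actually sees a working CUDA/cuDNN setup (assuming Chainer 1.x, where these flags live in chainer.cuda):

from chainer import cuda

print(cuda.available)      # True if CUDA (CuPy) is usable
print(cuda.cudnn_enabled)  # True if cuDNN is usable
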
Ryo-Ito commented 7 years ago

I'm glad to hear your problem is solved. I will include a note about it in my README soon.