rogertrullo / generate_3d_data

scripts for extracting 3d patches from nii files and saving them into hdf5 files

Python script to store patches into hdf5 [shorter] #1

Open John1231983 opened 7 years ago

John1231983 commented 7 years ago

Hello, thank you for sharing your work. I am currently building on some existing works, and on this one, to extract 80x80x80 patches from nii files and save them to hdf5. It is closely related to your work, and my implementation is only 69 lines of code. However, I don't know whether it is correct. Could you look at it and give me some comments? In particular, I am wondering about the matrix order (raw_patches and label_patches). Thanks

My code is located at https://gist.github.com/John1231983/63c35d1cb4e770350fc76c5d38722e9c

rogertrullo commented 7 years ago

Hi @John1231983 , this repository is mainly for converting a format called RTDicom to regular NIfTI files, because I couldn't find a function for that in the main libraries like ITK and so on. However, I also extract patches from the NIfTI files and save them into an hdf5 file. I have checked your code, and the notation is a bit unclear to me (I am not very familiar with the Slice option from python, so it is probably my fault..). I usually use slicing based on indexing, e.g. X[start_slice:end_slice, start_row:end_row, start_col:end_col]. I also generally use compression to save space in the hdf5 files; something like this:

import h5py

# gzip at level 1 is a good trade-off between speed and file size
comp_kwargs = {'compression': 'gzip', 'compression_opts': 1}
with h5py.File(train_filename, 'w') as f:
    f.create_dataset('data', data=X, **comp_kwargs)
    f.create_dataset('label', data=y, **comp_kwargs)
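
and later you can read them back the same way:

with h5py.File(train_filename, 'r') as f:
    X = f['data'][...]
    y = f['label'][...]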

About the order, it seems to me that you want to use Caffe, because the order is [Num_samples, Channel, Slices, Rows, Cols]. It seems to be fine; however, for the labels there should not be a channel dimension (even if it is 1). This is assuming you want to do segmentation, and would probably use the softmax_with_loss function in Caffe. Please let me know if I can help you somehow.
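
As a concrete example with numpy (the array names here are just illustrative):

import numpy as np

n_samples = 100
# data blob for Caffe: [Num_samples, Channel, Slices, Rows, Cols]
X = np.zeros((n_samples, 1, 80, 80, 80), dtype=np.float32)
# label blob: same spatial shape, but no channel dimension
y = np.zeros((n_samples, 80, 80, 80), dtype=np.uint8)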

John1231983 commented 7 years ago

Thank you for your reply. You are right; I am performing a segmentation task, specifically on medical images with volumetric *.nii data. Due to limited memory, I cannot load a whole .nii file into the network (defined by the prototxt file). Hence, I need to randomly extract n_sample patches for each patient and store them in an hdf5 file (one hdf5 file corresponds to one patient). Based on your comment, I revised the label data and raw data as follows:

           Original data        Extracted data
Raw:       256x128x256          n_sample x 1 x 80x80x80
Label:     256x128x256          n_sample x 80x80x80 (the 1 is removed)

Is it correct? I am using a modified version of Caffe (the U-Net version) to train fcn-32s; however, the loss is increasing. I think the reason is in the data preparation, so your repository seemed like the right place to ask. A sketch of my extraction step is below. Thanks
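
The extraction itself is roughly like this (a sketch with illustrative names; the full code is in the gist):

import numpy as np

def extract_random_patches(raw, label, n_sample, size=80):
    # raw and label are 256x128x256 volumes from the same patient
    raw_patches = np.zeros((n_sample, 1, size, size, size), dtype=np.float32)
    label_patches = np.zeros((n_sample, size, size, size), dtype=np.uint8)
    for i in range(n_sample):
        # random corner such that the patch fits inside the volume
        s = np.random.randint(0, raw.shape[0] - size + 1)
        r = np.random.randint(0, raw.shape[1] - size + 1)
        c = np.random.randint(0, raw.shape[2] - size + 1)
        raw_patches[i, 0] = raw[s:s+size, r:r+size, c:c+size]
        label_patches[i] = label[s:s+size, r:r+size, c:c+size]
    return raw_patches, label_patches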

This is my output of fcn-32s; the full prototxt is at https://gist.github.com/John1231983/5186cb94cc1d7d9b19b66e91541a3312

I0218 23:07:24.980823 32302 solver.cpp:228] Iteration 0, loss = 709790
I0218 23:07:24.980847 32302 solver.cpp:244]     Train net output #0: loss = 709790 (* 1 = 709790 loss)
I0218 23:07:24.980852 32302 sgd_solver.cpp:106] Iteration 0, lr = 0.001
I0218 23:07:37.629266 32302 solver.cpp:228] Iteration 20, loss = 4.27115e+07
I0218 23:07:37.629292 32302 solver.cpp:244]     Train net output #0: loss = 4.47163e+07 (* 1 = 4.47163e+07 loss)
I0218 23:07:37.629297 32302 sgd_solver.cpp:106] Iteration 20, lr = 0.001
I0218 23:07:50.413154 32302 solver.cpp:228] Iteration 40, loss = 4.47163e+07
I0218 23:07:50.413211 32302 solver.cpp:244]     Train net output #0: loss = 4.47163e+07 (* 1 = 4.47163e+07 loss)
rogertrullo commented 7 years ago

It seems that the network has overflowed; you can see the loss is stuck at the same large value at iterations 20 and 40. There are some things to make sure of. First, is the number of classes 4? Second, the learning rate is clearly too big, especially because the loss is not normalized.
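
In fact, your very first loss value is consistent with 4 classes and an unnormalized sum over all voxels of the 80x80x80 patch; at random initialization the loss is about -log(1/4) per voxel:

import math

# expected unnormalized softmax loss at iteration 0 for 4 classes
print(80**3 * math.log(4))  # ~709783, close to the reported 709790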

John1231983 commented 7 years ago

Thanks, I checked it and changed the lr to 1e-10. I also revised the prototxt by removing the dropout and crop layers, and by ignoring label=0 instead of 255. However, the loss still does not behave as expected.

I also changed the python code to delete the channel dimension in the label, using n_sample x 80x80x80 instead of n_sample x 1 x 80x80x80, as you mentioned. Could you verify it for me at https://gist.github.com/John1231983/63c35d1cb4e770350fc76c5d38722e9c? Why do we need to remove the channel dimension in the label? I am using the SoftmaxWithLoss layer.

I0219 01:03:11.041939  2107 solver.cpp:228] Iteration 0, loss = 287812
I0219 01:03:11.041978  2107 solver.cpp:244]     Train net output #0: loss = 287812 (* 1 = 287812 loss)
I0219 01:03:11.041985  2107 sgd_solver.cpp:106] Iteration 0, lr = 1e-10
I0219 01:03:30.369364  2107 solver.cpp:228] Iteration 20, loss = 419216
I0219 01:03:30.369395  2107 solver.cpp:244]     Train net output #0: loss = 548716 (* 1 = 548716 loss)
I0219 01:03:30.369401  2107 sgd_solver.cpp:106] Iteration 20, lr = 1e-10
I0219 01:03:49.889261  2107 solver.cpp:228] Iteration 40, loss = 415737
I0219 01:03:49.889420  2107 solver.cpp:244]     Train net output #0: loss = 610386 (* 1 = 610386 loss)
I0219 01:03:49.889427  2107 sgd_solver.cpp:106] Iteration 40, lr = 1e-10
I0219 01:04:08.893386  2107 solver.cpp:228] Iteration 60, loss = 362652
I0219 01:04:08.893414  2107 solver.cpp:244]     Train net output #0: loss = 321173 (* 1 = 321173 loss)
I0219 01:04:08.893419  2107 sgd_solver.cpp:106] Iteration 60, lr = 1e-10
I0219 01:04:28.065129  2107 solver.cpp:228] Iteration 80, loss = 383949
I0219 01:04:28.065279  2107 solver.cpp:244]     Train net output #0: loss = 387736 (* 1 = 387736 loss)
I0219 01:04:28.065286  2107 sgd_solver.cpp:106] Iteration 80, lr = 1e-10
....
I0219 01:07:45.598567  2107 solver.cpp:228] Iteration 280, loss = 424385
I0219 01:07:45.598781  2107 solver.cpp:244]     Train net output #0: loss = 331888 (* 1 = 331888 loss)
I0219 01:07:45.598801  2107 sgd_solver.cpp:106] Iteration 280, lr = 1e-10
I0219 01:08:07.838287  2107 solver.cpp:228] Iteration 300, loss = 470599
I0219 01:08:07.838320  2107 solver.cpp:244]     Train net output #0: loss = 477719 (* 1 = 477719 loss)
I0219 01:08:07.838326  2107 sgd_solver.cpp:106] Iteration 300, lr = 1e-10
John1231983 commented 7 years ago

In addition, I guess the error may come from the parameter settings in the Conv and Pooling layers. I am using U-Net's Caffe version; hence, I think the convolution parameter must have engine: CAFFE and the Pooling layer must have engine: CUDNN to support 3D operations. Do you think so? I cannot check, because of a GPU resource problem when I used that setting. I am working with a Titan X Pascal.

rogertrullo commented 7 years ago

About the channel, I don't remember exactly if it will work with the channel dimension, but anyway it is still the same data... For the 3D operations, I used this PR: https://github.com/BVLC/caffe/pull/3983 in order to use the 3D pooling operations. I remember you have to change a line of code, but it is explained well in the link; after that, you would want to set engine: CUDNN for the pooling operations. About the loss, you still need to wait for more iterations (say a thousand) to see if it's actually doing something. Maybe change the lr to 1e-7; I think 1e-10 is too small.

John1231983 commented 7 years ago

Thank you for your guidance. I tried to use the PR BVLC/caffe#3983 as you mentioned. Have you been successful in training fcn-32s on volumetric data with that PR? I spent two days implementing it, but it failed. This is my python script to make the net; in short, the convolution, pooling, and ReLU layers look as follows:

...
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 32
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    engine: CAFFE
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: AVE
    kernel_size: 2
    stride: 2
    engine: CUDNN
  }
}

Do you think my settings for 3D convolution, 3D pooling, and 3D ReLU are correct? I think there is something wrong with the 3D operations. I am using the original PR 3983. Thanks