**Open** · John1231983 opened this issue 7 years ago
Hi @John1231983, this repository is mainly for converting a format called RTDicom to regular NIfTI files, because I couldn't find a function for that in the main libraries like ITK and so on.
However, I also extract patches from the NIfTI files and save them into an HDF5 file. I have checked your code, and the notation is a bit unclear to me (I am not very familiar with the Slice option from Python, so it is probably my fault). I usually use slicing based on indexing, e.g. `X[start_slice:end_slice, start_row:end_row, start_col:end_col]` (note the colon, not a comma, between `start_col` and `end_col`).
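For reference, a minimal numpy sketch of that index-based slicing (the array and the bounds are made up for illustration):

```python
import numpy as np

# Hypothetical volume with shape (slices, rows, cols); sizes are illustrative
X = np.arange(256 * 128 * 256, dtype=np.float32).reshape(256, 128, 256)

# Index-based slicing with start:end on every axis extracts a sub-volume
patch = X[10:90, 20:100, 30:110]
print(patch.shape)  # (80, 80, 80)
```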
I also generally use compression to save space in the HDF5 files; something like this:
```python
import h5py

comp_kwargs = {'compression': 'gzip', 'compression_opts': 1}
with h5py.File(train_filename, 'w') as f:
    f.create_dataset('data', data=X, **comp_kwargs)
    f.create_dataset('label', data=y, **comp_kwargs)
```
About the order, it seems to me that you want to use Caffe, because the order is [Num_samples, Channel, Slices, Rows, Cols]. That seems fine; however, for the labels there should not be a channel dimension (even if it is 1). This is assuming you want to do segmentation and would probably use the softmax_with_loss function in Caffe.
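As a shape sanity check, a minimal sketch of the two layouts (names and sizes are illustrative):

```python
import numpy as np

n_samples = 10  # illustrative
# Data blob: [Num_samples, Channel, Slices, Rows, Cols]
X = np.zeros((n_samples, 1, 80, 80, 80), dtype=np.float32)
# Label blob for softmax_with_loss: same spatial shape, but no channel axis
y = np.zeros((n_samples, 80, 80, 80), dtype=np.uint8)

print(X.ndim, y.ndim)  # 5 4
```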
Please let me know if I can help you somehow.
Thank you for your reply.
You are right. I am performing a segmentation task, specifically on medical images with volumetric data (*.nii). Due to limited memory, I cannot load a whole .nii file into the network (defined by the prototxt file). Hence, I need to randomly extract n_sample patches for each patient and store them in an HDF5 file (one HDF5 file corresponds to one patient). Based on your comment, I revised the label data and raw data as follows:
|       | Original data | Extracted data                          |
|-------|---------------|-----------------------------------------|
| Raw   | 256x128x256   | n_sample x 1 x 80x80x80                 |
| Label | 256x128x256   | n_sample x 80x80x80 (the 1 is deleted)  |
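The random per-patient patch extraction described above might be sketched like this (a minimal sketch only; the function and variable names are illustrative, and the bounds handling is simplified):

```python
import numpy as np

def extract_patches(raw, label, n_sample, size=80, seed=0):
    """Randomly crop matching raw/label patches from one patient volume."""
    rng = np.random.default_rng(seed)
    data = np.empty((n_sample, 1, size, size, size), dtype=np.float32)
    labels = np.empty((n_sample, size, size, size), dtype=np.uint8)
    for i in range(n_sample):
        # Random corner chosen so the patch fits inside the volume
        z, r, c = (rng.integers(0, d - size + 1) for d in raw.shape)
        data[i, 0] = raw[z:z+size, r:r+size, c:c+size]
        labels[i] = label[z:z+size, r:r+size, c:c+size]
    return data, labels

raw = np.zeros((256, 128, 256), dtype=np.float32)
seg = np.zeros((256, 128, 256), dtype=np.uint8)
X, y = extract_patches(raw, seg, n_sample=5)
print(X.shape, y.shape)  # (5, 1, 80, 80, 80) (5, 80, 80, 80)
```

Each per-patient result would then be written to its own HDF5 file with the `data`/`label` datasets as in the compression snippet above.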
Is it correct? I am using a modified version of Caffe (the U-Net version) to train FCN-32s; however, the loss is increasing. I think the reason comes from the data preparation, so I think your repository is the right place to ask. Thanks.
This is my output for FCN-32s, and the full prototxt is at https://gist.github.com/John1231983/5186cb94cc1d7d9b19b66e91541a3312
```
I0218 23:07:24.980823 32302 solver.cpp:228] Iteration 0, loss = 709790
I0218 23:07:24.980847 32302 solver.cpp:244] Train net output #0: loss = 709790 (* 1 = 709790 loss)
I0218 23:07:24.980852 32302 sgd_solver.cpp:106] Iteration 0, lr = 0.001
I0218 23:07:37.629266 32302 solver.cpp:228] Iteration 20, loss = 4.27115e+07
I0218 23:07:37.629292 32302 solver.cpp:244] Train net output #0: loss = 4.47163e+07 (* 1 = 4.47163e+07 loss)
I0218 23:07:37.629297 32302 sgd_solver.cpp:106] Iteration 20, lr = 0.001
I0218 23:07:50.413154 32302 solver.cpp:228] Iteration 40, loss = 4.47163e+07
I0218 23:07:50.413211 32302 solver.cpp:244] Train net output #0: loss = 4.47163e+07 (* 1 = 4.47163e+07 loss)
```
It seems that the network has overflowed; you can see the loss is stuck at the same large value at iterations 20 and 40. There are some things to make sure of. First, is the number of classes 4? Second, the learning rate is clearly too big, especially because the loss is not normalized.
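As a side note on the magnitude: assuming 4 classes, batch size 1, and an unnormalized loss summed over one 80x80x80 patch, the expected iteration-0 loss at random initialization is about 80^3 * ln(4), which is very close to the 709790 logged above. This is only a plausibility check, not taken from the thread:

```python
import math

num_classes = 4
voxels = 80 ** 3  # voxels in one patch
# Expected per-voxel cross-entropy at random init is ln(num_classes);
# an unnormalized SoftmaxWithLoss sums this over all voxels
expected = voxels * math.log(num_classes)
print(round(expected))  # 709783, close to the logged 709790
```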
Thanks, I checked it and changed the lr to 1e-10. I also revised the prototxt by removing the dropout layer and the crop layer, and by setting ignore_label to 0 instead of 255. However, the loss still does not behave as expected.
I also changed the Python code to delete the channel dimension in the labels, using n_sample x 80x80x80 instead of n_sample x 1 x 80x80x80, as you mentioned. Could you verify it and help me at https://gist.github.com/John1231983/63c35d1cb4e770350fc76c5d38722e9c? Why do we need to ignore the channel dimension in the label? I am using the SoftmaxWithLoss layer.
```
I0219 01:03:11.041939 2107 solver.cpp:228] Iteration 0, loss = 287812
I0219 01:03:11.041978 2107 solver.cpp:244] Train net output #0: loss = 287812 (* 1 = 287812 loss)
I0219 01:03:11.041985 2107 sgd_solver.cpp:106] Iteration 0, lr = 1e-10
I0219 01:03:30.369364 2107 solver.cpp:228] Iteration 20, loss = 419216
I0219 01:03:30.369395 2107 solver.cpp:244] Train net output #0: loss = 548716 (* 1 = 548716 loss)
I0219 01:03:30.369401 2107 sgd_solver.cpp:106] Iteration 20, lr = 1e-10
I0219 01:03:49.889261 2107 solver.cpp:228] Iteration 40, loss = 415737
I0219 01:03:49.889420 2107 solver.cpp:244] Train net output #0: loss = 610386 (* 1 = 610386 loss)
I0219 01:03:49.889427 2107 sgd_solver.cpp:106] Iteration 40, lr = 1e-10
I0219 01:04:08.893386 2107 solver.cpp:228] Iteration 60, loss = 362652
I0219 01:04:08.893414 2107 solver.cpp:244] Train net output #0: loss = 321173 (* 1 = 321173 loss)
I0219 01:04:08.893419 2107 sgd_solver.cpp:106] Iteration 60, lr = 1e-10
I0219 01:04:28.065129 2107 solver.cpp:228] Iteration 80, loss = 383949
I0219 01:04:28.065279 2107 solver.cpp:244] Train net output #0: loss = 387736 (* 1 = 387736 loss)
I0219 01:04:28.065286 2107 sgd_solver.cpp:106] Iteration 80, lr = 1e-10
....
I0219 01:07:45.598567 2107 solver.cpp:228] Iteration 280, loss = 424385
I0219 01:07:45.598781 2107 solver.cpp:244] Train net output #0: loss = 331888 (* 1 = 331888 loss)
I0219 01:07:45.598801 2107 sgd_solver.cpp:106] Iteration 280, lr = 1e-10
I0219 01:08:07.838287 2107 solver.cpp:228] Iteration 300, loss = 470599
I0219 01:08:07.838320 2107 solver.cpp:244] Train net output #0: loss = 477719 (* 1 = 477719 loss)
I0219 01:08:07.838326 2107 sgd_solver.cpp:106] Iteration 300, lr = 1e-10
```
In addition, I guess the error may come from the parameter settings in the Conv and Pooling layers. I am using U-Net's Caffe version; hence, I think the convolution layers must have `engine: CAFFE` and the pooling layers must have `engine: CUDNN` to support 3D operations. Do you think so? I cannot check because of a GPU resource problem when I used that setting. I am working with a Titan X Pascal.
About the channel, I don't remember exactly if it will work with the channel dimension, but anyway it is still the same data. For the 3D operations, I used this PR https://github.com/BVLC/caffe/pull/3983 in order to get 3D pooling. I remember you have to change a line of code, but it is explained well in the link; after that you want to set engine CUDNN for the pooling operations. About the loss, you still need to wait for more iterations (say a thousand) to see if it's actually doing something. Maybe change the lr to 1e-7; I think 1e-10 is too small.
Thank you for your guidance. I tried to use the PR BVLC/caffe#3983 as you mentioned. Have you trained FCN-32s successfully on volumetric data with that PR? I spent two days trying to implement it, but it failed. This is my Python script to make the net. In short, the convolution, pooling, and ReLU layers look as follows:
```
...
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 32
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    engine: CAFFE
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: AVE
    kernel_size: 2
    stride: 2
    engine: CUDNN
  }
}
```
Do you think my settings for 3D convolution, 3D pooling, and 3D ReLU are correct? I think something is wrong with the 3D operations. I am using the original PR #3983. Thanks.
Hello, thank you for sharing your work. Currently, I am building on some existing work and this repository to extract [80x80x80] patches from a .nii file and save them to HDF5. It is closely related to your work, but it is only 69 lines of code. However, I don't know whether my implementation is correct. Could you look at it and give me some comments? I am wondering about the matrix order (`raw_patches` and `label_patches`). Thanks. My code is located at https://gist.github.com/John1231983/63c35d1cb4e770350fc76c5d38722e9c
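One way to sanity-check the matrix order is to read the HDF5 file back and verify the dataset shapes. A minimal sketch, assuming the dataset names `data`/`label` used earlier in this thread (the filename and sizes here are made up, and a dummy file is written first so the check is self-contained):

```python
import numpy as np
import h5py

# Write a tiny dummy file standing in for one patient's HDF5 output
with h5py.File('check.h5', 'w') as f:
    f.create_dataset('data', data=np.zeros((5, 1, 80, 80, 80), np.float32))
    f.create_dataset('label', data=np.zeros((5, 80, 80, 80), np.uint8))

# Read it back and inspect the shapes
with h5py.File('check.h5', 'r') as f:
    data_shape = f['data'].shape
    label_shape = f['label'].shape

print(data_shape, label_shape)  # (5, 1, 80, 80, 80) (5, 80, 80, 80)
```

If `data` is not 5-dimensional or `label` still has a channel axis, the patch-ordering code in the gist is the first place to look.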