Examples for input formats

fabsta commented 5 years ago

Hi Renato,

thanks for the great repo. I would like to use it as well!

Can you give a bit more information on the input formats for volumes and labels? For example, I have an ndarray for each of my volumes and a single label for each (I am working on a classification rather than a segmentation problem). Thx!

renato145 commented 5 years ago

Hi, I will add more examples for different tasks as soon as I have some time off. For classification you can do it like this:

# Let's assume:
# 1) you have your bcolz array at 'bcolz_array_path'
# 2) a CSV file that contains your classification labels at 'csv_path'
# 3) the label column is named 'lbl_col'

# Imports assumed for fastai v1; the original snippet omitted them
import fastai_scans
from fastai.basic_train import Learner
from fastai.metrics import accuracy

vol_size = (64,64,64)
bs = 32
data = (fastai_scans.VolumeItemList.from_paths(bcolz_array_path, csv_path)
                                   .random_split_by_pct(0.2, seed=7)
                                   .label_from_metadata(lbl_col)
                                   .transform(fastai_scans.get_transforms())
                                   .databunch(bs=bs)
                                   .normalize())
m = fastai_scans.models.Simple3d(vol_size, num_layers=6, nf=32, hidden=200, drop_out=0.5)
learn = Learner(data, m, metrics=accuracy)

# If you are doing binary classification and want AUC scores, you can use it like this:
learn = Learner(data, m, metrics=accuracy, callback_fns=[fastai_scans.AucLogger])
# Then, after training, you can call:
learn.auc_logger.print_report()

fabsta commented 5 years ago

Thanks for the reply! Yes, some easy (even made-up) examples would be great. Here is where I am so far; maybe that helps you with your answer.

bcolz_array_path: I assume the bcolz array holds all your 3D images, right? I have never used bcolz, so I assume bcolz_array_path is the path to a previously saved NumPy-like data structure. For 64x64x64 images, does it hold a single image of shape [64, 64, 64] (dtype=int16), or multiple images with shape [no_images, 64, 64, 64]?

csv_path: I understand that the CSV file looks something like this:

label
3
3

But how is the mapping to the 3D images done? Thanks!

at110 commented 5 years ago

Hi Fabian,

I also ran into the same issue. I am not very familiar with bcolz. With Renato's help I was able to convert my data from NIfTI to bcolz. My volumes have the shape (x=256, y=256, z=128).

My NIfTI files are stored at /data/train_nifti/ and the masks are stored at /data/train_mask_nifti/. Here is the code:

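from pathlib import Path

import bcolz
import nibabel as nib
import numpy as np

PATH_train_nii = Path('/data/train_nifti/')
PATH_train_mask = Path('/data/train_mask_nifti/')

# PATH must be a pathlib.Path for your working directory; it is not defined in this snippet
(PATH/'train_data_preprossed').mkdir(exist_ok=True)
train_data_bolz = (PATH/'train_data_preprossed')
(PATH/'train_labels_preprossed').mkdir(exist_ok=True)
train_labels_bolz = (PATH/'train_labels_preprossed')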

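# Sort both lists so each volume lines up with its mask
# (this assumes the volume and mask file names sort in the same order)
nifti_files = sorted(PATH_train_nii.iterdir())
mask_files = sorted(PATH_train_mask.iterdir())

# Empty on-disk arrays; each appended volume has shape (128, 256, 256), i.e. (z, x, y)
data_bcolz = bcolz.carray(np.zeros([0, 128, 256, 256], dtype=np.float32), chunklen=1, mode='w', rootdir=train_data_bolz)
mask_bcolz = bcolz.carray(np.zeros([0, 128, 256, 256], dtype=np.float32), chunklen=1, mode='w', rootdir=train_labels_bolz)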

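for each_file, each_mask in zip(nifti_files, mask_files):
    # get_fdata() replaces get_data(), which newer nibabel releases no longer provide
    nifti_data = nib.load(str(each_file)).get_fdata()
    mask_data = nib.load(str(each_mask)).get_fdata()
    # Reorder axes from (x, y, z) to (z, x, y) to match the carray shape
    data_bcolz.append(nifti_data.transpose(2, 0, 1))
    mask_bcolz.append(mask_data.transpose(2, 0, 1))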

data_bcolz.flush()
mask_bcolz.flush()

Then you can use:

data = (fastai_scans.SegmentationItemList.from_paths(train_data_bolz, train_labels_bolz)
                                         .random_split_by_pct(0.2, seed=7)
                                         .label_from_bcolz()
                                         .transform(fastai_scans.get_transforms(), tfm_y=True)
                                         .databunch(bs=bs)
                                         .normalize())

You can verify that your data is in good shape with either data.show_batch(2) or:

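import matplotlib.pyplot as plt

# Take the first training mask and move it to a NumPy array
img_t = data.train_ds.y[0].data
img = img_t.cpu().permute(2, 1, 0).numpy()
img = img.squeeze()
print(img.shape)
# Show one slice (index 21 along the last axis)
fig, ax = plt.subplots(figsize=[3, 3])
ax.imshow(img[:, :, 21], alpha=0.5)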

I hope this will help.

at110 commented 5 years ago

I just realized that you were asking about a classification problem and I gave information about segmentation, but the conversion of 3D/4D data to bcolz should be the same.
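For readers who want to check the axis reordering without bcolz or NIfTI files, here is a minimal NumPy-only sketch of that conversion step. The helper name `to_depth_first` and the small random volumes are made up for illustration; only the `transpose(2, 0, 1)` step comes from the code above.

```python
import numpy as np


def to_depth_first(volume):
    """Reorder a volume from (x, y, z) to (z, x, y)."""
    return np.transpose(volume, (2, 0, 1))


# Hypothetical small volumes standing in for loaded NIfTI data, shape (x, y, z)
raw_volumes = [np.random.rand(6, 6, 3).astype(np.float32) for _ in range(4)]

# Stack the reordered volumes along a new first axis: (no_images, z, x, y)
stacked = np.stack([to_depth_first(v) for v in raw_volumes])

print(stacked.shape)  # (4, 3, 6, 6)

# Slice k of a reordered volume is the original volume at z == k
assert np.array_equal(stacked[1, 2], raw_volumes[1][:, :, 2])
```

The same stacked layout applies whether the targets are masks (segmentation) or one label per volume (classification); only the label side differs.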

renato145 commented 5 years ago

"But how is the mapping to the 3D images done?"

If the bcolz array has the shape [no_images,64,64,64], then the CSV file should have 'no_images' rows. The first row in the CSV should hold the label for the first element in the array, and so on.

In the classification data structure fastai_scans.VolumeItemList, the method .label_from_metadata(lbl_col) links the elements in the array with the labels in the CSV file.
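The positional mapping described here can be sketched with plain NumPy and the standard csv module. This is only an illustration of the row-order convention, not how fastai_scans implements it; the array contents, the column name 'label', and the label values are made up.

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for the stacked volumes: 3 volumes of 4x4x4 voxels
volumes = np.zeros((3, 4, 4, 4), dtype=np.float32)

# Hypothetical CSV contents; 'label' plays the role of lbl_col
csv_text = "label\n3\n3\n1\n"
labels = [int(row["label"]) for row in csv.DictReader(io.StringIO(csv_text))]

# One label per volume, otherwise the positional mapping is ambiguous
assert len(labels) == volumes.shape[0]

# Row i of the CSV describes volumes[i]
pairs = [(i, volumes[i].shape, labels[i]) for i in range(len(labels))]
for index, shape, label in pairs:
    print(index, shape, label)
```

Because the link is purely by position, the CSV rows must be written in the same order in which the volumes were appended to the array.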

xuzhang5788 commented 5 years ago

Please correct me if I am wrong. I think your dataset is 4D, not the 5D needed for real 3D volumes: the shape of trainX should be (nb_samples, nb_channels, x, y, z). It looks like you used each slice as an image and did normal image-related tasks. I am interested in a real 3D convolutional neural network using the fastai library, like convolution3D() in Keras. Thanks for the great repo.

renato145 commented 5 years ago

It's for 5D data with 3D convolutions. The example I showed above was for the case where you only have one channel, in which case it will preprocess the array to have the shape [no_images,1,64,64,64].
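The shape change for the single-channel case can be illustrated with NumPy. This sketch only shows the resulting layout; it is not taken from fastai_scans, and the batch size and volume size are arbitrary.

```python
import numpy as np

# Hypothetical single-channel batch: 2 volumes of 8x8x8 voxels
batch_4d = np.zeros((2, 8, 8, 8), dtype=np.float32)

# Insert a channel axis at position 1: (N, D, H, W) -> (N, 1, D, H, W)
batch_5d = batch_4d[:, np.newaxis]

print(batch_4d.shape)  # (2, 8, 8, 8)
print(batch_5d.shape)  # (2, 1, 8, 8, 8)
```

A 3D convolution layer expects this 5D layout, with the channel axis right after the sample axis.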

renato145 / fastai_scans