NifTK / NiftyNet

[unmaintained] An open-source convolutional neural networks platform for research in medical image analysis and image-guided therapy
http://niftynet.io
Apache License 2.0

FEATURE: different padding modes for different inputs #448

Open brynolff opened 4 years ago

brynolff commented 4 years ago

When reading images into NiftyNet they are loaded as

    self.readers[0].add_preprocessing_layers(
        volume_padding_layer + normalisation_layers + augmentation_layers)

i.e., first padding, then normalisation, and finally augmentation. This pipeline is applied in the same way regardless of whether the input is an image, label, weight map, or sampler.

My main issue here is with the sampler and weight inputs. These are first padded (with the volume minimum, np.min, by default) and then augmented, e.g. with rotation or elastic deformation, which means that padded sections may be rotated or deformed into the field of view and included in the cost function. In the regression case, the network is then forced to learn to map min-padded sections in the input to min-padded sections in the output.
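A toy calculation (illustration only, not NiftyNet internals) shows why the default matters for a weight map: if the map's minimum is non-zero, every padded voxel inherits that weight and contributes to the loss, whereas zero padding masks the padded sections out entirely:

```python
import numpy as np

# Toy weight map whose smallest value is non-zero.
weight = np.full((8, 8), 0.5)
pad = 4

# Current behaviour: pad with the volume minimum -> padded voxels
# receive weight 0.5 and therefore contribute to the loss.
min_padded = np.pad(weight, pad, mode='minimum')

# Proposed behaviour for weight/sampler inputs: pad with zeros ->
# padded voxels are excluded from the loss, no matter how a later
# augmentation moves them around.
zero_padded = np.pad(weight, pad, mode='constant', constant_values=0)

# Pretend every voxel has unit error; only the weighting differs.
error = np.ones_like(min_padded)
loss_min = float((error * min_padded).sum())    # 16*16*0.5 = 128.0
loss_zero = float((error * zero_padded).sum())  # 8*8*0.5   = 32.0
print(loss_min, loss_zero)
```

With min padding the padded border quadruples the weighted area; any rotation or deformation applied afterwards just moves those spuriously weighted voxels into the field of view.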

I would argue that the "weight" and "sampler" inputs should be padded with zeros so that the padded sections are never evaluated. Alternatively, the padding mode could be chosen per input section in the config, e.g.:

    [T1]
    path_to_search = ./example_volumes/monomodal_parcellation
    filename_contains = T1
    filename_not_contains =
    spatial_window_size = (32, 32, 32)
    pixdim = (1.0, 1.0, 1.0)
    axcodes = (A, R, S)
    interp_order = 3
    padding_mode = minimum

    [WEIGHT]
    path_to_search = ./example_volumes/monomodal_parcellation
    filename_contains = weights
    filename_not_contains =
    spatial_window_size = (32, 32, 32)
    pixdim = (1.0, 1.0, 1.0)
    axcodes = (A, R, S)
    interp_order = 3
    padding_mode = zero
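For reference, a minimal sketch of how such a per-section `padding_mode` option could be read and mapped onto `np.pad` — note that `padding_mode` and the `pad_volume` helper are proposed names for this issue, not existing NiftyNet parameters:

```python
import configparser
import numpy as np

# Hypothetical per-section option, as proposed above.
config = configparser.ConfigParser()
config.read_string("""
[T1]
padding_mode = minimum

[WEIGHT]
padding_mode = zero
""")

def pad_volume(vol, pad, mode):
    # Map the config value onto np.pad arguments.
    if mode == 'zero':
        return np.pad(vol, pad, mode='constant', constant_values=0)
    if mode == 'minimum':
        return np.pad(vol, pad, mode='minimum')
    raise ValueError(f'unknown padding_mode: {mode}')

vol = np.full((4, 4), 2.0)
for section in ('T1', 'WEIGHT'):
    mode = config.get(section, 'padding_mode')
    padded = pad_volume(vol, 2, mode)
    print(section, mode, padded[0, 0])
```

The image input keeps its minimum-value padding while the weight map is padded with zeros, so augmentations can no longer move weighted padding into the cost function.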