NifTK / NiftyNet

[unmaintained] An open-source convolutional neural networks platform for research in medical image analysis and image-guided therapy
http://niftynet.io
Apache License 2.0

Unable to run inference on customized highres3dnet #401

Closed: koriavinash1 closed this issue 5 years ago

koriavinash1 commented 5 years ago

Hi, I trained a 3D classification network using highres3dnet as a backbone, but I'm unable to run inference (it fails with a segmentation fault).

The command used for inference: python net_classify.py inference -c ./config/QualityControlConfigEval.ini

The corresponding config.ini:

===============================

[T1]
csv_file = ./T1_data.csv
path_to_search =
filename_contains =
filename_not_contains =
pixdim = (1.0, 1.0, 1.0)
spatial_window_size = (128, 128, 128)
interp_order = 3

[label]
csv_file = ./Label_data.csv
spatial_window_size = (1, 1, 1)
interp_order = -1

######################## system configuration sections
[SYSTEM]
cuda_devices = ""
dataset_split_file = ./cross_validation_fold_01.txt
model_dir = ./models/model_highres3dnet

[NETWORK]
name = highres3dnet
activation_function = relu
batch_size = 1
volume_padding_size=0
histogram_ref_file = ./models/model_highres3dnet/histogram_ref_file.txt
cutoff = (0.01, 0.99)
normalisation = True
whitening = True
normalise_foreground_only=True

[INFERENCE]
inference_iter = -1
border=1
# save_seg_dir=./INFERENCE
output_interp_order=0
spatial_window_size=(128, 128, 128)

[EVALUATION]
save_csv_dir = ./EVALUATION
evaluations = accuracy

[CLASSIFICATION]
image = T1
label = label
output_prob = False
num_classes = 2
===========================================
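
A configuration like the one above can be read back programmatically before launching inference, to confirm what the file actually contains. The following is a minimal sketch using Python's standard `configparser`; it is not NiftyNet's own loader, so no NiftyNet defaults are applied, and the helper name `load_settings` and the trimmed `CONFIG_TEXT` excerpt are illustrative assumptions. The `=====` delimiter lines are left out because they are not valid INI.

```python
import configparser

# Trimmed excerpt of the config posted above (assumption: the "====="
# delimiter lines belong to the forum post, not to the file itself).
CONFIG_TEXT = """
[T1]
csv_file = ./T1_data.csv
path_to_search =
spatial_window_size = (128, 128, 128)

[NETWORK]
name = highres3dnet
batch_size = 1
volume_padding_size=0

[INFERENCE]
inference_iter = -1
border=1
# save_seg_dir=./INFERENCE
spatial_window_size=(128, 128, 128)

[CLASSIFICATION]
image = T1
label = label
output_prob = False
num_classes = 2
"""


def load_settings(text):
    """Parse INI text into {section: {key: value}}; all values stay strings."""
    # Interpolation is disabled so that a literal '%' in a value cannot raise.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


settings = load_settings(CONFIG_TEXT)
for section in ("INFERENCE", "CLASSIFICATION"):
    for key, value in settings[section].items():
        print(f"[{section}] {key} = {value}")
```

The printed values can then be compared by eye with the effective configuration that NiftyNet echoes at startup. In the log in this thread, the echo reports `output_prob: True` and `border: (0, 0, 0)` where the file sets `False` and `1`; the thread does not establish whether that difference is related to the crash.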

This results in a segmentation fault after the model weights and data have been loaded. The output looks like:

NiftyNet version 0.5.0+78.g6eb3821.dirty
[CUSTOM]
-- num_classes: 2
-- output_prob: True
-- label_normalisation: False
-- sampler: ()
-- inferred: ()
-- label: ('label',)
-- image: ('T1',)
-- name: net_classify
[CONFIG_FILE]
-- path: /home/users/kavinash/QualityControl/NiftyNet/config/QualityControlConfigEval.ini
[T1]
-- csv_file: T1_data.csv
-- filename_contains: ()
-- filename_not_contains: ()
-- filename_removefromid: 
-- interp_order: 3
-- loader: None
-- pixdim: (1.0, 1.0, 1.0)
-- axcodes: ()
-- spatial_window_size: (128, 128, 128)
[LABEL]
-- csv_file: Label_data.csv
-- filename_contains: None
-- filename_not_contains: ()
-- filename_removefromid: 
-- interp_order: -1
-- loader: None
-- pixdim: ()
-- axcodes: ()
-- spatial_window_size: (1, 1, 1)
[SYSTEM]
-- cuda_devices: ""
-- num_threads: 1
-- num_gpus: 0
-- model_dir: /home/users/kavinash/QualityControl/NiftyNet/models/model_highres3dnet
-- dataset_split_file: cross_validation_fold_01.txt
-- event_handler: ('model_saver', 'model_restorer', 'sampler_threading', 'apply_gradients', 'output_interpreter', 'console_logger', 'tensorboard_logger', 'performance_logger')
-- iteration_generator: iteration_generator
-- action: inference
[NETWORK]
-- name: highres3dnet
-- activation_function: relu
-- batch_size: 1
-- smaller_final_batch_mode: pad
-- decay: 0.0
-- reg_type: L2
-- volume_padding_size: (0, 0, 0)
-- volume_padding_mode: minimum
-- volume_padding_to_size: (0,)
-- window_sampling: resize
-- queue_length: 2
-- multimod_foreground_type: and
-- histogram_ref_file: /home/users/kavinash/QualityControl/NiftyNet/models/model_highres3dnet/histogram_ref_file.txt
-- norm_type: percentile
-- cutoff: (0.01, 0.99)
-- foreground_type: otsu_plus
-- normalisation: True
-- rgb_normalisation: False
-- whitening: True
-- normalise_foreground_only: True
-- weight_initializer: he_normal
-- bias_initializer: zeros
-- keep_prob: 1.0
-- weight_initializer_args: {}
-- bias_initializer_args: {}
[INFERENCE]
-- spatial_window_size: (128, 128, 128)
-- inference_iter: 1120
-- dataset_to_infer: 
-- save_seg_dir: ./INFERENCE
-- output_postfix: _niftynet_out
-- output_interp_order: 0
-- border: (0, 0, 0)
-- fill_constant: 0.0
[EVALUATION]
-- evaluations: accuracy
-- save_csv_dir: ./EVALUATION
INFO:niftynet: starting classification application
INFO:niftynet: [T1] using existing csv file /oak/stanford/groups/russpold/data/openneuro.org/derivatives/kavinashscoolnet-1.0.0/csv_data/T1_data.csv, skipped filenames search
INFO:niftynet: [label] using existing csv file /oak/stanford/groups/russpold/data/openneuro.org/derivatives/kavinashscoolnet-1.0.0/csv_data/Label_data.csv, skipped filenames search
WARNING:niftynet: Loading from existing partitioning file /oak/stanford/groups/russpold/data/openneuro.org/derivatives/kavinashscoolnet-1.0.0/csv_data/cross_validation_fold_01.txt, ignoring partitioning ratios.
INFO:niftynet: 

Number of subjects 272, input section names: ['subject_id', 'T1', 'label']
Dataset partitioning:
-- training 190 cases (69.85%),
-- validation 54 cases (19.85%),
-- inference 28 cases (10.29%).

INFO:niftynet: Image reader: loading 28 subjects from sections ('T1',) as input [image]
INFO:niftynet: normalisation histogram reference models ready for image:('T1',)
2019-06-07 12:33:21.128970: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2019-06-07 12:33:21.135517: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-06-07 12:33:21.135669: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5557d12ca240 executing computations on platform Host. Devices:
2019-06-07 12:33:21.135703: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
INFO:niftynet: reading size of preprocessed images
WARNING:niftynet: sampler queue_length should be larger than batch_size, defaulting to batch_size * 5.0 (5).
INFO:niftynet: initialised resize sampler {'image': (1, 128, 128, 128, 1, 1), 'image_location': (1, 7)} 
WARNING:niftynet: From /home/users/kavinash/QualityControl/NiftyNet/niftynet/engine/application_initializer.py:106: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with distribution=normal is deprecated and will be removed in a future version.
Instructions for updating:
`normal` is a deprecated alias for `truncated_normal`
INFO:niftynet: using HighRes3DNet
INFO:niftynet: Initialising Dataset from 28 subjects...
WARNING:niftynet: From /home/users/kavinash/QualityControl/NiftyNet/niftynet/engine/image_window_dataset.py:300: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, use
    tf.py_function, which takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.

INFO:niftynet: net_out.shape may need to be resized: (1, 2)
INFO:niftynet: Restoring parameters from /home/users/kavinash/QualityControl/NiftyNet/models/model_highres3dnet/models/model.ckpt-1120

Segmentation fault

ericspod commented 5 years ago

Is this perhaps an issue with your TensorFlow install? A segfault indicates that the problem lies somewhere lower down than NiftyNet. What versions of TensorFlow and Python are you using? What hardware platform are you on (I see the log mentions XLA)?
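
The version details asked for here can be collected in one go. The sketch below is only an illustration: the function names and the default package list are assumptions, and `importlib.metadata` requires Python 3.8 or later (on Python 3.7 the `importlib_metadata` backport would be needed instead).

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("tensorflow", "niftynet", "numpy")):
    """Collect interpreter, platform, and package versions into a dict."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    # Assumption: these are the distribution names in use. A TensorFlow build
    # installed under another name, or NiftyNet run from a git checkout
    # without being installed, will show up as None here.
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value if value is not None else 'not installed'}")
```

Pasting this output into a report saves a round trip of questions when a crash looks environment-related.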

koriavinash1 commented 5 years ago

Thanks for your reply. My current environment is:

TensorFlow 1.13.1 (I made the necessary edits in NiftyNet so that it supports this version of TF)
Python 3.7.2
Hardware: CPU cluster with 64 GB of RAM

ericspod commented 5 years ago

I'm afraid this looks like a technical issue with your setup. Please try running inference again with NiftyNet on TensorFlow 1.12 and a regular desktop computer. We're working on getting later versions of TF working and have encountered enough differences for it to be a challenge.

koriavinash1 commented 5 years ago

But I'm able to train the model with the same setup; the error occurs only when running inference.

ericspod commented 5 years ago

That is really odd, I admit, but I can't imagine what in NiftyNet could possibly cause a segmentation fault. Since NiftyNet is all Python, the error must be in the TensorFlow library or in something it interacts with below the Python level, so doing this test is all I can think of. Trying it on CPU rather than on GPU might also work.
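
When a crash happens below the Python level, CPython's standard `faulthandler` module can at least show which Python call was active at the time. The following is a minimal sketch, assuming it is placed at the top of the entry script; the helper name `enable_crash_traceback` and the log path are hypothetical.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Dump Python tracebacks of all threads to log_path on fatal signals.

    faulthandler writes through the file descriptor, so the returned file
    object must be kept open for as long as the handler is enabled.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Hypothetical log location; any writable path works.
    crash_log = enable_crash_traceback("crash_traceback.log")
    print("faulthandler enabled:", faulthandler.is_enabled())
```

The same effect is available without editing any code by starting the interpreter with `python -X faulthandler net_classify.py inference -c ./config/QualityControlConfigEval.ini` or by setting the `PYTHONFAULTHANDLER` environment variable, in which case the traceback goes to standard error. The traceback only identifies the Python frame that called into native code; it does not explain the fault inside the native library.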

koriavinash1 commented 5 years ago

Thanks, I'll check that and report back.

koriavinash1 commented 5 years ago

It suddenly started working after I reinstalled all the packages.