Closed jefflgaol closed 4 years ago
Check whether you have any data in the testing folder.
I also have data in testing and evaluation folders.
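A quick way to confirm what each data folder actually contains is to count the case sub-folders that hold every expected file. The sketch below is only an illustration: the helper name `find_complete_cases` is hypothetical, and the layout (one sub-folder per case containing `img.nii.gz` and `label.nii.gz`) is assumed from the config and log posted in this thread.

```python
import os


def find_complete_cases(data_dir, image_filenames=("img.nii.gz",),
                        label_filename="label.nii.gz"):
    """Return the sorted names of case sub-folders that contain every expected file.

    Hypothetical helper: the folder layout is an assumption, not taken from the repo.
    """
    if not os.path.isdir(data_dir):
        return []
    expected = list(image_filenames) + [label_filename]
    complete = []
    for case in sorted(os.listdir(data_dir)):
        case_dir = os.path.join(data_dir, case)
        if not os.path.isdir(case_dir):
            continue  # skip stray files next to the case folders
        if all(os.path.isfile(os.path.join(case_dir, name)) for name in expected):
            complete.append(case)
    return complete


if __name__ == "__main__":
    # Directory names as they appear in the posted config.json.
    for folder in ("./data/training", "./data/testing", "./data/evaluate"):
        cases = find_complete_cases(folder)
        print("{}: {} complete case(s)".format(folder, len(cases)))
```

If a folder reports fewer complete cases than the configured batch size, that is worth checking before looking anywhere else.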
Can you post the config.json for me to take a look?
This error means the TensorFlow Dataset API cannot process the data properly. If possible, paste the full error output log.
This is my config.json:
{
    "ProjectName": "VNet Tensorflow",
    "ProjectDetail": {
        "BodyPart": "Liver",
        "Diseases": "Lesion"
    },
    "TrainingSetting": {
        "Data": {
            "TrainingDataDirectory": "./data/training",
            "TestingDataDirectory": "./data/testing",
            "ImageFilenames": ["img.nii.gz"],
            "LabelFilename": "label.nii.gz"
        },
        "Restore": true,
        "SegmentationClasses": [0,1,2],
        "LogDir": "./tmp/log",
        "CheckpointDir": "./tmp/ckpt",
        "BatchSize": 32,
        "PatchShape": [256,256,32],
        "ImageLog": false,
        "Testing": false,
        "TestStep": 30,
        "Epoches": 99999,
        "MaxIterations": 15000,
        "LogInterval": 25,
        "Networks": {
            "Name": "VNet",
            "Dropout": 0.01
        },
        "Loss": "weighted_sorensen",
        "Optimizer": {
            "Name": "Adam",
            "InitialLearningRate": 1e-2,
            "Momentum": 0.9,
            "Decay": {
                "Factor": 0.99,
                "Steps": 100
            }
        },
        "Spacing": [0.75,0.75,0.75],
        "DropRatio": 0.01,
        "MinPixel": 30
    },
    "EvaluationSetting": {
        "Data": {
            "EvaluateDataDirectory": "./data/evaluate",
            "ImageFilenames": ["img.nii.gz"],
            "LabelFilename": "label.nii.gz",
            "ProbabilityFilename": "probability_tf.nii.gz"
        },
        "CheckpointPath": "./tmp/ckpt/checkpoint-0",
        "Stride": [256,256,32],
        "BatchSize": 1,
        "ProbabilityOutput": false
    }
}
and this is my log:
python3 main.py --config_json config.json --gpu 1
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
2020-03-21 20:47:38.062457: Reading configuration file...
2020-03-21 20:47:38.062556: Reading configuration file complete
2020-03-21 20:47:38.062581: Start to build model graph...
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
iterator
WARNING:tensorflow:From /home/cwlab913/vnet-tensorflow/NiftiDataset3D.py:45: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, use
tf.py_function, which takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
2020-03-21 20:47:38.093575: Dataset pipeline complete
2020-03-21 20:47:38.093926: Core network complete
WARNING:tensorflow:From /home/cwlab913/vnet-tensorflow/networks.py:259: batch_normalization (from tensorflow.python.layers.normalization) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.batch_normalization instead.
2020-03-21 20:47:40.752396: Output layers complete
2020-03-21 20:47:40.789966: Loss function complete
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:1472: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2020-03-21 20:47:40.884969: Metrics complete
2020-03-21 20:47:40.885004: Build graph complete
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2020-03-21 20:47:48.697176: Start training...
2020-03-21 20:47:48.697257: Setting up Saver...
2020-03-21 20:47:49.023796: Last checkpoint epoch: 0
2020-03-21 20:47:49.404576: Last checkpoint global step: 0
2020-03-21 20:47:52.088255: Epoch 1 starts...
2020-03-21 20:47:52.617702: Set network to training ok
./data/training/case1/img.nii.gz
./data/training/case5/img.nii.gz
./data/training/case3/img.nii.gz
./data/training/case2/img.nii.gz
./data/training/case4/img.nii.gz
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[{{node IteratorGetNext}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cwlab913/vnet-tensorflow/model.py", line 639, in train
image, label = self.sess.run(self.next_element_train)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[node IteratorGetNext (defined at /home/cwlab913/vnet-tensorflow/model.py:327) ]]
Caused by op 'IteratorGetNext', defined at:
File "main.py", line 83, in <module>
main(args)
File "main.py", line 75, in main
model.train()
File "/home/cwlab913/vnet-tensorflow/model.py", line 542, in train
self.build_model_graph()
File "/home/cwlab913/vnet-tensorflow/model.py", line 327, in build_model_graph
self.next_element_train = self.train_iterator.get_next()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 414, in get_next
output_shapes=self._structure._flat_shapes, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_dataset_ops.py", line 1685, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
OutOfRangeError (see above for traceback): End of sequence
[[node IteratorGetNext (defined at /home/cwlab913/vnet-tensorflow/model.py:327) ]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 83, in <module>
main(args)
File "main.py", line 75, in main
model.train()
File "/home/cwlab913/vnet-tensorflow/model.py", line 698, in train
print("{}: Training of epoch {} complete, epoch loss: {}".format(datetime.datetime.now(),epoch+1,loss_sum/count))
ZeroDivisionError: division by zero
I think you only have 5 images but your batch size is 32, so the data can't form one full batch.
Change it to a smaller size like 1. Since this is 3D training, it will consume quite a lot of GPU memory if the batch size is too big.
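The arithmetic behind this can be sketched in plain Python. This is only an illustration under two assumptions: that the input pipeline discards incomplete batches (which is what the explanation above implies), and that each image counts as one sample. The function names `full_batches` and `mean_epoch_loss` are hypothetical and do not come from the repository.

```python
def full_batches(num_samples, batch_size):
    """Number of complete batches when incomplete ones are dropped (assumed behaviour)."""
    return num_samples // batch_size


def mean_epoch_loss(loss_sum, count):
    """Average loss over an epoch, with an explicit error when no batch was processed."""
    if count == 0:
        raise ValueError(
            "No batches were produced this epoch; "
            "check that the batch size does not exceed the number of samples."
        )
    return loss_sum / count


if __name__ == "__main__":
    # 5 training images with BatchSize 32: no complete batch, so the iterator
    # ends immediately and the batch counter stays at zero.
    print(full_batches(5, 32))
    # With BatchSize 1 every image forms its own batch.
    print(full_batches(5, 1))
```

With zero batches, the first fetch from the iterator ends the sequence (the `OutOfRangeError` in the log), and the epoch summary then divides `loss_sum` by a `count` of 0, which matches the final `ZeroDivisionError`. A guard like the one in `mean_epoch_loss` would surface the real cause instead.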
Gosh! I forgot to set that to the correct batch size. Thank you very much!
Hi, there! Amazing work you have here. But I have a question. I tried to run your main.py like this:
Unfortunately, the terminal showed several issues:
So I tried printing the path produced by input_parser in NiftiDataset3D like this:
and the result is also fine:
Do you have any insight into these issues? Note: I am currently using TensorFlow v1.13.1.