NifTK / NiftyNet

[unmaintained] An open-source convolutional neural networks platform for research in medical image analysis and image-guided therapy
http://niftynet.io
Apache License 2.0

Cannot share and run inference on a NiftyNet-trained model #326

Open yuanpeng5 opened 5 years ago

yuanpeng5 commented 5 years ago

I am trying to deploy a NiftyNet-trained model in TensorFlow Serving, but I have encountered the following problems: (1) I cannot freeze the model with the freeze_graph.py provided in TensorFlow. (2) It is really hard to find the input and output tensor names in the graph. Is there any way I can easily freeze a NiftyNet-trained model and deploy it? Right now I can only evaluate the trained model inside the NiftyNet framework; it is hard to run inference anywhere else.

I have also attached my model files to the issue. If anyone can tell me how to find the correct input and output tensors and deploy the model, it will be very much appreciated. Thanks in advance.
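One way to shortlist candidate names without opening TensorBoard: in a GraphDef, likely inputs are Placeholder nodes, and likely outputs are nodes that no other node consumes. The output-side check is plain graph bookkeeping, sketched below in pure Python; the worked example's node names are hypothetical. With TF 1.x you would build `node_inputs` from the checkpoint, e.g. `tf.train.import_meta_graph("model.ckpt-10000.meta")` followed by `{n.name: list(n.input) for n in tf.get_default_graph().as_graph_def().node}` (path hypothetical).

```python
def candidate_outputs(node_inputs):
    """node_inputs maps node name -> list of its input tensor names.
    Nodes whose outputs nothing else consumes are likely graph outputs."""
    # Strip the ":0" tensor index and the "^" control-dependency prefix
    # that GraphDef input strings carry.
    consumed = {src.split(":")[0].lstrip("^")
                for ins in node_inputs.values() for src in ins}
    return sorted(n for n in node_inputs if n not in consumed)

# Tiny worked example: a -> b -> c, so only "c" is unconsumed.
example = {"a": [], "b": ["a:0"], "c": ["b:0"]}
print(candidate_outputs(example))  # → ['c']
```

This usually returns some false positives (savers, summaries, iterators), so it narrows the search rather than settling it.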

new_malig.zip

yuanpeng5 commented 5 years ago

Updates: I have managed to find the network input and output tensor names:

"worker_0/cond/validation/Squeeze:0"
"worker_0/MaligNet/conv_output_bn/bn_/batch_norm/add_1:0"

But when I use these two tensors as the input and output tensors, the network crashes during inference with the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: ValueError: callback pyfunc_0 is not found
Traceback (most recent call last):

  File "/home/pwu/dp_server_env/lib/python3.5/site-packages/tensorflow/python/ops/script_ops.py", line 195, in __call__
    raise ValueError("callback %s is not found" % token)

ValueError: callback pyfunc_0 is not found

         [[Node: PyFunc = PyFunc[Tin=[DT_INT64], Tout=[DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], token="pyfunc_0", _device="/device:GPU:0"](arg0)]]
         [[Node: worker_0/cond/train/IteratorGetNext = IteratorGetNext[output_shapes=[[20,32,32,32,1,1], [20,7], [20,1,1,1,1,1], [20,7]], output_types=[DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"](worker_0/cond/train/OneShotIterator)]]

@wyli Any idea? Is it because the checkpoint didn't store pyfunc_0? How can NiftyNet itself run through it?

wyli commented 5 years ago

Hi @yuanpeng5, have you looked at this example? https://github.com/NifTK/NiftyNet/issues/254#issuecomment-431793018 (it still requires NiftyNet's IO, though).

yuanpeng5 commented 5 years ago

@wyli Thanks a lot, I will try that. By the way, I found that there is a merge tensor that merges all the workers' tensors together, and that is exactly the one causing trouble at inference time. If I skip that one and use the merged tensor as the input tensor instead, everything works fine now.
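The cut described above amounts to extracting the sub-graph between the merged tensor (treated as the new input) and the output node, which drops the py_func-backed input pipeline entirely. TF 1.x ships this as `tf.graph_util.extract_sub_graph`; the dependency walk it performs can be sketched in plain Python (all node names below are hypothetical stand-ins for the ones in this thread):

```python
from collections import deque

def reachable_subgraph(node_inputs, output, new_input):
    """Walk input edges backwards from `output`, stopping at `new_input`
    (the tensor we will feed directly at inference time). Everything
    upstream of the cut — iterators, py_funcs — is never visited."""
    keep, queue = set(), deque([output])
    while queue:
        n = queue.popleft()
        if n in keep:
            continue
        keep.add(n)
        if n == new_input:
            continue  # cut here: do not pull in its producers
        for src in node_inputs.get(n, []):
            queue.append(src.split(":")[0].lstrip("^"))
    return keep

# Hypothetical graph mirroring the thread: the iterator/py_func pipeline
# feeds the merge node, but cutting at "merge" leaves them out.
g = {"pyfunc": [], "iterator": ["pyfunc:0"], "merge": ["iterator:0"],
     "conv": ["merge:0"], "out": ["conv:0"]}
print(sorted(reachable_subgraph(g, "out", "merge")))  # → ['conv', 'merge', 'out']
```

This is why the `pyfunc_0 is not found` error disappears: the py_func callback only exists in the Python process that built the graph, so any served sub-graph has to exclude it.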

wyli commented 5 years ago

@yuanpeng5 Cool, could you give more details of this inference model deployment? It could be a generic feature to have upstream. Thanks!

guigautier commented 5 years ago

Hi, first, thanks a lot for the NiftyNet framework. I would like to deploy a NiftyNet-trained model in TensorFlow Serving. I used the freeze_graph function to freeze the model and weights into a .pb file (you need to know the output node names) and trained on only one GPU (with multiple GPUs you can try multiple --output_node_names, e.g. out1, out2). The model works fine, but the input tensor shape is fixed by NiftyNet's BaseApplication due to the patch-based train/inference. I tried to use a dynamic shape (-1, -1, -1), but then I need to reshape all my input data, losing the benefit of NiftyNet's automatic window sampler. Do you know if it's possible to train a model with a "None" input shape (like tf.placeholder_with_default()) and keep the NiftyNet data preprocessing? Thanks

ericspod commented 5 years ago

There doesn't appear to be a way to do this currently; we can add it as an enhancement item.
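In the meantime, since the frozen graph's input stays fixed at the training patch size (32³ in the error trace above), one workaround outside NiftyNet is to tile volumes into fixed-size windows yourself before feeding the serving endpoint. A minimal NumPy sketch, without the overlapping windows and border blending that NiftyNet's window sampler and aggregator provide:

```python
import numpy as np

def tile_volume(vol, patch=(32, 32, 32)):
    """Zero-pad `vol` up to a multiple of `patch`, then cut
    non-overlapping fixed-size windows plus their origins, so the
    predictions can be stitched back into the padded volume."""
    pads = [(0, (-s) % p) for s, p in zip(vol.shape, patch)]
    padded = np.pad(vol, pads, mode="constant")
    windows, origins = [], []
    for z in range(0, padded.shape[0], patch[0]):
        for y in range(0, padded.shape[1], patch[1]):
            for x in range(0, padded.shape[2], patch[2]):
                windows.append(padded[z:z + patch[0],
                                      y:y + patch[1],
                                      x:x + patch[2]])
                origins.append((z, y, x))
    return np.stack(windows), origins

vol = np.zeros((40, 32, 50), dtype=np.float32)
wins, origins = tile_volume(vol)
print(wins.shape)  # (4, 32, 32, 32): 2 x 1 x 2 windows
```

Plain zero-padding at the borders will not match NiftyNet's sampling exactly, so treat this as a rough stand-in rather than an equivalent pipeline.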

helwilliams commented 5 years ago

Hello, I know this was a long time ago, but I am just wondering how you found the input/output nodes. I am trying to freeze my own graph to export TF to ONNX and am struggling to find the right one.