Project-MONAI / model-zoo

MONAI Model Zoo that hosts models in the MONAI Bundle format.
Apache License 2.0

trt_export error in "lung_nodule_ct_detection" #568

Closed. KumoLiu closed this issue 2 months ago

KumoLiu commented 3 months ago
03/15/2024 02:05:41 AM bundle_test _run_commands INFO: Executing export PYTHONPATH=$PYTHONPATH:/workspace/lung_nodule_ct_detection && echo $PYTHONPATH && python -m monai.bundle trt_export --net_id network_def --filepath /workspace/lung_nodule_ct_detection/models/model_trt.ts --ckpt_file /workspace/lung_nodule_ct_detection/models/model.pt --precision fp16 --meta_file /workspace/lung_nodule_ct_detection/configs/metadata.json --config_file /workspace/lung_nodule_ct_detection/configs/inference.json --logging_file /workspace/lung_nodule_ct_detection/configs/logging.conf  --input_shape "[1, 1, 512, 512, 192]" --use_onnx "True" --use_trace "True" --onnx_output_names "['output_0', 'output_1', 'output_2', 'output_3', 'output_4', 'output_5']" --network_def#use_list_output "True"  2>&1 | tee running.log
:/workspace/lung_nodule_ct_detection
2024-03-15 02:05:52,411 - INFO - --- input summary of monai.bundle.scripts.trt_export ---
2024-03-15 02:05:52,411 - INFO - > net_id: 'network_def'
2024-03-15 02:05:52,411 - INFO - > filepath: '/workspace/lung_nodule_ct_detection/models/model_trt.ts'
2024-03-15 02:05:52,411 - INFO - > meta_file: '/workspace/lung_nodule_ct_detection/configs/metadata.json'
2024-03-15 02:05:52,411 - INFO - > config_file: '/workspace/lung_nodule_ct_detection/configs/inference.json'
2024-03-15 02:05:52,411 - INFO - > ckpt_file: '/workspace/lung_nodule_ct_detection/models/model.pt'
2024-03-15 02:05:52,411 - INFO - > precision: 'fp16'
2024-03-15 02:05:52,411 - INFO - > input_shape: [1, 1, 512, 512, 192]
2024-03-15 02:05:52,411 - INFO - > use_trace: True
2024-03-15 02:05:52,411 - INFO - > use_onnx: True
2024-03-15 02:05:52,411 - INFO - > onnx_output_names: ['output_0', 'output_1', 'output_2', 'output_3', 'output_4', 'output_5']
2024-03-15 02:05:52,411 - INFO - > logging_file: '/workspace/lung_nodule_ct_detection/configs/logging.conf'
2024-03-15 02:05:52,411 - INFO - > network_def#use_list_output: True
2024-03-15 02:05:52,412 - INFO - ---

There is no dynamic batch range. The converted model only takes [1, 1, 512, 512, 192] shape input.
Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
Encountering a list at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for `list`, use a `tuple` instead. for `dict`, use a `NamedTuple` instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.
[03/15/2024-02:06:57] [TRT] [W] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/15/2024-02:11:40] [TRT] [E] 10: Could not find any implementation for node {ForeignNode[/classification_head/conv/conv.1/Constant_1_output_0 + (Unnamed Layer* 130) [Shuffle].../regression_head/conv/conv.2/Relu]}.
[03/15/2024-02:11:40] [TRT] [E] 10: [optimizer.cpp::computeCosts::3869] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[/classification_head/conv/conv.1/Constant_1_output_0 + (Unnamed Layer* 130) [Shuffle].../regression_head/conv/conv.2/Relu]}.)
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/monai/bundle/__main__.py", line 31, in <module>
    fire.Fire()
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/monai/bundle/scripts.py", line 1516, in trt_export
    _export(
  File "/usr/local/lib/python3.10/dist-packages/monai/bundle/scripts.py", line 1091, in _export
    net = converter(model=net, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/monai/networks/utils.py", line 965, in convert_to_trt
    trt_model = _onnx_trt_compile(
  File "/usr/local/lib/python3.10/dist-packages/monai/networks/utils.py", line 849, in _onnx_trt_compile
    f.write(serialized_engine)
TypeError: a bytes-like object is required, not 'NoneType'

binliunls commented 3 months ago

I think this line in the log is the root cause:

[03/15/2024-02:11:40] [TRT] [E] 10: Could not find any implementation for node {ForeignNode[/classification_head/conv/conv.1/Constant_1_output_0 + (Unnamed Layer* 130) [Shuffle].../regression_head/conv/conv.2/Relu]}.

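TensorRT could not find an implementation for this node, which looks like an unsupported operator, so no serialized engine was generated. The `TypeError` at `f.write(serialized_engine)` is only a symptom: `serialized_engine` is `None` because the engine build failed.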
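
To make the link between the TensorRT error and the final `TypeError` concrete, here is a minimal sketch of a guard that fails with a clearer message when the build step returns nothing. The helper name `save_serialized_engine` and its message are hypothetical and not part of MONAI. It only assumes what the traceback shows: the value passed to `f.write` can be `None` when the engine build fails.

```python
from typing import Optional


def save_serialized_engine(serialized_engine: Optional[bytes], path: str) -> None:
    """Write a serialized engine to `path`, failing loudly if the build produced nothing.

    Hypothetical helper for illustration; `serialized_engine` stands for whatever
    bytes-like object the engine build step returned (or None on failure).
    """
    if serialized_engine is None:
        # Surface the real problem instead of the later
        # "a bytes-like object is required, not 'NoneType'" error.
        raise RuntimeError(
            "TensorRT engine build failed and returned None; "
            "check the [TRT] [E] lines in the log for the actual cause."
        )
    with open(path, "wb") as f:
        f.write(serialized_engine)
```

With a check like this, the run would stop at the engine build with a message pointing back to the `[TRT] [E]` lines, rather than at the file write.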

binliunls commented 3 months ago

This error is caused by the wrong driver version. Please update the driver to the latest version.
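
Before updating, it can help to record which driver version is currently installed. The following is a small sketch, assuming an NVIDIA driver with the `nvidia-smi` tool on the `PATH`. The function name `query_driver_version` is hypothetical; it returns `None` when the tool is missing or the query fails, instead of raising.

```python
import shutil
import subprocess
from typing import Optional


def query_driver_version(executable: str = "nvidia-smi") -> Optional[str]:
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which(executable) is None:
        return None  # tool not installed or not on PATH
    try:
        result = subprocess.run(
            [executable, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # One line is printed per GPU; report the first one.
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    print(query_driver_version() or "driver version unavailable")
```

Comparing the printed version before and after the update confirms that the new driver is the one actually in use when `trt_export` is rerun.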