pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

šŸ› [Bug] DLRM model conversion failed #662

Closed vinhngx closed 2 years ago

vinhngx commented 2 years ago

Bug Description

Conversion of the DLRM model from NGC (https://ngc.nvidia.com/catalog/models) fails. A work-in-progress prototype is here: /mnt/nvdl/usr/vinhn/trtorch-perf-benchmar/DLRM

The compiler reports several unsupported ops:

ERROR: Method requested cannot be compiled by TRTorch.
Unsupported operators listed below:
  - aten::index.Tensor(Tensor self, Tensor?[] indices) -> (Tensor)
  - aten::to.dtype_layout(Tensor self, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=False, bool copy=False, int? memory_format=None) -> (Tensor)
You can either implement converters for these ops in your application or request implementation
https://www.github.com/nvidia/TRTorch/issues

In Module:

ERROR: Unsupported operator: aten::to.dtype_layout(Tensor self, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=False, bool copy=False, int? memory_format=None) -> (Tensor)
/workspace/dlrm/dlrm/nn/interactions.py(64): interact
/workspace/dlrm/dlrm/nn/parts.py(134): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1003): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1015): _call_impl
/workspace/dlrm/dlrm/model/single.py(81): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1003): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1015): _call_impl
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(950): trace_module
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
<ipython-input-15-e6a5d3ce063e>(1): <module>
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3437): run_code
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3357): run_ast_nodes
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3165): run_cell_async
/opt/conda/lib/python3.8/site-packages/IPython/core/async_helpers.py(68): _pseudo_sync_runner
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(2940): _run_cell
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(2894): run_cell
/opt/conda/lib/python3.8/site-packages/ipykernel/zmqshell.py(539): run_cell
/opt/conda/lib/python3.8/site-packages/ipykernel/ipkernel.py(302): do_execute
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(536): execute_request
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(261): dispatch_shell
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(358): process_one
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(775): run
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(814): inner
/opt/conda/lib/python3.8/site-packages/tornado/ioloop.py(741): _run_callback
/opt/conda/lib/python3.8/site-packages/tornado/ioloop.py(688): <lambda>
/opt/conda/lib/python3.8/asyncio/events.py(81): _run
/opt/conda/lib/python3.8/asyncio/base_events.py(1859): _run_once
/opt/conda/lib/python3.8/asyncio/base_events.py(570): run_forever
/opt/conda/lib/python3.8/site-packages/tornado/platform/asyncio.py(199): start
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelapp.py(612): start
/opt/conda/lib/python3.8/site-packages/traitlets/config/application.py(845): launch_instance
/opt/conda/lib/python3.8/site-packages/ipykernel_launcher.py(16): <module>
/opt/conda/lib/python3.8/runpy.py(87): _run_code
/opt/conda/lib/python3.8/runpy.py(194): _run_module_as_main
Serialized   File "code/__torch__/dlrm/nn/parts.py", line 34
    _7 = torch.select(CONSTANTS.c0, 0, 1)
    _8 = torch.slice(interaction, 0, 0, 9223372036854775807, 1)
    _9 = torch.to(_6, dtype=4, layout=0, device=torch.device("cuda:0"), pin_memory=None, non_blocking=False, copy=False, memory_format=None)
         ~~~~~~~~ <--- HERE
    _10 = torch.to(_7, dtype=4, layout=0, device=torch.device("cuda:0"), pin_memory=None, non_blocking=False, copy=False, memory_format=None)
    _11 = annotate(List[Optional[Tensor]], [None, _9, _10])

ERROR: Unsupported operator: aten::to.dtype_layout(Tensor self, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=False, bool copy=False, int? memory_format=None) -> (Tensor)
/workspace/dlrm/dlrm/nn/interactions.py(64): interact
/workspace/dlrm/dlrm/nn/parts.py(134): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1003): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1015): _call_impl
/workspace/dlrm/dlrm/model/single.py(81): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1003): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1015): _call_impl
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(950): trace_module
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
<ipython-input-15-e6a5d3ce063e>(1): <module>
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3437): run_code
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3357): run_ast_nodes
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3165): run_cell_async
/opt/conda/lib/python3.8/site-packages/IPython/core/async_helpers.py(68): _pseudo_sync_runner
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(2940): _run_cell
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(2894): run_cell
/opt/conda/lib/python3.8/site-packages/ipykernel/zmqshell.py(539): run_cell
/opt/conda/lib/python3.8/site-packages/ipykernel/ipkernel.py(302): do_execute
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(536): execute_request
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(261): dispatch_shell
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(358): process_one
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(775): run
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(814): inner
/opt/conda/lib/python3.8/site-packages/tornado/ioloop.py(741): _run_callback
/opt/conda/lib/python3.8/site-packages/tornado/ioloop.py(688): <lambda>
/opt/conda/lib/python3.8/asyncio/events.py(81): _run
/opt/conda/lib/python3.8/asyncio/base_events.py(1859): _run_once
/opt/conda/lib/python3.8/asyncio/base_events.py(570): run_forever
/opt/conda/lib/python3.8/site-packages/tornado/platform/asyncio.py(199): start
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelapp.py(612): start
/opt/conda/lib/python3.8/site-packages/traitlets/config/application.py(845): launch_instance
/opt/conda/lib/python3.8/site-packages/ipykernel_launcher.py(16): <module>
/opt/conda/lib/python3.8/runpy.py(87): _run_code
/opt/conda/lib/python3.8/runpy.py(194): _run_module_as_main
Serialized   File "code/__torch__/dlrm/nn/parts.py", line 35
    _8 = torch.slice(interaction, 0, 0, 9223372036854775807, 1)
    _9 = torch.to(_6, dtype=4, layout=0, device=torch.device("cuda:0"), pin_memory=None, non_blocking=False, copy=False, memory_format=None)
    _10 = torch.to(_7, dtype=4, layout=0, device=torch.device("cuda:0"), pin_memory=None, non_blocking=False, copy=False, memory_format=None)
          ~~~~~~~~ <--- HERE
    _11 = annotate(List[Optional[Tensor]], [None, _9, _10])
    interaction_flat = torch.index(_8, _11)

ERROR: Unsupported operator: aten::index.Tensor(Tensor self, Tensor?[] indices) -> (Tensor)
/workspace/dlrm/dlrm/nn/interactions.py(64): interact
/workspace/dlrm/dlrm/nn/parts.py(134): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1003): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1015): _call_impl
/workspace/dlrm/dlrm/model/single.py(81): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1003): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1015): _call_impl
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(950): trace_module
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
<ipython-input-15-e6a5d3ce063e>(1): <module>
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3437): run_code
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3357): run_ast_nodes
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(3165): run_cell_async
/opt/conda/lib/python3.8/site-packages/IPython/core/async_helpers.py(68): _pseudo_sync_runner
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(2940): _run_cell
/opt/conda/lib/python3.8/site-packages/IPython/core/interactiveshell.py(2894): run_cell
/opt/conda/lib/python3.8/site-packages/ipykernel/zmqshell.py(539): run_cell
/opt/conda/lib/python3.8/site-packages/ipykernel/ipkernel.py(302): do_execute
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(536): execute_request
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(261): dispatch_shell
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(234): wrapper
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelbase.py(358): process_one
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(775): run
/opt/conda/lib/python3.8/site-packages/tornado/gen.py(814): inner
/opt/conda/lib/python3.8/site-packages/tornado/ioloop.py(741): _run_callback
/opt/conda/lib/python3.8/site-packages/tornado/ioloop.py(688): <lambda>
/opt/conda/lib/python3.8/asyncio/events.py(81): _run
/opt/conda/lib/python3.8/asyncio/base_events.py(1859): _run_once
/opt/conda/lib/python3.8/asyncio/base_events.py(570): run_forever
/opt/conda/lib/python3.8/site-packages/tornado/platform/asyncio.py(199): start
/opt/conda/lib/python3.8/site-packages/ipykernel/kernelapp.py(612): start
/opt/conda/lib/python3.8/site-packages/traitlets/config/application.py(845): launch_instance
/opt/conda/lib/python3.8/site-packages/ipykernel_launcher.py(16): <module>
/opt/conda/lib/python3.8/runpy.py(87): _run_code
/opt/conda/lib/python3.8/runpy.py(194): _run_module_as_main
Serialized   File "code/__torch__/dlrm/nn/parts.py", line 37
    _10 = torch.to(_7, dtype=4, layout=0, device=torch.device("cuda:0"), pin_memory=None, non_blocking=False, copy=False, memory_format=None)
    _11 = annotate(List[Optional[Tensor]], [None, _9, _10])
    interaction_flat = torch.index(_8, _11)
                       ~~~~~~~~~~~ <--- HERE
    zeros_padding = torch.zeros([_5, 1], dtype=6, layout=None, device=torch.device("cuda:0"), pin_memory=False)
    _12 = [argument_2, interaction_flat, zeros_padding]

ERROR: Module is not currently supported by TRTorch
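
For context, the serialized graph above calls torch.index with the index list [None, _9, _10], which is what tracing produces for advanced indexing of the form interaction[:, rows, cols]. The sketch below assumes that form (the source line at interactions.py(64) is not shown in this issue) and compares it with a flattened index_select that selects the same elements. The function names and the use of torch.tril_indices are hypothetical, chosen only for illustration, and nothing here confirms that any given TRTorch version has a converter for index_select.

```python
import torch


def tril_gather_advanced(interaction, rows, cols):
    # Advanced indexing on the last two dims; when traced, this shows up as
    # aten::index.Tensor with indices [None, rows, cols].
    return interaction[:, rows, cols]


def tril_gather_flat(interaction, rows, cols):
    # Same selection expressed on a flattened view: element (r, c) of an
    # n x n matrix sits at position r * n + c after flattening.
    n = interaction.shape[2]
    flat_idx = rows * n + cols
    return interaction.flatten(1).index_select(1, flat_idx)


torch.manual_seed(0)
batch, n = 2, 4
interaction = torch.randn(batch, n, n)

# Hypothetical index source: strictly lower-triangular positions (int64).
idx = torch.tril_indices(n, n, offset=-1)
rows, cols = idx[0], idx[1]

a = tril_gather_advanced(interaction, rows, cols)
b = tril_gather_flat(interaction, rows, cols)
print(a.shape, torch.equal(a, b))
```

Both functions return a tensor of shape (batch, n * (n - 1) / 2). The flattened variant only changes which operators appear in the traced graph, not the result.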

Fallback also failed with:

RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CUDAFloatType instead (while checking arguments for embedding)
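
The fallback error says the embedding lookup received float indices. One plausible reading, not confirmed in this issue, is that the categorical inputs reached the embedding layer as float tensors. The sketch below shows the dtype requirement in isolation: a hypothetical helper, as_index_tensor, casts indices to int64 before calling torch.nn.functional.embedding. The table and index values are made up for illustration.

```python
import torch
import torch.nn.functional as F


def as_index_tensor(t):
    # Embedding lookups accept Int or Long indices; cast anything else.
    # Hypothetical helper, not part of TRTorch or the DLRM code base.
    if t.dtype in (torch.int32, torch.int64):
        return t
    return t.to(torch.int64)


# Toy embedding table with 4 rows of dimension 3.
weight = torch.arange(12, dtype=torch.float32).reshape(4, 3)

# Indices that arrive as floats, as the error message above describes.
float_ids = torch.tensor([0.0, 2.0, 3.0])

out = F.embedding(as_index_tensor(float_ids), weight)
print(out)
```

Casting float values to int64 truncates them, so this is only safe when the floats already hold whole-number category IDs.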

To Reproduce

Steps to reproduce the behavior:

  1. Execute notebooks at /mnt/nvdl/usr/vinhn/trtorch-perf-benchmar/DLRM

ncomly-nvidia commented 2 years ago

@narendasan, you mentioned this should be fixed. Can you please point to the PR which does so & close this issue?

github-actions[bot] commented 2 years ago

This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.

ncomly-nvidia commented 2 years ago

@narendasan have we verified DLRM?