apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators
https://tvm.apache.org/
Apache License 2.0

Import the DLRM NET graph from pytorch to Relay failed #8272

Closed GoodNight-bye closed 3 years ago

GoodNight-bye commented 3 years ago

It seems that arg %61 of aten::index has type NoneType, so my call to relay.frontend.from_pytorch fails. How should I solve this problem?

ERROR INFO:

Traceback (most recent call last):
  File "dlrm.py", line 146, in <module>
    test_dlrm()
  File "dlrm.py", line 126, in test_dlrm
    mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)
  File "./build/tvm/python/tvm/relay/frontend/pytorch.py", line 3309, in from_pytorch
    ret = converter.convert_operators(_get_operator_nodes(graph.nodes()), outputs, ret_name)[0]
  File "./build/tvm/python/tvm/relay/frontend/pytorch.py", line 2733, in convert_operators
    self.record_output_type(relay_out)
  File "./build/tvm/python/tvm/relay/frontend/pytorch.py", line 219, in record_output_type
    self.infer_type_with_prelude(output)
  File "./build/tvm/python/tvm/relay/frontend/pytorch.py", line 167, in infer_type_with_prelude
    body = self.infer_type(val, self.prelude.mod)
  File "./build/tvm/python/tvm/relay/frontend/pytorch.py", line 154, in infer_type
    new_node = tf.visit(node)
  File "./build/tvm/python/tvm/relay/frontend/pytorch.py", line 91, in visit
    v = super().visit(expr)
  File "./build/tvm/python/tvm/relay/expr_functor.py", line 48, in visit
    res = self.visit_call(expr)
  File "./build/tvm/python/tvm/relay/expr_functor.py", line 216, in visit_call
    new_args = [self.visit(arg) for arg in call.args]
  File "./build/tvm/python/tvm/relay/expr_functor.py", line 216, in <listcomp>
    new_args = [self.visit(arg) for arg in call.args]
  File "./build/tvm/python/tvm/relay/frontend/pytorch.py", line 91, in visit
    v = super().visit(expr)
  File "./build/tvm/python/tvm/relay/expr_functor.py", line 58, in visit
    res = self.visit_tuple(expr)
  File "./build/tvm/python/tvm/relay/expr_functor.py", line 229, in visit_tuple
    return Tuple([self.visit(field) for field in tup.fields], tup.span)
  File "./build/tvm/python/tvm/relay/expr_functor.py", line 229, in <listcomp>
    return Tuple([self.visit(field) for field in tup.fields], tup.span)
  File "./build/tvm/python/tvm/relay/frontend/pytorch.py", line 91, in visit
    v = super().visit(expr)
  File "./build/tvm/python/tvm/relay/expr_functor.py", line 77, in visit
    raise Exception("warning unhandled case: {0}".format(type(expr)))
Exception: warning unhandled case: <class 'NoneType'>
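For context, the failing node is %62 : Tensor?[] = prim::ListConstruct(%61, %li, %lj), i.e. the index list [None, li, lj] that TorchScript records for the Python expression Z[:, li, lj]: the leading full slice becomes a None entry that the Relay frontend did not handle at the time. A minimal NumPy sketch of what that aten::index computes (shapes B, n, d here are illustrative, chosen to match the graph dump below):

```python
import numpy as np

# Z plays the role of %Z (batch of n x n interaction matrices).
B, n = 4, 27
Z = np.arange(B * n * n, dtype=np.float32).reshape(B, n, n)

# li, lj are the strictly-lower-triangle pair indices, as DLRM's
# interact_features builds them; len(li) == n*(n-1)/2 == 351 for n == 27.
li, lj = np.tril_indices(n, k=-1)

# Z[:, li, lj] is recorded as aten::index(Z, [None, li, lj]) in the
# TorchScript graph -- the ":" slice is the None entry that trips TVM.
Zflat = Z[:, li, lj]
assert Zflat.shape == (B, n * (n - 1) // 2)
```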

FULL SCRIPTED MODEL GRAPH FROM PYTORCH:

graph(%self.1 : __torch__.dlrm_s_pytorch.DLRM_Net,
      %input.1 : Float(128, 13, strides=[13, 1], requires_grad=0, device=cpu),
      %ly : Float(128, 416, strides=[416, 1], requires_grad=0, device=cpu)):
  %3 : __torch__.torch.nn.modules.container.___torch_mangle_11.Sequential = prim::GetAttr[name="top_l"](%self.1)
  %4 : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="bot_l"](%self.1)
  %68 : __torch__.torch.nn.modules.activation.___torch_mangle_5.ReLU = prim::GetAttr[name="7"](%4)
  %69 : __torch__.torch.nn.modules.linear.___torch_mangle_4.Linear = prim::GetAttr[name="6"](%4)
  %70 : __torch__.torch.nn.modules.activation.___torch_mangle_3.ReLU = prim::GetAttr[name="5"](%4)
  %71 : __torch__.torch.nn.modules.linear.___torch_mangle_2.Linear = prim::GetAttr[name="4"](%4)
  %72 : __torch__.torch.nn.modules.activation.___torch_mangle_1.ReLU = prim::GetAttr[name="3"](%4)
  %73 : __torch__.torch.nn.modules.linear.___torch_mangle_0.Linear = prim::GetAttr[name="2"](%4)
  %74 : __torch__.torch.nn.modules.activation.ReLU = prim::GetAttr[name="1"](%4)
  %75 : __torch__.torch.nn.modules.linear.Linear = prim::GetAttr[name="0"](%4)
  %76 : Tensor = prim::GetAttr[name="bias"](%75)
  %77 : Tensor = prim::GetAttr[name="weight"](%75)
  %78 : Float(13, 512, strides=[1, 13], requires_grad=1, device=cpu) = aten::t(%77), scope: __module.bot_l/__module.bot_l.0 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %79 : int = prim::Constant[value=1](), scope: __module.bot_l/__module.bot_l.0 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %80 : int = prim::Constant[value=1](), scope: __module.bot_l/__module.bot_l.0 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.2 : Float(128, 512, strides=[512, 1], requires_grad=1, device=cpu) = aten::addmm(%76, %input.1, %78, %79, %80), scope: __module.bot_l/__module.bot_l.0 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.3 : Float(128, 512, strides=[512, 1], requires_grad=1, device=cpu) = aten::relu(%input.2), scope: __module.bot_l/__module.bot_l.1 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1206:0
  %83 : Tensor = prim::GetAttr[name="bias"](%73)
  %84 : Tensor = prim::GetAttr[name="weight"](%73)
  %85 : Float(512, 256, strides=[1, 512], requires_grad=1, device=cpu) = aten::t(%84), scope: __module.bot_l/__module.bot_l.2 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %86 : int = prim::Constant[value=1](), scope: __module.bot_l/__module.bot_l.2 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %87 : int = prim::Constant[value=1](), scope: __module.bot_l/__module.bot_l.2 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.4 : Float(128, 256, strides=[256, 1], requires_grad=1, device=cpu) = aten::addmm(%83, %input.3, %85, %86, %87), scope: __module.bot_l/__module.bot_l.2 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.5 : Float(128, 256, strides=[256, 1], requires_grad=1, device=cpu) = aten::relu(%input.4), scope: __module.bot_l/__module.bot_l.3 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1206:0
  %90 : Tensor = prim::GetAttr[name="bias"](%71)
  %91 : Tensor = prim::GetAttr[name="weight"](%71)
  %92 : Float(256, 64, strides=[1, 256], requires_grad=1, device=cpu) = aten::t(%91), scope: __module.bot_l/__module.bot_l.4 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %93 : int = prim::Constant[value=1](), scope: __module.bot_l/__module.bot_l.4 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %94 : int = prim::Constant[value=1](), scope: __module.bot_l/__module.bot_l.4 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.6 : Float(128, 64, strides=[64, 1], requires_grad=1, device=cpu) = aten::addmm(%90, %input.5, %92, %93, %94), scope: __module.bot_l/__module.bot_l.4 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.7 : Float(128, 64, strides=[64, 1], requires_grad=1, device=cpu) = aten::relu(%input.6), scope: __module.bot_l/__module.bot_l.5 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1206:0
  %97 : Tensor = prim::GetAttr[name="bias"](%69)
  %98 : Tensor = prim::GetAttr[name="weight"](%69)
  %99 : Float(64, 16, strides=[1, 64], requires_grad=1, device=cpu) = aten::t(%98), scope: __module.bot_l/__module.bot_l.6 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %100 : int = prim::Constant[value=1](), scope: __module.bot_l/__module.bot_l.6 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %101 : int = prim::Constant[value=1](), scope: __module.bot_l/__module.bot_l.6 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.8 : Float(128, 16, strides=[16, 1], requires_grad=1, device=cpu) = aten::addmm(%97, %input.7, %99, %100, %101), scope: __module.bot_l/__module.bot_l.6 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %x : Float(128, 16, strides=[16, 1], requires_grad=1, device=cpu) = aten::relu(%input.8), scope: __module.bot_l/__module.bot_l.7 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1206:0
  %6 : int = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:477:0
  %7 : int = aten::size(%x, %6) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:477:0
  %batch_size : Long(device=cpu) = prim::NumToTensor(%7)
  %9 : int = aten::Int(%batch_size)
  %10 : int = prim::Constant[value=1]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:477:0
  %11 : int = aten::size(%x, %10) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:477:0
  %d : Long(device=cpu) = prim::NumToTensor(%11)
  %13 : int = aten::Int(%d)
  %14 : Tensor[] = prim::ListConstruct(%x, %ly)
  %15 : int = prim::Constant[value=1]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:478:0
  %16 : Float(128, 432, strides=[432, 1], requires_grad=1, device=cpu) = aten::cat(%14, %15) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:478:0
  %17 : int = prim::Constant[value=-1]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:478:0
  %18 : int[] = prim::ListConstruct(%9, %17, %13)
  %T : Float(128, 27, 16, strides=[432, 16, 1], requires_grad=1, device=cpu) = aten::view(%16, %18) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:478:0
  %20 : int = prim::Constant[value=1]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:480:0
  %21 : int = prim::Constant[value=2]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:480:0
  %22 : Float(128, 16, 27, strides=[432, 1, 16], requires_grad=1, device=cpu) = aten::transpose(%T, %20, %21) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:480:0
  %Z : Float(128, 27, 27, strides=[729, 27, 1], requires_grad=1, device=cpu) = aten::bmm(%T, %22) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:480:0
  %24 : Long(351, strides=[1], requires_grad=0, device=cpu) = prim::Constant[value=<Tensor>]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:491:0
  %25 : Device = prim::Constant[value="cpu"]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:491:0
  %26 : int = prim::Constant[value=4]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:491:0
  %27 : bool = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:491:0
  %28 : bool = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:491:0
  %29 : None = prim::Constant()
  %30 : Long(351, strides=[1], requires_grad=0, device=cpu) = aten::to(%24, %25, %26, %27, %28, %29) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:491:0
  %li.1 : Long(351, strides=[1], requires_grad=0, device=cpu) = aten::detach(%30) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:491:0
  %32 : Long(351, strides=[1], requires_grad=0, device=cpu) = prim::Constant[value=<Tensor>]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:492:0
  %33 : Device = prim::Constant[value="cpu"]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:492:0
  %34 : int = prim::Constant[value=4]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:492:0
  %35 : bool = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:492:0
  %36 : bool = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:492:0
  %37 : None = prim::Constant()
  %38 : Long(351, strides=[1], requires_grad=0, device=cpu) = aten::to(%32, %33, %34, %35, %36, %37) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:492:0
  %lj.1 : Long(351, strides=[1], requires_grad=0, device=cpu) = aten::detach(%38) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:492:0
  %40 : int = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %41 : int = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %42 : int = prim::Constant[value=9223372036854775807]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %43 : int = prim::Constant[value=1]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %44 : Float(128, 27, 27, strides=[729, 27, 1], requires_grad=1, device=cpu) = aten::slice(%Z, %40, %41, %42, %43) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %45 : int = prim::Constant[value=4]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %46 : int = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %47 : Device = prim::Constant[value="cpu"]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %48 : None = prim::Constant()
  %49 : bool = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %50 : bool = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %51 : None = prim::Constant()
  %li : Long(351, strides=[1], requires_grad=0, device=cpu) = aten::to(%li.1, %45, %46, %47, %48, %49, %50, %51) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %53 : int = prim::Constant[value=4]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %54 : int = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %55 : Device = prim::Constant[value="cpu"]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %56 : None = prim::Constant()
  %57 : bool = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %58 : bool = prim::Constant[value=0]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %59 : None = prim::Constant()
  %lj : Long(351, strides=[1], requires_grad=0, device=cpu) = aten::to(%lj.1, %53, %54, %55, %56, %57, %58, %59) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %61 : None = prim::Constant()
  %62 : Tensor?[] = prim::ListConstruct(%61, %li, %lj)
  %Zflat : Float(128, 351, strides=[351, 1], requires_grad=1, device=cpu) = aten::index(%44, %62) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:493:0
  %64 : Tensor[] = prim::ListConstruct(%x, %Zflat)
  %65 : int = prim::Constant[value=1]() # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:496:0
  %input.9 : Float(128, 367, strides=[367, 1], requires_grad=1, device=cpu) = aten::cat(%64, %65) # ./code/git/trunk/tensorxt/tests/graph/dlrm/dlrm_s_pytorch.py:496:0
  %104 : __torch__.torch.nn.modules.activation.Sigmoid = prim::GetAttr[name="5"](%3)
  %105 : __torch__.torch.nn.modules.linear.___torch_mangle_10.Linear = prim::GetAttr[name="4"](%3)
  %106 : __torch__.torch.nn.modules.activation.___torch_mangle_9.ReLU = prim::GetAttr[name="3"](%3)
  %107 : __torch__.torch.nn.modules.linear.___torch_mangle_8.Linear = prim::GetAttr[name="2"](%3)
  %108 : __torch__.torch.nn.modules.activation.___torch_mangle_7.ReLU = prim::GetAttr[name="1"](%3)
  %109 : __torch__.torch.nn.modules.linear.___torch_mangle_6.Linear = prim::GetAttr[name="0"](%3)
  %110 : Tensor = prim::GetAttr[name="bias"](%109)
  %111 : Tensor = prim::GetAttr[name="weight"](%109)
  %112 : Float(367, 512, strides=[1, 367], requires_grad=1, device=cpu) = aten::t(%111), scope: __module.top_l/__module.top_l.0 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %113 : int = prim::Constant[value=1](), scope: __module.top_l/__module.top_l.0 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %114 : int = prim::Constant[value=1](), scope: __module.top_l/__module.top_l.0 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.10 : Float(128, 512, strides=[512, 1], requires_grad=1, device=cpu) = aten::addmm(%110, %input.9, %112, %113, %114), scope: __module.top_l/__module.top_l.0 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.11 : Float(128, 512, strides=[512, 1], requires_grad=1, device=cpu) = aten::relu(%input.10), scope: __module.top_l/__module.top_l.1 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1206:0
  %117 : Tensor = prim::GetAttr[name="bias"](%107)
  %118 : Tensor = prim::GetAttr[name="weight"](%107)
  %119 : Float(512, 256, strides=[1, 512], requires_grad=1, device=cpu) = aten::t(%118), scope: __module.top_l/__module.top_l.2 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %120 : int = prim::Constant[value=1](), scope: __module.top_l/__module.top_l.2 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %121 : int = prim::Constant[value=1](), scope: __module.top_l/__module.top_l.2 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.12 : Float(128, 256, strides=[256, 1], requires_grad=1, device=cpu) = aten::addmm(%117, %input.11, %119, %120, %121), scope: __module.top_l/__module.top_l.2 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input.13 : Float(128, 256, strides=[256, 1], requires_grad=1, device=cpu) = aten::relu(%input.12), scope: __module.top_l/__module.top_l.3 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1206:0
  %124 : Tensor = prim::GetAttr[name="bias"](%105)
  %125 : Tensor = prim::GetAttr[name="weight"](%105)
  %126 : Float(256, 1, strides=[1, 256], requires_grad=1, device=cpu) = aten::t(%125), scope: __module.top_l/__module.top_l.4 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %127 : int = prim::Constant[value=1](), scope: __module.top_l/__module.top_l.4 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %128 : int = prim::Constant[value=1](), scope: __module.top_l/__module.top_l.4 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %input : Float(128, 1, strides=[1, 1], requires_grad=1, device=cpu) = aten::addmm(%124, %input.13, %126, %127, %128), scope: __module.top_l/__module.top_l.4 # ./.local/lib/python3.6/site-packages/torch/nn/functional.py:1753:0
  %130 : Float(128, 1, strides=[1, 1], requires_grad=1, device=cpu) = aten::sigmoid(%input), scope: __module.top_l/__module.top_l.5 # ./.local/lib/python3.6/site-packages/torch/nn/modules/activation.py:299:0
  return (%130)

DLRM source code: (screenshot attached in the original issue)
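One possible model-side workaround (a sketch, not an official TVM fix): rewrite the interaction gather so no None/slice appears in the index list, by flattening the last two dimensions and indexing with a single 1-D index array. Verified here in NumPy; the same rewrite applies to the torch code in dlrm_s_pytorch.py:

```python
import numpy as np

B, n = 4, 27
rng = np.random.default_rng(0)
Z = rng.random((B, n, n)).astype(np.float32)
li, lj = np.tril_indices(n, k=-1)

# Original pattern: advanced indexing with a leading slice, which traces
# to aten::index(Z, [None, li, lj]).
a = Z[:, li, lj]

# Workaround: Z[b, i, j] == Z.reshape(B, n*n)[b, i*n + j], so a plain
# gather on the flattened axis produces the same result without any
# None entry in the index list.
b = Z.reshape(B, n * n)[:, li * n + lj]

assert np.allclose(a, b)
```

In the torch version this would be Z.view(B, n * n)[:, li * n + lj] (or an equivalent index_select), which the TVM frontend converts without hitting the NoneType case.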

electriclilies commented 3 years ago

Hi @GoodNight-bye, for general debugging it would be better to open a discussion post where the community can give input about what is wrong. If you have found a specific bug, then you should create a tracking issue.

GoodNight-bye commented 3 years ago

@electriclilies, thank you for your advice. I created a new issue with a minimal script to reproduce the problem; looking forward to your reply: https://github.com/apache/tvm/issues/8374