[PyTorch] deformable_conv2 error when converting torch traced model to relay

sacalo commented 3 years ago

After converting a pytorch model to torchscript using the tracing method, I can successfully execute it and make inferences. But when I try to convert the traced model to Relay with the code below, it fails with the attached traceback. I can see that it has to do with "deformable_conv2d", but I'm not able to trace the cause any deeper.

using the scripted_model (works OK):

import cv2
import torch

with open("./model.ts", "rb") as f:
    ts_model = torch.jit.load(f)

image = cv2.imread("./test.jpg")
image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
result = ts_model(image)

converting scripted_model to relay (fails with the error posted):

import tvm
from tvm import relay
import torch
import torchvision

import cv2

with open("./model.ts", "rb") as f:
    scripted_model = torch.jit.load(f)

img = cv2.imread("./test.jpg")
input = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
input_name = "input0"
shape_list = [(input_name, input.shape)]
mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)

TRACEBACK ERROR:

Traceback (most recent call last):
  File "", line 22, in <module>
    mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)
  File "./venv/lib/python3.8/site-packages/tvm-0.8.dev996+gb81f3f7a7-py3.8-linux-x86_64.egg/tvm/relay/frontend/pytorch.py", line 3284, in from_pytorch
    ret = converter.convert_operators(_get_operator_nodes(graph.nodes()), outputs, ret_name)[0]
  File "./venv/lib/python3.8/site-packages/tvm-0.8.dev996+gb81f3f7a7-py3.8-linux-x86_64.egg/tvm/relay/frontend/pytorch.py", line 2705, in convert_operators
    relay_out = relay_op(
  File "./venv/lib/python3.8/site-packages/tvm-0.8.dev996+gb81f3f7a7-py3.8-linux-x86_64.egg/tvm/relay/frontend/pytorch.py", line 2057, in deform_conv2d
    return _op.nn.deformable_conv2d(
  File "./venv/lib/python3.8/site-packages/tvm-0.8.dev996+gb81f3f7a7-py3.8-linux-x86_64.egg/tvm/relay/op/nn/nn.py", line 2746, in deformable_conv2d
    return _make.deformable_conv2d(
  File "tvm/_ffi/_cython/./packed_func.pxi", line 322, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 267, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./base.pxi", line 160, in tvm._ffi._cy3.core.CALL
tvm._ffi.base.TVMError: Traceback (most recent call last):
  2: TVMFuncCall
  1: tvm::runtime::TypedPackedFunc<tvm::RelayExpr (tvm::RelayExpr, tvm::RelayExpr, tvm::RelayExpr, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, int, int, int, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::String, tvm::runtime::String, tvm::runtime::String, tvm::runtime::DataType)>::AssignTypedLambda<tvm::relay::{lambda(tvm::RelayExpr, tvm::RelayExpr, tvm::RelayExpr, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, int, int, int, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::String, tvm::runtime::String, tvm::runtime::String, tvm::runtime::DataType)#27}>(tvm::relay::{lambda(tvm::RelayExpr, tvm::RelayExpr, tvm::RelayExpr, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, int, int, int, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::String, tvm::runtime::String, tvm::runtime::String, tvm::runtime::DataType)#27}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const, tvm::runtime::TVMRetValue) const
  0: tvm::runtime::TVMMovableArgValueWithContext_::operator tvm::runtime::Array<tvm::PrimExpr, void><tvm::runtime::Array<tvm::PrimExpr, void> >() const
  3: TVMFuncCall
  2: tvm::runtime::TypedPackedFunc<tvm::RelayExpr (tvm::RelayExpr, tvm::RelayExpr, tvm::RelayExpr, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, int, int, int, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::String, tvm::runtime::String, tvm::runtime::String, tvm::runtime::DataType)>::AssignTypedLambda<tvm::relay::{lambda(tvm::RelayExpr, tvm::RelayExpr, tvm::RelayExpr, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, int, int, int, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::String, tvm::runtime::String, tvm::runtime::String, tvm::runtime::DataType)#27}>(tvm::relay::{lambda(tvm::RelayExpr, tvm::RelayExpr, tvm::RelayExpr, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::Array<tvm::PrimExpr, void>, int, int, int, tvm::runtime::Array<tvm::PrimExpr, void>, tvm::runtime::String, tvm::runtime::String, tvm::runtime::String, tvm::runtime::DataType)#27}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const, tvm::runtime::TVMRetValue) const
  1: tvm::runtime::TVMMovableArgValueWithContext_::operator tvm::runtime::Array<tvm::PrimExpr, void><tvm::runtime::Array<tvm::PrimExpr, void> >() const
  0: tvm::runtime::Array<tvm::PrimExpr, void> tvm::runtime::TVMPODValue_::AsObjectRef<tvm::runtime::Array<tvm::PrimExpr, void> >() const
  File "../include/tvm/runtime/packed_func.h", line 713
TVMError: In function relay.op.nn._make.deformable_conv2d: error while converting argument 3: [16:24:47] ../include/tvm/runtime/packed_func.h:1590: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (!checked_type.defined()) is false: Expected Array[PrimExpr], but got Array[index 0: relay.Call]

cc @yelite

comaniac commented 3 years ago

Looks like your model is dynamic? Argument 3 of deformable conv2d is the strides, which are expected to be static constant expressions. However, according to the error message, the strides of deformable conv2d in your model are a call node, which usually means they were calculated by another operator on the fly. It would help if you could locate and post a subgraph of the model that shows this issue, so we can see how the strides were determined.

Also cc @masahi @codeislife99

masahi commented 3 years ago

> After converting a pytorch model to torchscript using the tracing method, I can successfully execute it and make inferences.

Does this mean things work if you trace and convert to Relay without going through serialization? Note that Torch erases all type information on serialization. This caused problems for quantized models in the past, see https://github.com/pytorch/pytorch/issues/39690. Not sure if this is a related issue.

sacalo commented 3 years ago

Thanks @comaniac, I was able to extract the subgraph of the traced model after serializing and deserializing it. @masahi: I have tried the traced model before serializing it and it doesn't work either.

def forward(self,
    x: Tensor,
    argument_2: Tensor) -> Tensor:
  _0 = self.norm
  _1 = self.weight
  out_channels = ops.prim.NumToTensor(torch.size(_1, 0))
  _2 = int(out_channels)
  _3 = ops.prim.NumToTensor(torch.size(x, 0))
  mask = torch.zeros([int(_3), 0], dtype=6, layout=None, device=torch.device("cpu"), pin_memory=False)
  bias = torch.zeros([_2], dtype=6, layout=None, device=torch.device("cpu"), pin_memory=False)
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 2, 2, 1, 1, 1, 1, 32, 1, False)
  return (_0).forward(input, )

graph(%self.1 : __torch__.detectron2.layers.deform_conv.DeformConv,
      %x.1 : Tensor,
      %argument_2.1 : Tensor):
  %25 : bool = prim::Constant[value=0]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:71:0
  %56 : Device = prim::Constant[value="cpu"]()
  %22 : None = prim::Constant() # :0:0
  %8 : int = prim::Constant[value=0]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:66:0
  %21 : int = prim::Constant[value=6]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:71:0
  %39 : int = prim::Constant[value=2]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:92:0
  %40 : int = prim::Constant[value=1]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:92:0
  %41 : int = prim::Constant[value=32]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:92:0
  %4 : __torch__.detectron2.layers.batch_norm.___torch_mangle_35.FrozenBatchNorm2d = prim::GetAttr[name="norm"](%self.1)
  %6 : Tensor = prim::GetAttr[name="weight"](%self.1)
  %9 : int = aten::size(%6, %8) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:66:0
  %out_channels.1 : Tensor = prim::NumToTensor(%9) # :0:0
  %13 : int = aten::Int(%out_channels.1)
  %15 : int = aten::size(%x.1, %8) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:71:0
  %16 : Tensor = prim::NumToTensor(%15) # :0:0
  %19 : int = aten::Int(%16)
  %20 : int[] = prim::ListConstruct(%19, %8)
  %mask.1 : Tensor = aten::zeros(%20, %21, %22, %56, %25) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:71:0
  %28 : int[] = prim::ListConstruct(%13)
  %bias.1 : Tensor = aten::zeros(%28, %21, %22, %56, %25) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:74:0
  %input.1 : Tensor = torchvision::deform_conv2d(%x.1, %6, %argument_2.1, %mask.1, %bias.1, %39, %39, %40, %40, %40, %40, %41, %40, %25) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:92:0
  %46 : Tensor = prim::CallMethod[name="forward"](%4, %input.1) # :0:0
  return (%46)

And these are all the instances where modules call the "deform_conv2d" function:

model.backbone.bottom_up.res3.0.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 2, 2, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.1.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.2.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.3.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.4.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.5.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.6.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.7.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.0.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 2, 2, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.1.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.2.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.3.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.4.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.5.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.6.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.7.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.8.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.9.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.10.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.11.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.12.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.13.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.14.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.15.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.16.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.17.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.18.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.19.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.20.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.21.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.22.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.23.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.24.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.25.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.26.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.27.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.28.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.29.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.30.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.31.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.32.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.33.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.34.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.35.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res5.0.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 2, 2, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res5.1.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res5.2.conv2
  input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)

Hope it helps
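The graph dump above can also be checked mechanically. The sketch below is a hypothetical helper (`describe_call_args` is not part of PyTorch or TVM) that parses an excerpt of the posted graph text with a regular expression and reports, for each positional argument of `torchvision::deform_conv2d`, whether it comes from a `prim::Constant` or from another operator. It assumes the simple one-node-per-line text layout shown in this thread and drops the source-location comments.

```python
import re

# Excerpt of the TorchScript graph posted above (source-location comments dropped).
GRAPH = """
%25 : bool = prim::Constant[value=0]()
%39 : int = prim::Constant[value=2]()
%40 : int = prim::Constant[value=1]()
%41 : int = prim::Constant[value=32]()
%mask.1 : Tensor = aten::zeros(%20, %21, %22, %56, %25)
%bias.1 : Tensor = aten::zeros(%28, %21, %22, %56, %25)
%input.1 : Tensor = torchvision::deform_conv2d(%x.1, %6, %argument_2.1, %mask.1, %bias.1, %39, %39, %40, %40, %40, %40, %41, %40, %25)
"""

# Matches lines of the form "%name : type = op[attr](args)".
NODE = re.compile(
    r"^\s*%(?P<out>[\w.]+)\s*:\s*[^=]+=\s*"
    r"(?P<op>[\w:]+)(?:\[(?P<attr>[^\]]*)\])?\((?P<args>.*)\)"
)


def describe_call_args(graph_text, target_op):
    """Return one (kind, detail) tuple per positional argument of target_op.

    kind is "constant" (detail = literal value), "op" (detail = producing
    operator) or "external" (value not defined in the given text).
    """
    producers = {}
    target_args = None
    for line in graph_text.splitlines():
        match = NODE.match(line)
        if match is None:
            continue
        producers[match["out"]] = (match["op"], match["attr"])
        if match["op"] == target_op:
            target_args = [a.strip().lstrip("%") for a in match["args"].split(",")]
    if target_args is None:
        raise ValueError(f"{target_op} not found in graph text")

    described = []
    for name in target_args:
        if name not in producers:
            described.append(("external", name))
            continue
        op, attr = producers[name]
        if op == "prim::Constant" and attr and attr.startswith("value="):
            described.append(("constant", attr[len("value="):]))
        else:
            described.append(("op", op))
    return described


for index, item in enumerate(describe_call_args(GRAPH, "torchvision::deform_conv2d")):
    print(index, item)
```

For this excerpt, the arguments at positions 5 to 12 resolve to integer constants, while positions 3 and 4 (`mask` and `bias`) are produced by `aten::zeros`.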

masahi commented 3 years ago

@sacalo Can you send me a repro script and model?

sacalo commented 3 years ago

> @sacalo Can you send me a repro script and model?

@masahi I have created a shared folder with the traced model and the scripts to reproduce the issue https://drive.google.com/drive/folders/12ZtdjAVDoRP4OiDQuzL9a84oxsyR9_Yx?usp=sharing

Let me know if you need anything more

masahi commented 3 years ago

OK, there is an API change in torchvision's deform_conv2d between PyTorch 1.7 and 1.8, and we do not support 1.8.

If you replace https://github.com/apache/tvm/blob/720e7b1ebd9b789a1100dee7536d0633c7941dd1/python/tvm/relay/frontend/pytorch.py#L2071-L2073 with

        strides = (inputs[5], inputs[6])
        padding = (inputs[7], inputs[8])
        dilation = (inputs[9], inputs[10])

the deform conv2d problem should be gone. But I hit a different error from this model.
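To make the index shift concrete, here is a minimal sketch of a helper that names the positional inputs of a traced `deform_conv2d` call by arity. `split_deform_conv2d_inputs` and the argument names are hypothetical and are not the TVM frontend code. The 14-input layout is the one visible in the calls posted in this thread; the 12-input layout (no `mask`, no trailing flag) is an assumption about the older op.

```python
# Layout seen in this thread: deform_conv2d(x, weight, offset, mask, bias,
# 2, 2, 1, 1, 1, 1, 32, 1, False). Names are illustrative only.
NEW_LAYOUT = [
    "input", "weight", "offset", "mask", "bias",
    "stride_h", "stride_w", "pad_h", "pad_w", "dil_h", "dil_w",
    "groups", "offset_groups", "use_mask",
]

# ASSUMED older layout: the same list without "mask" and "use_mask".
OLD_LAYOUT = [name for name in NEW_LAYOUT if name not in ("mask", "use_mask")]


def split_deform_conv2d_inputs(inputs):
    """Group the positional inputs into named convolution attributes."""
    if len(inputs) == len(NEW_LAYOUT):
        named = dict(zip(NEW_LAYOUT, inputs))
    elif len(inputs) == len(OLD_LAYOUT):
        named = dict(zip(OLD_LAYOUT, inputs))
    else:
        raise ValueError(f"unexpected number of inputs: {len(inputs)}")
    return {
        "strides": (named["stride_h"], named["stride_w"]),
        "padding": (named["pad_h"], named["pad_w"]),
        "dilation": (named["dil_h"], named["dil_w"]),
        "groups": named["groups"],
        "offset_groups": named["offset_groups"],
    }


# Arguments of model.backbone.bottom_up.res3.0.conv2 as posted in this thread
# (tensors replaced by placeholder strings).
call = ["x", "weight", "offset", "mask", "bias", 2, 2, 1, 1, 1, 1, 32, 1, False]
print(split_deform_conv2d_inputs(call))
```

Under that assumption, code written for the older layout would read index 4 as the first stride value, whereas in the newer layout index 4 holds `bias`, a tensor-valued expression. That would be consistent with the "Expected Array[PrimExpr], but got Array[index 0: relay.Call]" message. The two group counts would move by one position as well.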

sacalo commented 3 years ago

thanks for that tip @masahi, I will start researching and trying to solve the problem from there

masahi commented 3 years ago

It's ok, there is probably another bug in the frontend. I'll take a look.

tiandiao123 commented 3 years ago

> thanks for that tip @masahi, I will start researching and trying to solve the problem from there

Hi, is there any progress on this? I have the same issue here.

tiandiao123 commented 3 years ago

If you can check, I found that the deformable conv2d op of PyTorch and the deformable conv2d op of the Relay API produce different results: https://discuss.tvm.apache.org/t/how-to-fix-the-difference-of-deformable-conv2d-op-between-relay-api-and-pytorch-api/10180. That is probably the reason.

codeislife99 commented 3 years ago

It is possible that Relay and PT have different deform conv2d results. I had put up a PR https://github.com/apache/tvm/pull/7397 which fixed this for PT 1.6. It's possible they changed something in later versions. My guess is that, if something changed, it is how the outer points are interpolated during the bilinear interpolation step, because that's the only place where framework implementations differ as well.
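As an illustration of how the treatment of outer points can change a result, here is a small pure-Python sketch of bilinear sampling with two boundary conventions. Both conventions (`"zeros"` and `"clamp"`) and the function name `bilinear_sample` are assumptions chosen for the example; this is not a claim about what PyTorch or TVM actually implement.

```python
import math


def bilinear_sample(img, y, x, border="zeros"):
    """Sample a 2-D image (list of rows) at the fractional position (y, x).

    border="zeros": neighbours outside the image contribute 0.
    border="clamp": neighbours outside the image use the nearest edge pixel.
    """
    if border not in ("zeros", "clamp"):
        raise ValueError(f"unknown border mode: {border}")
    height, width = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    wy, wx = y - y0, x - x0  # fractional distances to the top-left neighbour

    def pixel(row, col):
        if 0 <= row < height and 0 <= col < width:
            return img[row][col]
        if border == "zeros":
            return 0.0
        return img[min(max(row, 0), height - 1)][min(max(col, 0), width - 1)]

    top = (1 - wx) * pixel(y0, x0) + wx * pixel(y0, x0 + 1)
    bottom = (1 - wx) * pixel(y0 + 1, x0) + wx * pixel(y0 + 1, x0 + 1)
    return (1 - wy) * top + wy * bottom


image = [[1.0, 2.0],
         [3.0, 4.0]]
# A sampling point whose right-hand neighbours fall outside the image.
print(bilinear_sample(image, 0.5, 1.5, border="zeros"))
print(bilinear_sample(image, 0.5, 1.5, border="clamp"))
```

Interior points give the same value under both conventions; only points with at least one neighbour outside the image differ, which is why a mismatch of this kind would show up near the borders of the output.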

masahi commented 3 years ago

Yeah, CI runs the PT deformable conv2d test with PT 1.7. It seems PT 1.8 changed something in their deformable conv2d, and the same test doesn't seem to work anymore. Even after I apply the fix in https://github.com/apache/tvm/issues/8057#issuecomment-848712212 to work around the API change, there is a shape mismatch issue.

tiandiao123 commented 3 years ago

> It is possible that relay and PT have different deform conv 2d results. I had put up a PR #7397 which fixed this for PT 1.6 . It's possible they changed something for later versions. My guess is that if something was changed it must be how the outer points are interpolated during the bilinear interpolation step because that's the only place where framework implementations differ as well.

OK, I see! PyTorch 1.8 added modulated deformable conv2d, so we probably need to add it too. I mean that deformable conv2d now has a mask parameter.
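For reference, the difference between the plain and the modulated operator can be written as follows, in the notation commonly used for deformable convolution. The symbols are not taken from this thread: $x$ is the input, $w_k$ the kernel weight at tap $k$, $p_k$ the regular grid offset of that tap, $\Delta p_k$ the learned offset, and $m_k$ the mask value.

```latex
% Plain deformable convolution (no mask)
y(p) = \sum_{k=1}^{K} w_k \, x\!\left(p + p_k + \Delta p_k\right)

% Modulated deformable convolution (each sampled value is scaled by a mask)
y(p) = \sum_{k=1}^{K} w_k \, x\!\left(p + p_k + \Delta p_k\right) m_k
```

The second form reduces to the first when every $m_k = 1$. The calls posted in this thread pass `False` as the last argument and an empty `mask` tensor, so under this reading they would correspond to the unmodulated form.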
