It looks like your model is dynamic. Argument 3 of deformable conv2d is the stride, which is expected to be a static constant expression. However, according to the error message, the strides argument of deformable conv2d in your model is a call node, which usually means the stride was computed by another operator on the fly. It would help if you could locate and post the subgraph of the model with this issue so we can see how the stride is determined, for example with a snippet like the one below.
Also cc @masahi @codeislife99
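For example, something like the following could be used to locate the affected call sites in the traced module and dump their TorchScript code (a rough sketch; it assumes the traced model is available as scripted_model):
for name, mod in scripted_model.named_modules():
    try:
        code = mod.code  # TorchScript source of this submodule's forward
    except (AttributeError, RuntimeError):
        continue
    if "deform_conv2d" in code:
        print(name)
        print(code)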
After converting a PyTorch model to TorchScript using the tracing method, I can successfully execute it and make inferences.
Does this mean things work if you trace and convert to Relay without going through serialization? Note that Torch erases all type information on serialization. This caused problems for quantized models in the past, see https://github.com/pytorch/pytorch/issues/39690. Not sure if this is a related issue.
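By "going through serialization" I mean round-tripping the traced module through torch.jit.save / torch.jit.load before handing it to the Relay frontend, roughly (the file name here is just an example):
import torch

torch.jit.save(scripted_model, "traced_model.pt")  # serialization can drop type information
reloaded = torch.jit.load("traced_model.pt")       # converting `reloaded` may differ from converting `scripted_model`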
Thanks @comaniac, I was able to extract the subgraph of the traced model after serializing and deserializing it. @masahi: I have tried the traced model before serializing it and it doesn't work either.
def forward(self,
x: Tensor,
argument_2: Tensor) -> Tensor:
_0 = self.norm
_1 = self.weight
out_channels = ops.prim.NumToTensor(torch.size(_1, 0))
_2 = int(out_channels)
_3 = ops.prim.NumToTensor(torch.size(x, 0))
mask = torch.zeros([int(_3), 0], dtype=6, layout=None, device=torch.device("cpu"), pin_memory=False)
bias = torch.zeros([_2], dtype=6, layout=None, device=torch.device("cpu"), pin_memory=False)
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 2, 2, 1, 1, 1, 1, 32, 1, False)
return (_0).forward(input, )
graph(%self.1 : __torch__.detectron2.layers.deform_conv.DeformConv,
%x.1 : Tensor,
%argument_2.1 : Tensor):
%25 : bool = prim::Constant[value=0]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:71:0
%56 : Device = prim::Constant[value="cpu"]()
%22 : None = prim::Constant() # :0:0
%8 : int = prim::Constant[value=0]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:66:0
%21 : int = prim::Constant[value=6]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:71:0
%39 : int = prim::Constant[value=2]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:92:0
%40 : int = prim::Constant[value=1]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:92:0
%41 : int = prim::Constant[value=32]() # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:92:0
%4 : __torch__.detectron2.layers.batch_norm.___torch_mangle_35.FrozenBatchNorm2d = prim::GetAttr[name="norm"](%self.1)
%6 : Tensor = prim::GetAttr[name="weight"](%self.1)
%9 : int = aten::size(%6, %8) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:66:0
%out_channels.1 : Tensor = prim::NumToTensor(%9) # :0:0
%13 : int = aten::Int(%out_channels.1)
%15 : int = aten::size(%x.1, %8) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:71:0
%16 : Tensor = prim::NumToTensor(%15) # :0:0
%19 : int = aten::Int(%16)
%20 : int[] = prim::ListConstruct(%19, %8)
%mask.1 : Tensor = aten::zeros(%20, %21, %22, %56, %25) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:71:0
%28 : int[] = prim::ListConstruct(%13)
%bias.1 : Tensor = aten::zeros(%28, %21, %22, %56, %25) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:74:0
%input.1 : Tensor = torchvision::deform_conv2d(%x.1, %6, %argument_2.1, %mask.1, %bias.1, %39, %39, %40, %40, %40, %40, %41, %40, %25) # ./venv3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py:92:0
%46 : Tensor = prim::CallMethod[name="forward"](%4, %input.1) # :0:0
return (%46)
and these are all the instances where a module calls the "deform_conv2d" function:
model.backbone.bottom_up.res3.0.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 2, 2, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.1.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.2.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.3.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.4.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.5.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.6.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res3.7.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.0.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 2, 2, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.1.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.2.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.3.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.4.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.5.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.6.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.7.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.8.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.9.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.10.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.11.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.12.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.13.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.14.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.15.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.16.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.17.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.18.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.19.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.20.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.21.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.22.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.23.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.24.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.25.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.26.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.27.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.28.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.29.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.30.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.31.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.32.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.33.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.34.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res4.35.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res5.0.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 2, 2, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res5.1.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
model.backbone.bottom_up.res5.2.conv2
input = ops.torchvision.deform_conv2d(x, _1, argument_2, mask, bias, 1, 1, 1, 1, 1, 1, 32, 1, False)
Hope it helps
@sacalo Can you send me a repro script and model?
@masahi I have created a shared folder with the traced model and the scripts to reproduce the issue https://drive.google.com/drive/folders/12ZtdjAVDoRP4OiDQuzL9a84oxsyR9_Yx?usp=sharing
Let me know if you need anything more
OK, there is an API change in torchvision's deform_conv2d between PyTorch 1.7 and 1.8, and we do not support 1.8.
If you replace https://github.com/apache/tvm/blob/720e7b1ebd9b789a1100dee7536d0633c7941dd1/python/tvm/relay/frontend/pytorch.py#L2071-L2073 with
strides = (inputs[5], inputs[6])
padding = (inputs[7], inputs[8])
dilation = (inputs[9], inputs[10])
the deform_conv2d problem should be gone. But I hit a different error with this model.
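For context, a rough sketch of how the argument unpacking in that converter reads after the change; the variable names are illustrative, and the commented signature is the torchvision 0.9 (PyTorch 1.8) operator order as it appears in the graph dump above:
# torchvision::deform_conv2d(input, weight, offset, mask, bias,
#                            stride_h, stride_w, pad_h, pad_w,
#                            dilation_h, dilation_w,
#                            n_weight_grps, n_offset_grps, use_mask)
data = inputs[0]
weight = inputs[1]
offset = inputs[2]
strides = (inputs[5], inputs[6])
padding = (inputs[7], inputs[8])
dilation = (inputs[9], inputs[10])
groups = inputs[11]
deformable_groups = inputs[12]
# ...these then feed relay.nn.deformable_conv2d(data, offset, weight, ...)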
Thanks for the tip @masahi, I will start researching and trying to solve the problem from there.
It's ok, there is probably another bug in the frontend. I'll take a look.
Hi, has there been any progress on this? I have the same issue.
If you can take a look: I found that the deformable conv2d op of PyTorch and the deformable conv2d op of the Relay API give different results: https://discuss.tvm.apache.org/t/how-to-fix-the-difference-of-deformable-conv2d-op-between-relay-api-and-pytorch-api/10180. That is probably the reason.
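A quick way to check such a discrepancy numerically is to run both ops on the same random inputs (a sketch only; the shapes, target, and executor call below are illustrative):
import numpy as np
import torch
import torchvision
import tvm
from tvm import relay

data_np = np.random.randn(1, 4, 8, 8).astype("float32")
weight_np = np.random.randn(4, 4, 3, 3).astype("float32")
offset_np = np.random.randn(1, 2 * 3 * 3, 6, 6).astype("float32")

# torchvision reference (the functional API takes input, offset, weight)
torch_out = torchvision.ops.deform_conv2d(
    torch.from_numpy(data_np), torch.from_numpy(offset_np),
    torch.from_numpy(weight_np)).numpy()

# Relay equivalent
data = relay.var("data", shape=data_np.shape)
offset = relay.var("offset", shape=offset_np.shape)
weight = relay.var("weight", shape=weight_np.shape)
out = relay.nn.deformable_conv2d(
    data, offset, weight, strides=(1, 1), padding=(0, 0), dilation=(1, 1),
    deformable_groups=1, groups=1, channels=4, kernel_size=(3, 3))
mod = tvm.IRModule.from_expr(relay.Function([data, offset, weight], out))
relay_out = relay.create_executor("graph", mod=mod, target="llvm").evaluate()(
    data_np, offset_np, weight_np).numpy()

print("max abs diff:", np.abs(torch_out - relay_out).max())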
It is possible that Relay and PT have different deform_conv2d results. I had put up PR https://github.com/apache/tvm/pull/7397, which fixed this for PT 1.6. It's possible they changed something in later versions. My guess is that if something was changed, it must be how the outer points are interpolated during the bilinear interpolation step, because that's the only place where framework implementations differ.
Yeah, CI runs the PT deformable conv2d test with PT 1.7. It seems PT 1.8 changed something in their deformable conv2d; the same test doesn't work anymore. Even after I apply the fix in https://github.com/apache/tvm/issues/8057#issuecomment-848712212 to work around the API change, there is a shape mismatch issue.
OK, I see! PyTorch 1.8 added modulated deformable conv2d, so we probably need to add that too. I mean that deformable conv2d now has a mask parameter.
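For reference, a minimal illustration of the modulated path in the torchvision functional API (requires torchvision 0.9+; the shapes here are arbitrary):
import torch
import torchvision

x = torch.randn(1, 4, 8, 8)
weight = torch.randn(4, 4, 3, 3)
offset = torch.randn(1, 2 * 3 * 3, 6, 6)
mask = torch.sigmoid(torch.randn(1, 3 * 3, 6, 6))  # one modulation scalar per kernel point
out = torchvision.ops.deform_conv2d(x, offset, weight, mask=mask)  # modulated (DCNv2) variant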
After converting a PyTorch model to TorchScript using the tracing method, I can successfully execute it and make inferences. But when trying to convert the traced model following this code, it fails with the attached traceback error. I can see that it has to do with "deformable_conv2d", but I'm not able to trace the cause any deeper.
Using the scripted_model (works OK):
Converting the scripted_model to Relay (fails with the error posted):
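The conversion step is essentially the standard from_pytorch flow (a sketch; the input name and shape below are placeholders, not the actual values used):
import tvm
from tvm import relay

shape_list = [("input0", (1, 3, 800, 800))]  # placeholder input name/shape
mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)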
TRACEBACK ERROR:
cc @yelite