Closed lingxuzi closed 4 months ago
I have run into a similar problem:
TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
coords = F.pixel_shuffle(2 * (coords + offset) / normalizer - 1,self.scale).permute(0, 2, 3, 4, 1).contiguous().flatten(0, 1)
Traceback (most recent call last):
File "UpsamplingTest.py", line 21, in <module> torch.onnx.export(net, input, ROOT+class_name+'.onnx', opset_version=12,input_names=['input'], output_names=['output']) # 输出onnx File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\__init__.py", line 350, in export return utils.export( File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\utils.py", line 163, in export _export( File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\utils.py", line 1074, in _export graph, params_dict, torch_out = _model_to_graph( File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\utils.py", line 731, in _model_to_graph graph = _optimize_graph( File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\utils.py", line 308, in _optimize_graph graph = _C._jit_pass_onnx(graph, operator_export_type) File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\__init__.py", line 416, in _run_symbolic_function return utils._run_symbolic_function(*args, **kwargs) File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\utils.py", line 1406, in _run_symbolic_function return symbolic_fn(g, *inputs, **attrs) File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\symbolic_helper.py", line 234, in wrapper return fn(g, *args, **kwargs) File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\symbolic_opset11.py", line 250, in pixel_shuffle return symbolic_helper._unimplemented("pixel_shuffle", "only support 4d input") File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\symbolic_helper.py", line 450, in _unimplemented _onnx_unsupported(f"{op}, {msg}") File "D:\Anaconda\envs\yolov5\lib\site-packages\torch\onnx\symbolic_helper.py", line 454, in _onnx_unsupported raise RuntimeError(
RuntimeError: Unsupported: ONNX export of operator pixel_shuffle, only support 4d input. Please feel free to request support or submit a pull request on PyTorch GitHub.
What should I do now?
Hi. I haven't tested it with ONNX, but since the error says 'ONNX export of operator pixel_shuffle, only support 4d input', you could reshape 'coords' to 4d right before the line `coords = F.pixel_shuffle(2 * (coords + offset) / normalizer - 1, self.scale).permute(0, 2, 3, 4, 1).contiguous().flatten(0, 1)`, adjust that line to the new shape, and then reshape it back afterwards. A rough sketch of the idea is below.
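Something like this, with made-up shapes (B, groups, scale here are placeholders, not DySample's defaults):

```python
import torch
import torch.nn.functional as F

# placeholder sizes: batch, sampling groups, feature height/width, upsampling scale
B, groups, H, W, scale = 2, 4, 8, 8, 2

coords = torch.rand(B, 2, groups, H, W)                 # 5d tensor: this is what the ONNX exporter rejects
coords_4d = coords.view(B, -1, H, W)                    # collapse to 4d: (B, 2 * groups, H, W)
shuffled = F.pixel_shuffle(coords_4d, scale)            # 4d in, 4d out: the opset symbolic can handle this
coords = shuffled.view(B, 2, -1, scale * H, scale * W)  # restore the 5d layout afterwards
print(coords.shape)                                     # torch.Size([2, 2, 1, 16, 16])
```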
I just updated the code, and you can have a try to see if it works.
Hey, the pixel_shuffle part seems fine now, but grid_sample is still a problem.
Exporting to ONNX with opset_version=12 raises:
torch.onnx.symbolic_registry.UnsupportedOperatorError: Exporting the operator ::grid_sampler to ONNX opset version 12 is not supported. Support for this operator was added in version 16, try exporting with this version.
So I set opset_version=16, and now the export succeeds.
The paper says the original grid (G) is added to the generated offset (O), but looking at the exported graph in Netron, the G branch looks a bit off to me.
Also, running onnxsim on the exported model fails:
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Failed to load model with error: D:\a\_work\1\s\onnxruntime\core/graph/model_load_utils.h:57 onnxruntime::model_load_utils::ValidateOpsetForDomain ONNX Runtime only *guarantees* support for models stamped with official released onnx opset versions. Opset 16 is under development and support for this is limited. The operator schemas and or other functionality may change before next ONNX release and in this case ONNX Runtime will not guarantee backward compatibility. Current official support for domain ai.onnx is till opset 15.
I suspect grid_sample may not be very well supported yet.
The code I used:
```python
import torch
import onnx
import onnxsim
from dysample import DySample  # assuming DySample is defined in this repo's dysample.py

x = torch.rand(2, 64, 4, 7)
dys = DySample(64)
torch.onnx.export(dys, x, 'DySample.onnx', opset_version=16,
                  input_names=['input'], output_names=['output'])

net_onnx = onnx.load('DySample.onnx')
net_onnx, check = onnxsim.simplify(net_onnx)
onnx.save(net_onnx, 'DySample_sim.onnx')
```
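As a sanity check (assuming an onnxruntime release new enough to officially support opset 16 and implement GridSample), loading the unsimplified model directly should show whether this is just a runtime-version limit:

```python
import numpy as np
import onnxruntime as ort

print(ort.__version__)  # older releases only guarantee support up to opset 15

# load the unsimplified export and run it once with the same dummy shape used above
sess = ort.InferenceSession('DySample.onnx', providers=['CPUExecutionProvider'])
out = sess.run(None, {'input': np.random.rand(2, 64, 4, 7).astype(np.float32)})
print(out[0].shape)
```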
'onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Failed to load model with error: D:\a_work\1\s\onnxruntime\core/graph/model_load_utils.h:57' I can't see anything obviously wrong from this reported error... I have never used ONNX myself, so this may need further exploration.
You can use the 'bilinear_grid_sample' function from mmcv (https://github.com/open-mmlab/mmcv/blob/main/mmcv/ops/point_sample.py), or install mmcv and import it with 'from mmcv.ops.point_sample import bilinear_grid_sample', and use it to replace the F.grid_sample call in DySample.
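A rough sketch of the swap on dummy tensors (the shapes here are illustrative, not DySample's):

```python
import torch
import torch.nn.functional as F
from mmcv.ops.point_sample import bilinear_grid_sample

x = torch.rand(1, 64, 4, 7)               # feature map (N, C, H, W)
grid = torch.rand(1, 8, 14, 2) * 2 - 1    # sampling grid in [-1, 1], shape (N, Hout, Wout, 2)

# F.grid_sample has no ONNX symbolic below opset 16
ref = F.grid_sample(x, grid, mode='bilinear', align_corners=False)

# bilinear_grid_sample rebuilds the same bilinear lookup from basic ops, so it exports cleanly
out = bilinear_grid_sample(x, grid, align_corners=False)
print(torch.allclose(ref, out, atol=1e-5))  # expected to print True (both use zeros padding here)
```

Note that bilinear_grid_sample emulates zeros padding, while the original call in DySample uses padding_mode="border", so pixels sampled right at the border may differ slightly.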
Could you explain in more detail? My code is:

```python
def sample(self, x, offset):
    B, _, H, W = offset.shape
    offset = offset.view(B, 2, -1, H, W)
    coords_h = torch.arange(H) + 0.5
    coords_w = torch.arange(W) + 0.5
    coords = torch.stack(torch.meshgrid([coords_w, coords_h])
                         ).transpose(1, 2).unsqueeze(1).unsqueeze(0).type(x.dtype).to(x.device)
    normalizer = torch.tensor([W, H], dtype=x.dtype, device=x.device).view(1, 2, 1, 1, 1)
    coords = 2 * (coords + offset) / normalizer - 1
    coords = F.pixel_shuffle(coords.view(B, -1, H, W), self.scale).view(
        B, 2, -1, self.scale * H, self.scale * W).permute(0, 2, 3, 4, 1).contiguous().flatten(0, 1)
    # return F.grid_sample(x.reshape(B * self.groups, -1, H, W), coords, mode='bilinear',
    #                      align_corners=False, padding_mode="border").reshape((B, -1, self.scale * H, self.scale * W))  # operator not supported
    return bilinear_grid_sample(x.reshape(B * self.groups, -1, H, W), coords,
                                align_corners=False).reshape((B, -1, self.scale * H, self.scale * W))
```
But now I hit this error: "Expected condition, x and y to be on the same device, but condition is on cuda:0 and x and y are on cpu and cuda:0 respectively".
The bilinear_grid_sample I used is the one from mmcv (https://github.com/open-mmlab/mmcv/blob/main/mmcv/ops/point_sample.py).
Hi, your error points into https://github.com/open-mmlab/mmcv/blob/main/mmcv/ops/point_sample.py: on L66, 68, 70, and 72, 'torch.tensor(0)' should be just '0', because 'torch.tensor(0)' creates a CPU tensor by default while the rest of your tensors are on cuda:0. A minimal illustration is below.
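A made-up clamp, not mmcv's exact lines, just to show the mismatch and the fix:

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
idx = torch.arange(-2, 5, device=device)  # stands in for the x0/y0/x1/y1 index tensors

# torch.tensor(0) lives on the CPU by default; with a CUDA `idx` this is what raises
# "Expected condition, x and y to be on the same device ...":
# clamped = torch.where(idx < 0, torch.tensor(0), idx)

# creating the constant on the same device (or, on recent PyTorch, passing a plain 0) fixes it
clamped = torch.where(idx < 0, torch.tensor(0, device=idx.device), idx)
print(clamped)
```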