Open lazycal opened 2 years ago
@hariharans29 @yihonglyu - any comment/feedback is more than welcome.
@lazycal I tried recreating the onnx flow using your code but I'm not getting the output from the Resize block when I look at it in netron. Would really appreciate your help here.
@shgoyal33 Do you mean that you did not get the 2-branch structure like below from my pure ONNX code? That kind of structure is an artifact of PyTorch's conversion and is not related to this issue. On my end my ONNX code produces the graph below with the same error:
The onnx model: output.onnx 2.zip
@lazycal Hi, the problem I'm facing is actually slightly different, but your code and the graph helped a lot. The problem is as follows: when I run your code and open the ONNX file in Netron, the output I get is this.
According to your code and the graph image you shared, you are getting the final shape at the Resize layer, but when I go to the properties I see [unknown_dim_0, unknown_dim_1, unknown_dim_2, unknown_dim_3], whereas in your ONNX file the output of the Resize block is [1,1,511,1,1] and the graph looks like this.
I want the dimensions to be displayed between the Resize node and the output y. I ran the code you gave, so what else did you change in your code to generate the output dimensions from the Resize block?
@shgoyal33 No, I did not do anything else. I guess it's because of the PyTorch version difference. I am using this version: "1.13.0a0+git018d071". I forgot how I installed it; it could have been built from source. Anyway, the other code snippet is pure ONNX, does not use PyTorch, and is able to reproduce the same issue. Maybe you could use that instead.
Any update on this issue? I encountered the same issue when adding a resize layer in PyTorch:
image = F.interpolate(image.unsqueeze(0),
                      size=(self.resize_width_height[1], self.resize_width_height[0]),
                      mode='bilinear',
                      recompute_scale_factor=False,
                      align_corners=False)
The resulting ONNX model has issues at inference time. CPU inference is totally OK, but GPU inference outputs all 0.5 (unless the input size is the same as the target resize size, in which case I guess the code just copies the input and avoids the all-0.5 output).
When I change to mode='bicubic', everything works well.
I guess there is a bug in the GPU implementation of the Resize layer.
Update: changing the image type from uint8 to float32 by calling image = image.float() also solves the issue. So apparently the uint8 image Resize under CUDA has some bug.
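For reference, a minimal sketch of that workaround (the tensor name, shape, and target size below are illustrative assumptions, not values from the original post):

import torch
import torch.nn.functional as F

# Hypothetical uint8 image of shape (C, H, W); sizes are made up for illustration.
image = torch.randint(0, 256, (3, 480, 640), dtype=torch.uint8)

# Workaround: cast to float32 before the resize so the exported Resize node
# receives float input, avoiding the all-0.5 output observed under CUDA.
image = image.float()
resized = F.interpolate(image.unsqueeze(0),
                        size=(224, 224),   # (height, width), illustrative
                        mode='bilinear',
                        recompute_scale_factor=False,
                        align_corners=False)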
Describe the bug In a model with a Linear layer followed by a trilinear Resize like the graph below, the result on GPU is always 0.5 for any input, which is different from the result on CPU and from the result in PyTorch, while the latter two are equal to each other.
The corresponding torch code:
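(The original snippet is not preserved in this extract; the following is a hedged reconstruction of a Linear layer followed by a trilinear interpolate, with all shapes and sizes assumed for illustration.)

import torch
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    # Hedged reconstruction: Linear followed by a trilinear Resize.
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)  # feature size is an assumption

    def forward(self, x):
        x = self.fc(x)  # x: (N, C, D, H, W); Linear acts on the last dimension
        # Trilinear interpolate on the 5D tensor exports to an ONNX Resize node.
        return F.interpolate(x, scale_factor=2, mode='trilinear', align_corners=False)

x = torch.randn(1, 1, 2, 3, 4)
torch.onnx.export(Model(), (x,), "model.onnx", opset_version=13)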
Maybe related to https://github.com/microsoft/onnxruntime/issues/12019? cc the participants there @diyessi @hariharans29. Though I don't have problems on 4D tensors (i.e., bilinear), and mine is on GPU while that one seems to be on CPU?
The nearest mode appears to be fine too for me. After removing the Linear node the problem also disappears.
Urgency None
System information
To Reproduce Run this code
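(The original script is not preserved in this extract; below is a minimal sketch of the kind of CPU-vs-CUDA comparison being described. The model filename and the input shape are assumptions.)

import numpy as np
import onnxruntime as ort

x = np.random.rand(1, 1, 2, 3, 4).astype(np.float32)  # input shape is an assumption

def run(providers):
    # Run the same model on a given execution provider and return the first output.
    sess = ort.InferenceSession("model.onnx", providers=providers)
    name = sess.get_inputs()[0].name
    return sess.run(None, {name: x})[0]

cpu_out = run(["CPUExecutionProvider"])
gpu_out = run(["CUDAExecutionProvider"])
print("max abs diff:", np.abs(cpu_out - gpu_out).max())  # GPU output is all 0.5 in the bug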
ONNX model: model.onnx.zip
Expected behavior Generate consistent results between CPU and GPU.
Screenshots
Additional context None
UPDATE Below I pasted the code in pure ONNX, without using PyTorch, as PyTorch may have bugs in Resize-related nodes. The issue still remains.
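(The pasted pure-ONNX code is not reproduced in this extract; the sketch below shows how such a graph could be assembled with onnx.helper. Node names, shapes, weights, and scales are assumptions, not the exact contents of the attached model.)

import numpy as np
import onnx
from onnx import helper, TensorProto

# MatMul stands in for the Linear layer; Resize uses linear mode (trilinear on 5D input).
W = helper.make_tensor("W", TensorProto.FLOAT, [4, 4],
                       np.eye(4, dtype=np.float32).flatten())
scales = helper.make_tensor("scales", TensorProto.FLOAT, [5],
                            np.array([1, 1, 2, 2, 2], dtype=np.float32))
roi = helper.make_tensor("roi", TensorProto.FLOAT, [0], [])

matmul = helper.make_node("MatMul", ["x", "W"], ["h"])
resize = helper.make_node("Resize", ["h", "roi", "scales"], ["y"], mode="linear")

graph = helper.make_graph(
    [matmul, resize], "linear_resize",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 1, 2, 3, 4])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 1, 4, 6, 8])],
    initializer=[W, scales, roi],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
onnx.checker.check_model(model)
onnx.save(model, "output.onnx")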
UPDATE 2 So the nearest mode is also problematic. See this model: [model.onnx.zip]. Use this code to reproduce:
UPDATE 3 After looking it closely it seems like a separate issue, reported in https://github.com/microsoft/onnxruntime/issues/12098