Open Finniu opened 4 years ago
Hi,
Sorry for the delay in replying.
My advice would be to make sure you convert your inputs and model to CUDA before exporting to ONNX; this is the safest way.
So it would look something like:
model=models.detection.faster_rcnn.fasterrcnn_resnet50_fpn(pretrained=True,min_size=800,max_size=1333)
image=cv2.imread("test.jpg")
image=cv2.resize(image,(1333,800))
image1 = Image.fromarray(cv2.cvtColor(image.copy(),cv2.COLOR_BGR2RGB))
image_tensor=to_tensor(image1)
model.eval()
model.cuda()
image_tensor = image_tensor.cuda()
# just to be safe, run it once to initialize all buffers
out = model([image_tensor])
# now export it
onnx_io = io.BytesIO()
torch.onnx.export(model, [image_tensor], "faster_rcnn.onnx",do_constant_folding=True, opset_version=_onnx_opset_version)
Let me know if you still have issues.
I also get this error with both FasterRCNN and MaskRCNN, and I'm sure that the model and input tensor are on the GPU. I also run the model once before exporting. Exporting with device = 'cpu' works. It's not specific to ONNX export; the error also appears just by trying to torch.jit.trace the model.
img = Image.open(sys.argv[1]).convert('RGB')
img = np.array(img)
device = 'cuda'
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True, min_size=800, max_size=800)
model.eval()
model.to(device)
img_ = transforms.ToTensor()(img)
img_ = img_.to(device)
out = model([img_])
torch.onnx.export(model, ([img_],), "/tmp/mask_rcnn.onnx", verbose=True, do_constant_folding=True, opset_version=11)
File "segment_image.py", line 119, in <module>
torch.onnx.export(model, ([img_],), "/tmp/mask_rcnn.onnx", verbose=True, do_constant_folding=True, opset_version=11)
File "/opt/env/lib/python3.7/site-packages/torch/onnx/__init__.py", line 156, in export
custom_opsets)
File "/opt/env/lib/python3.7/site-packages/torch/onnx/utils.py", line 67, in export
custom_opsets=custom_opsets)
File "/opt/env/lib/python3.7/site-packages/torch/onnx/utils.py", line 466, in _export
fixed_batch_size=fixed_batch_size)
File "/opt/env/lib/python3.7/site-packages/torch/onnx/utils.py", line 319, in _model_to_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
File "/opt/env/lib/python3.7/site-packages/torch/onnx/utils.py", line 276, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, _force_outplace=False, _return_inputs_states=True)
File "/opt/env/lib/python3.7/site-packages/torch/jit/__init__.py", line 282, in _get_trace_graph
outs = ONNXTracedModule(f, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
result = self.forward(*input, **kwargs)
File "/opt/env/lib/python3.7/site-packages/torch/jit/__init__.py", line 365, in forward
self._force_outplace,
File "/opt/env/lib/python3.7/site-packages/torch/jit/__init__.py", line 352, in wrapper
outs.append(self.inner(*trace_inputs))
File "/opt/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 537, in __call__
result = self._slow_forward(*input, **kwargs)
File "/opt/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 523, in _slow_forward
result = self.forward(*input, **kwargs)
File "/opt/env/lib/python3.7/site-packages/torchvision/models/detection/generalized_rcnn.py", line 70, in forward
proposals, proposal_losses = self.rpn(images, features, targets)
File "/opt/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 537, in __call__
result = self._slow_forward(*input, **kwargs)
File "/opt/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 523, in _slow_forward
result = self.forward(*input, **kwargs)
File "/opt/env/lib/python3.7/site-packages/torchvision/models/detection/rpn.py", line 472, in forward
boxes, scores = self.filter_proposals(proposals, objectness, images.image_sizes, num_anchors_per_level)
File "/opt/env/lib/python3.7/site-packages/torchvision/models/detection/rpn.py", line 379, in filter_proposals
top_n_idx = self._get_top_n_idx(objectness, num_anchors_per_level)
File "/opt/env/lib/python3.7/site-packages/torchvision/models/detection/rpn.py", line 359, in _get_top_n_idx
r.append(top_n_idx + offset)
RuntimeError: expected device cuda:0 but got device cpu
>>> import torchvision; torchvision.__version__
'0.5.0.dev20200108+cu100'
>>> import torch; torch.__version__
'1.5.0.dev20200109+cu100'
@janstrohbeck Thanks for the detailed report!
After digging a bit further, there seem to be a couple of issues. The first one is that torch.onnx.operators.shape_as_tensor doesn't take the device of the original tensor into account, so that https://github.com/pytorch/vision/blob/61763fa955ef74077a1d3e1aa5da36f7c606943a/torchvision/models/detection/rpn.py#L21 is always a CPU tensor, and the second one is that once we fix the above we also need to fix https://github.com/pytorch/vision/blob/61763fa955ef74077a1d3e1aa5da36f7c606943a/torchvision/models/detection/rpn.py#L24-L26 to use the device of the original tensor.
@lara-hdr do you think we should change shape_as_tensor in pytorch ONNX to take the device of the original tensor into account as well? Otherwise we can just add casts in the model right away as a workaround solution.
@janstrohbeck @Finniu in the meantime, please move the model to the CPU before exporting to ONNX.
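The CPU workaround suggested above might look like the following. This is a minimal sketch, not the maintainers' code: `export_on_cpu` is a hypothetical helper name, `torch` is imported lazily inside the function, and the export arguments mirror the calls already shown in this thread.

```python
def export_on_cpu(model, example_images, path, opset_version=11):
    """Hypothetical helper: move a detection model and its example inputs
    to the CPU, run one forward pass, then export to ONNX."""
    import torch  # lazy import: only needed when the helper is called

    model = model.eval().cpu()
    cpu_images = [img.cpu() for img in example_images]

    # Run the model once before exporting, as suggested earlier in the thread.
    with torch.no_grad():
        model(cpu_images)

    # The model takes a list of images, so the list is wrapped in a tuple.
    torch.onnx.export(model, (cpu_images,), path,
                      do_constant_folding=True, opset_version=opset_version)
    return path
```

The exported file can then be loaded by whichever runtime and device are used for inference.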
Unsure of whether or not this is coincidental, but I successfully exported the model to ONNX while the model was on the CPU. When serving the ONNX model in a TensorRT server, the model mostly evaluates on the CPU even though the server supposedly loads the model onto the GPU. I know this because, while evaluating, my CPU goes to almost 100% while my GPU utilization remains below 10%.
Could this be related? Without understanding too much about how the torch.onnx.export method works, it's unclear to me whether evaluating the model on the CPU during tracing leads to the ONNX model executing on the CPU.
@nikhilshinday I don't know the answer to your question, maybe @lara-hdr knows it?
@nikhilshinday, torch.onnx.export() does not track whether the model was on CPU/GPU when exported, and the exported ONNX model should run on the device you specify regardless of whether it was running on CPU/GPU when exported. I am not sure why CPU utilization goes up when you load the ONNX model on GPU; do you know if the engine you are using to run the ONNX model fully supports running it on GPU?
Hi there, any updates?
Although exporting the model in GPU mode fails, exporting it in CPU mode and then loading it into GPU-enabled onnx runtime (using onnxruntime-gpu PyPi package) works just fine. I'm using torch==1.5.0 and torchvision==0.6.0
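Since the device is chosen when the ONNX model is loaded rather than when it is exported, the choice can be made explicit through onnxruntime's execution providers. A minimal sketch, assuming the onnxruntime-gpu package is installed; `pick_providers` and `load_session` are hypothetical helper names:

```python
def pick_providers(available):
    """Prefer the CUDA provider when onnxruntime reports it, followed by
    the CPU provider as a fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


def load_session(path):
    """Open an ONNX model with the providers chosen above."""
    import onnxruntime as ort  # lazy import: requires onnxruntime(-gpu)

    providers = pick_providers(ort.get_available_providers())
    session = ort.InferenceSession(path, providers=providers)
    # get_providers() shows what the session actually ended up using.
    print("session providers:", session.get_providers())
    return session
```

If the printed list contains only the CPU provider, inference is running on the CPU even on a machine with a GPU.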
@raviv Thanks, I will try. Another question: have you tried to convert the ONNX model to TensorRT?
@Finniu No, I don't use tensorrt.
BTW, if you want to export maskrcnn_resnet50_fpn so that it accepts any input size, do:
dynamic_axes = {'input': [0, 2, 3], 'output': [0, 2, 3]}
torch.onnx.export(net, ..., dynamic_axes=dynamic_axes)
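For reference, dynamic_axes also accepts a dict-of-dicts form that gives each dynamic axis a name; with the list form above the names are generated automatically. The keys have to match the input_names / output_names passed to the export call. A small sketch, where `name_dynamic_axes` is a hypothetical helper and the axis names are purely illustrative:

```python
def name_dynamic_axes(axes_by_tensor, axis_names=None):
    """Turn {'input': [0, 2, 3]} into
    {'input': {0: 'batch', 2: 'height', 3: 'width'}}."""
    axis_names = axis_names or {0: "batch", 2: "height", 3: "width"}
    return {
        tensor: {axis: axis_names.get(axis, f"dim_{axis}") for axis in axes}
        for tensor, axes in axes_by_tensor.items()
    }


dynamic_axes = name_dynamic_axes({"input": [0, 2, 3]})
print(dynamic_axes)
# Passed on as:
# torch.onnx.export(net, ..., input_names=["input"], dynamic_axes=dynamic_axes)
```

Outputs would need their own entries and names; whether a given detection model actually exports with dynamic shapes depends on the torchvision version.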
Exporting Faster R-CNN model:
...
device = torch.device('cuda')
model.to(device)
input = torch.randn((1, 3, 600, 600), device = device)
torch.onnx.export(model, input, 'model.onnx', opset_version = 11)
...
Error:
/usr/local/lib/python3.6/dist-packages/torchvision/models/detection/rpn.py in _get_top_n_idx(self, objectness, num_anchors_per_level)
372 pre_nms_top_n = min(self.pre_nms_top_n(), num_anchors)
373 _, top_n_idx = ob.topk(pre_nms_top_n, dim=1)
--> 374 r.append(top_n_idx + offset)
375 offset += num_anchors
376 return torch.cat(r, dim=1)
RuntimeError: expected device cuda:0 but got device cpu
I exported successfully in CPU mode. GPU not supported?
@danilopeixoto This error has been reported above. My solution was to export using the CPU. At inference time you can use CPU, GPU, TensorRT, etc., depending on the ONNX runtime you use. I'm using this one https://microsoft.github.io/onnxruntime/ and am very happy with it.
Hello Team,
I am trying to convert FasterRCNN to ONNX. I was able to export to ONNX successfully, but I am not able to run inference on any image. I also tried to export the model with a dynamic input size for the image, still with no luck.
I cannot find clear instructions on what the image data input to the model should be. I think I am messing up somewhere in the input to the model.
Below is the code I am using to export and run inference.
# **This piece of code is implemented from the test_onnx.py file**
def get_image_from_url(url, size=None):
    import requests
    from PIL import Image
    from io import BytesIO
    from torchvision import transforms as T
    data = requests.get(url)
    image = Image.open(BytesIO(data.content)).convert("RGB")
    if size is None:
        size = (300, 200)
    image = image.resize(size, Image.BILINEAR)
    # plt.imshow(image)
    # img = Image.open(img_path) # Load the image
    transform = T.Compose([T.ToTensor()]) # Define the PyTorch transform
    image = transform(image)
    # to_tensor = transforms.ToTensor()
    return image
def get_test_images():
    image_url = "http://farm3.staticflickr.com/2469/3915380994_2e611b1779_z.jpg"
    image = get_image_from_url(url=image_url, size=(100, 320))
    image_url2 = "https://pytorch.org/tutorials/_static/img/tv_tutorial/tv_image05.png"
    image2 = get_image_from_url(url=image_url2, size=(250, 380))
    images = image
    test_images = [image2]
    return images, test_images
images, test_images = get_test_images()
dummy_input = torch.randn(1, 3, 224, 224)
model_name = r"fasterrcnn_resnet50_fpn_dynamic_try4_with_image_input"
final_path = model_name + ".onnx"
dynamic_axes = {'input': [0, 2, 3], 'output': [0, 2, 3]}
torch.onnx.export(model, images.unsqueeze(0),final_path ,
do_constant_folding=True, opset_version=11,
dynamic_axes=dynamic_axes, input_names=['input'], output_names=['output'])
Below is the code I am using to use the onnx model to infer.
folder = my path
model_name = r"fasterrcnn_resnet50_fpn_dynamic_try4_with_image_input.onnx"
final_path = os.path.join(folder,model_name)
# Load the ONNX model
model_onnx = onnx.load(final_path)
# Check that the IR is well formed
onnx.checker.check_model(model_onnx)
# Print a human readable representation of the graph
print(onnx.helper.printable_graph(model_onnx.graph))
import onnxruntime as rt
sess = rt.InferenceSession(final_path)
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
pred = sess.run([output_name], {input_name:images})
This crashes my kernel and restarts it, and I don't know why. I think I am getting the input type and dimensions wrong.
Could you please help me get this ONNX model up and running?
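One thing worth checking in the inference code above: onnxruntime's run expects NumPy arrays as feed values, whereas images is a torch tensor, and the model was exported from a batched input (images.unsqueeze(0)) but is fed the unbatched one. Whether this explains the kernel crash is not confirmed here. A minimal sketch of the conversion, with `to_ort_feed` as a hypothetical helper name:

```python
def to_ort_feed(input_name, tensor, add_batch_dim=False):
    """Build the feed dict for InferenceSession.run from a torch tensor."""
    if add_batch_dim:
        # Mirror the images.unsqueeze(0) used at export time.
        tensor = tensor.unsqueeze(0)
    # detach() drops autograd state, cpu() copies off the GPU if needed,
    # numpy() yields the array type onnxruntime accepts.
    return {input_name: tensor.detach().cpu().numpy()}


# Usage with the names from the snippet above:
# pred = sess.run([output_name], to_ort_feed(input_name, images, add_batch_dim=True))
```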
Attached is the graph log for the converted model. Let me know if I am missing somewhere in the conversion procedure as well. torchvision_frcnn_try4_dynamic_onnx_log.docx
Thanks a lot!!
Torchvision Faster R-CNN model does not support dynamic input shape according to documentation.
Faster R-CNN is exportable to ONNX for a fixed batch size with inputs images of fixed size.
@danilopeixoto Dynamic shape support should now work on ONNX if you are using a very recent torchvision nightly.
Hey @danilopeixoto, @fmassa Thank you for the suggestions. But I am still not able to get any output for either a fixed or a dynamic input image.
Could you please have a look at the code and let me know where I am going wrong?
Also, I tried to run test_faster_rcnn from the latest test_onnx.py file (here) and I got the following error.
As I am a newbie here, I don't exactly know what this error means:
log:
>>> test_object = test_onnx.ONNXExporterTester()
>>> test_object.test_faster_rcnn()
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torch\nn\functional.py:2854: UserWarning: The default behavior for interpolate/upsample with float scale_factor will change in 1.6.0 to align with other frameworks/libraries, and use scale_factor directly, instead of relying on the computed output size. If you wish to keep the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
warnings.warn("The default behavior for interpolate/upsample with float scale_factor will change "
..\torch\csrc\utils\python_arg_parser.cpp:756: UserWarning: This overload of nonzero is deprecated:
nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
nonzero(Tensor input, *, bool as_tuple)
..\aten\src\ATen\native\BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torchvision\models\detection\rpn.py:164: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
torch.tensor(image_size[1] / g[1], dtype=torch.int64, device=device)] for g in grid_sizes]
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torch\tensor.py:467: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
'incorrect results).', category=RuntimeWarning)
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torchvision\ops\boxes.py:117: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
boxes_x = torch.min(boxes_x, torch.tensor(width, dtype=boxes.dtype, device=boxes.device))
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torchvision\ops\boxes.py:119: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
boxes_y = torch.min(boxes_y, torch.tensor(height, dtype=boxes.dtype, device=boxes.device))
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torchvision\models\detection\transform.py:217: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
for s, s_orig in zip(new_size, original_size)
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torch\onnx\symbolic_opset9.py:2115: UserWarning: Exporting aten::index operator of advanced indexing in opset 11 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
"If indices include negative values, the exported graph will produce incorrect results.")
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torch\onnx\utils.py:915: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input images_tensors
'Automatically generated names will be applied to each dynamic axes of input {}'.format(key))
C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torch\onnx\utils.py:915: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input outputs
'Automatically generated names will be applied to each dynamic axes of input {}'.format(key))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\test_onnx.py", line 357, in test_faster_rcnn
tolerate_small_mismatch=True)
File "C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\test_onnx.py", line 49, in run_model
self.ort_validate(onnx_io, test_inputs, test_ouputs, tolerate_small_mismatch)
File "C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\test_onnx.py", line 71, in ort_validate
torch.testing.assert_allclose(outputs[i], ort_outs[i], rtol=1e-03, atol=1e-05)
File "C:\Users\msjmf59\Documents\VirtualEnvironments\pytorch_gpu2\Lib\site-packages\torch\testing\__init__.py", line 24, in assert_allclose
expected = expected.expand_as(actual)
RuntimeError: The expanded size of the tensor (52) must match the existing size (54) at non-singleton dimension 0. Target sizes: [52, 4]. Tensor sizes: [54, 4]
Thanks a lot!
cc @neginraoof if you could have a look
Hi @fmassa, is there any tool I can use to convert the Faster R-CNN ONNX model to TensorRT? I have tried onnx-tensorrt, but it failed when converting NMS. Thanks
Hi. I am also experiencing the original RuntimeError: expected device cuda:0 but got device cpu message when JIT tracing any rcnn model (on the same line as OP). Strangely, the first iteration through for ob in objectness.split(num_anchors_per_level, 1) succeeds, but the second one fails.
I am working on a JIT related project that requires the model to be traced on the gpu, so the export on cpu workaround does not apply to me. I am not concerned with ONNX right now, only Torchscript. Is there a timeline on a fix for this? Even guidance on how to fix this myself would be appreciated. @fmassa
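Based on the diagnosis earlier in the thread (shape_as_tensor yields a CPU tensor during tracing), one possible local workaround is to move the offset onto the index tensor's device before the addition. This is only a sketch of the idea, not the fix that landed in torchvision, and `add_offset_on_same_device` is a hypothetical helper name. It may also account for the first iteration succeeding: if offset starts out as a plain Python int and only becomes a CPU tensor after offset += num_anchors, the mismatch would first show up in the second iteration.

```python
def add_offset_on_same_device(top_n_idx, offset):
    """Add `offset` to `top_n_idx`, first moving `offset` to the same device
    when it is a tensor (plain Python ints are left untouched)."""
    if hasattr(offset, "to"):
        offset = offset.to(top_n_idx.device)
    return top_n_idx + offset


# Inside a patched copy of _get_top_n_idx this would replace
#     r.append(top_n_idx + offset)
# with
#     r.append(add_offset_on_same_device(top_n_idx, offset))
```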
Hi,
I was able to export the model to ONNX and it was working fine, but now only empty detections are returned. I've tried to downgrade the package versions, change the opset, check the code for changes.
Model inference using PyTorch still works.
Has anyone experienced this issue exporting Faster RCNN model to ONNX?
I replaced the dummy input:
input_data = [torch.rand((3, 600, 600), device = cpu_device)]
with:
input_data = [torch.randn((3, 600, 600), device = cpu_device)]
and it worked.
This issue may be related to Export object detection model to ONNX:empty output by ONNX inference.
Hi @danilopeixoto, do I get this right that your only change was from rand to randn?
I am experiencing the same issue.
@FraPochetti Yes, that was the only change in the code.
@danilopeixoto Thanks! Do you happen to have the code snippet you used by any chance? (as you probably did yourself) I tried a ton of things and nothing is really working, yours included, unfortunately. Maybe I am doing something really silly somewhere else.
Thanks @danilopeixoto, it worked! Moving the model and dummy input to the CPU and then converting to ONNX solved my problem. Later I was able to run inference on the GPU with onnxruntime's 'CUDAExecutionProvider'. 👍
Hi @Ratansairohith, great that you found a way!
Would you be so kind as to share the entire code snippet you used to get mask_rcnn to work?
Reading your comment, I just retried on my own but I don't seem to get it right.
This is the Colab notebook I have put together, for reference.
Hey @FraPochetti! Yeah, my code was pretty straightforward. I trained my own Faster R-CNN on a custom dataset, just like the official PyTorch tutorial https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html. Then I used the following code snippet to convert my torch model to ONNX.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator
import torch
num_classes = 10
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
model.load_state_dict(torch.load('/content/drive/MyDrive/images/fasterrcnn_resnet50_fpn_9.pth'))
model.eval()
# set device to cpu
cpu_device = torch.device('cpu')
x = [torch.randn((3, 384, 384), device = cpu_device)]
model.to(cpu_device)
# finally convert pytorch model to onnx
torch.onnx.export(model, x , "faster_rcnn_9.onnx", verbose=True, do_constant_folding=True, opset_version=11)
Thanks a lot for the reply.
I see you are using faster_rcnn. That worked for me too.
mask_rcnn is the one which is causing me trouble :)
Hello @fmassa, @lara-hdr.
Thank you for your work. I was wondering if there is a way to export the Faster R-CNN model without the transformation layers and with 2 static output tensors (boxes and scores)? Any pointers on where to dig would be appreciated.
Thank you kindly.
Hi there, I tried to convert a fasterrcnn model to ONNX format, following the instructions from test/test_onnx.py https://github.com/pytorch/vision/blob/master/test/test_onnx.py.
Here is my code:
model=models.detection.faster_rcnn.fasterrcnn_resnet50_fpn(pretrained=True,min_size=800,max_size=1333)
image=cv2.imread("test.jpg")
image=cv2.resize(image,(1333,800))
image1 = Image.fromarray(cv2.cvtColor(image.copy(),cv2.COLOR_BGR2RGB))
image_tensor=to_tensor(image1)
model.eval()
onnx_io = io.BytesIO()
torch.onnx.export(model, [image_tensor], "faster_rcnn.onnx",do_constant_folding=True, opset_version=_onnx_opset_version)
I have successfully converted the model with the above code. However, when I tried to move the tensor and model to CUDA with .to(device), I got this error: line 359, in _get_top_n_idx r.append(top_n_idx + offset) RuntimeError: expected device cuda:0 but got device cpu. I don't know how to solve it. Please help me with that.
Cheers!