glenn-jocher opened this issue 4 years ago
I posted an issue on the PyTorch GitHub page too. They are working on a fix. Here is the issue: https://github.com/pytorch/pytorch/issues/45816. Looks like a PR is imminent.
@glenn-jocher Hi, I tried to convert the ONNX model to TensorRT, but it complains: No importer registered for op: ScatterND. Could you tell me where the ScatterND op comes from in yolo and how to replace it with an op that TensorRT supports?
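For context: the in-place slice assignments in Detect.forward (the two lines quoted further down in this thread) are what typically export as ScatterND. A minimal sketch of a cat-based rewrite that avoids the op, assuming the same decode math (the helper name is hypothetical):

```python
import torch

def decode_without_scatter(y, grid, stride, anchor_grid):
    # same math as Detect.forward, but assembled with torch.cat instead of
    # in-place slice assignment, so no ScatterND appears in the ONNX graph
    xy = (y[..., 0:2] * 2. - 0.5 + grid) * stride
    wh = (y[..., 2:4] * 2) ** 2 * anchor_grid
    return torch.cat((xy, wh, y[..., 4:]), -1)
```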
@glenn-jocher Is there a reason why the input size is fixed when doing an ONNX export if the anchors and grid offsets aren't applied? I can understand either:
* fixed input size + apply anchors + apply grid offsets + permute dimensions + concat dimensions, OR
* dynamic input size + output conv layers immediately preceding yolo layers

In other words I would expect either:

* input dim == [1,3,640,640], output dim == [1,25200,85], OR
* input dim == [1,3,height,width], output dims == [1,3,height/32,width/32,85], [1,3,height/16,width/16,85], [1,3,height/8,width/8,85]
This is what I do for yolov3 models defined in pytorch and it works a dream.
I changed the export line to:
torch.onnx.export(model, img, f, verbose=False, export_params=True, opset_version=12,
input_names=['img'],
output_names=['out1', 'out2', 'out3'],
dynamic_axes={'img': [0,2,3], 'out1': [0,2,3], 'out2': [0,2,3], 'out3': [0,2,3]})
@glenn-jocher What was the motivation behind this:
y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy
y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh
That's very different from yolov3 and yolov4.
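For comparison, a hedged side-by-side of the two decode schemes (the yolov3 form is the standard one from the paper; the bounds are the usual reading of why the v5 form was chosen):

```python
# YOLOv3-style decode:
#   xy = sigmoid(t_xy) + grid                        # offset in [0, 1] per cell
#   wh = exp(t_wh) * anchors                         # unbounded exponential
# YOLOv5-style decode (the lines quoted above):
#   xy = (sigmoid(t_xy) * 2 - 0.5 + grid) * stride   # offset in [-0.5, 1.5]
#   wh = (sigmoid(t_wh) * 2) ** 2 * anchor_grid      # bounded in (0, 4 * anchor]
```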
@Ezra-Yu yes that is correct. You are free to set it to False if that suits you better.
But setting this to False throws an error during the CoreML conversion. @dlawrences can you help me out regarding this?
@dlawrences I get the same empty error as well. I changed nothing other than setting export to False.
model.model[-1].export = False
Printed the traceback of the error:
Converting Frontend ==> MIL Ops:  89%|████████▉ | 970/1084 [00:00<00:00, 1678.45 ops/s]
CoreML export failure:
Traceback (most recent call last):
File "models/export.py", line 86, in <module>
model = ct.convert(ts, inputs=[ct.ImageType(name='image', shape=img.shape, scale=1 / 255.0, bias=[0, 0, 0])])
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 176, in convert
mlmodel = mil_convert(
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 128, in mil_convert
proto = mil_convert_to_proto(model, convert_from, convert_to,
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 171, in mil_convert_to_proto
prog = frontend_converter(model, **kwargs)
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 85, in __call__
return load(*args, **kwargs)
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 85, in load
raise e
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 75, in load
prog = converter.convert()
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 224, in convert
convert_nodes(self.context, self.graph)
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 56, in convert_nodes
_add_op(context, node)
File "/home/dogus/environments/work3.8/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 1612, in select
assert _input.val is None
AssertionError
Ubuntu 18.04
Cuda 10.0
I installed PyTorch for GPU
torch==1.6.0+cu101 torchvision==0.7.0+cu101
Installed packages in the Python3.8.0 environment that I use:
attr==0.3.1
attrs==20.2.0
certifi==2020.6.20
coremltools==4.0
cycler==0.10.0
Cython==0.29.21
future==0.18.2
joblib==0.17.0
kiwisolver==1.2.0
matplotlib==3.3.2
mpmath==1.1.0
numpy==1.19.2
onnx==1.7.0
opencv-python==4.4.0.44
packaging==20.4
Pillow==8.0.1
pkg-resources==0.0.0
protobuf==3.13.0
pyparsing==2.4.7
python-dateutil==2.8.1
PyYAML==5.3.1
scikit-learn==0.23.2
scipy==1.5.3
six==1.15.0
sympy==1.6.2
threadpoolctl==2.1.0
torch==1.6.0+cu101
torchvision==0.7.0+cu101
tqdm==4.51.0
typing-extensions==3.7.4.3
There are many warnings when converting to ONNX:
Warning: ATen was a removed experimental ops. In the future, we may directly reject this operator. Please update your model as soon as possible.
but the ONNX export succeeds.
However, when converting the ONNX model to OpenVINO, this warning becomes an error:
[ ERROR ] Cannot infer shapes or values for node "ATen_470".
[ ERROR ] There is no registered "infer" function for node "ATen_470" with op = "ATen". Please implement this function in the extensions.
For more information please refer to Model Optimizer FAQ (https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html), question #37.
[ ERROR ]
[ ERROR ] It can happen due to bug in custom shape infer function .
[ ERROR ] Or because the node inputs have incorrect values/shapes.
[ ERROR ] Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ] Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ] Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "ATen_470" node.
For more information please refer to Model Optimizer FAQ (https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html), question #38.
Can anyone help me? Thank you very much!
Set self.training = False in Detect() before you export the model; then you can get the same output as the original model.
You can work around it by manually editing the yolo.py file inside the exported torchscript archive

Same here; how do I modify the torchscript file, since it's a binary file?
The torchscript file is just a zip file; open it with 7-Zip or WinRAR.
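A minimal sketch of the same unpack step in Python, assuming the standard library's zipfile (the file and directory names are illustrative):

```python
import zipfile

# a torchscript archive is an ordinary zip; unpack it, edit
# code/__torch__/models/yolo.py, then zip the tree back up
with zipfile.ZipFile('yolov5s.torchscript.pt') as zf:
    zf.extractall('ts_unpacked')
```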
@waicool20
Just exporting the torchscript with map_location=torch.device('cuda') will solve the problem:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
But this has another problem: the torchscript can then only run on a CUDA device, not on CPU. Maybe I should figure out why there are constant values while tracing the model, or how torchscript could make it compatible across devices.
I changed the code to export with dynamic batching, but it doesn't work.
dynamic_axes={'images': {0: 'batch_size'}, 'output': {0: 'batch_size'}, '781': {0: 'batch_size'}, '801': {0: 'batch_size'}}
f = opt.weights.replace('.pt', '.onnx') # filename
# torch.onnx.export(model, img, f, verbose=False, opset_version=12, input_names=['images'],
# output_names=['classes', 'boxes'], dynamic_axes=dynamic_axes if y is None else ['output'])
torch.onnx.export(model, img, f, verbose=False, opset_version=12, input_names=['images'],
output_names=['output'], dynamic_axes=dynamic_axes)
Can anyone help me get a batched model?
Hi @HoangTienDuc, I added a dynamic batching inference example in the notebooks.
@zhiqwang thanks, I got it.
If you are getting an error such as this during inference (device mismatch):

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/models/yolo.py", line 47, in <fused code>
    _38 = (_5).forward(_37, )
    _39 = (_3).forward((_4).forward(_37, ), _30, )
    _40 = (_0).forward((_1).forward((_2).forward(_39, ), ), _38, _35, )
          ~~~~~~~~~~~ <--- HERE
    _41, _42, _43, _44, = _40
    return (_44, [_41, _42, _43])
  File "code/__torch__/models/yolo.py", line 73, in forward
    _52 = torch.sub(_51, CONSTANTS.c2, alpha=1)
    _53 = torch.to(CONSTANTS.c3, dtype=6, layout=0, device=torch.device("cpu"), pin_memory=False, non_blocking=False, copy=False, memory_format=None)
    _54 = torch.mul(torch.add(_52, _53, alpha=1), torch.select(CONSTANTS.c4, 0, 0))
          ~~~~~~~~~ <--- HERE
    _55 = torch.slice(y, 4, 0, 2, 1)
    _56 = torch.expand(torch.view(_54, [3, 20, 20, 2]), [1, 3, 20, 20, 2], implicit=True)
Traceback of TorchScript, original code (most recent call last):
  C:\Users\waicool20\Programming\python\yolov5\models\yolo.py(34): forward
  C:\Python38\lib\site-packages\torch\nn\modules\module.py(534): _slow_forward
  C:\Python38\lib\site-packages\torch\nn\modules\module.py(548): __call__
  C:\Users\waicool20\Programming\python\yolov5\models\yolo.py(117): forward_once
  C:\Users\waicool20\Programming\python\yolov5\models\yolo.py(97): forward
  C:\Python38\lib\site-packages\torch\nn\modules\module.py(534): _slow_forward
  C:\Python38\lib\site-packages\torch\nn\modules\module.py(548): __call__
  C:\Python38\lib\site-packages\torch\jit\__init__.py(1027): trace_module
  C:\Python38\lib\site-packages\torch\jit\__init__.py(873): trace
  ./models/export.py(35): <module>
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
You can work around it by manually editing the yolo.py file inside the exported torchscript archive:

def forward(self: __torch__.models.yolo.Detect,
            argument_1: Tensor,
            argument_2: Tensor,
            argument_3: Tensor) -> Tuple[Tensor, Tensor, Tensor, Tensor]:
  dev = argument_1.device  # <--- Add this line
  _45 = self.anchor_grid
  bs = ops.prim.NumToTensor(torch.size(argument_1, 0))

Then replace all references to torch.device("cpu") with dev.
Not sure if there's a better way to export it so it does something like this by default :/
This worked perfectly!
The torch-traced model has the same issue with model = model.half(), as some constants remain in FP32. Have you tried solving it?
Exporting the model in FP16 isn't possible because some constants remain in FP32.
I just changed the code like this:

model = attempt_load(opt.weights, map_location=torch.device('cuda:0'))  # load FP32 model
img = torch.zeros(opt.batch_size, 3, *opt.img_size).to(device='cuda:0')

then ran:

python models/export.py --weights ./weights/yolov5s.pt --img 640 --batch 1

and the error is:

RuntimeError: CUDA error: out of memory

Could you help me?
@waicool20 Just exporting the torchscript with map_location=torch.device('cuda') solves the problem: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! But this brings another problem: the torchscript can then only run on a CUDA device, not on CPU. Maybe I should figure out why there are constant values while tracing the model, or make the torchscript device-compatible.

When I export the GPU model, the following error is reported. May I ask why?

(pytorch1_6) F:\Pytorch_Project\yolov5_11_27\yolov5-master>python models/export.py --weights ./weights/yolov5s.pt --img 640 --batch 1
Namespace(batch_size=1, img_size=[640, 640], weights='./weights/yolov5s.pt')
Traceback (most recent call last):
  File "models/export.py", line 37, in <module>
    model = attempt_load(opt.weights, map_location=torch.device('cuda:0'))  # load FP32 model
  File ".\models\experimental.py", line 137, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
  File "D:\Anaconda3\envs\pytorch1_6\lib\site-packages\torch\serialization.py", line 584, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "D:\Anaconda3\envs\pytorch1_6\lib\site-packages\torch\serialization.py", line 842, in _load
    result = unpickler.load()
  File "D:\Anaconda3\envs\pytorch1_6\lib\site-packages\torch\serialization.py", line 834, in persistent_load
    load_tensor(data_type, size, key, _maybe_decode_ascii(location))
  File "D:\Anaconda3\envs\pytorch1_6\lib\site-packages\torch\serialization.py", line 823, in load_tensor
    loaded_storages[key] = restore_location(storage, location)
  File "D:\Anaconda3\envs\pytorch1_6\lib\site-packages\torch\serialization.py", line 803, in restore_location
    return default_restore_location(storage, str(map_location))
  File "D:\Anaconda3\envs\pytorch1_6\lib\site-packages\torch\serialization.py", line 174, in default_restore_location
    result = fn(storage, location)
  File "D:\Anaconda3\envs\pytorch1_6\lib\site-packages\torch\serialization.py", line 156, in _cuda_deserialize
    return obj.cuda(device)
  File "D:\Anaconda3\envs\pytorch1_6\lib\site-packages\torch_utils.py", line 77, in _cuda
    return newtype(self.size()).copy(self, non_blocking)
RuntimeError: CUDA error: out of memory
I ran export.py to export a .pt model to TorchScript, but I got this error:
Namespace(batch_size=1, img_size=[640, 640], weights='weights/yolov5s.pt')
Fusing layers...
Model Summary: 232 layers, 7459581 parameters, 0 gradients
Starting TorchScript export with torch 1.7.0...
./models/yolo.py:53: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if self.grid[i].shape[2:4] != x[i].shape[2:4]:
/home/yasin/yolo/lib/python3.8/site-packages/torch/jit/_trace.py:934: TracerWarning: Encountering a list at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for `list`, use a `tuple` instead. for `dict`, use a `NamedTuple` instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.
module._c._create_method_from_trace(
TorchScript export success, saved as weights/yolov5s.torchscript.pt
And using this saved torchscript in my C++ program with libtorch produces this error:
terminate called after throwing an instance of 'std::runtime_error'
what(): The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/__torch__/models/yolo.py", line 45, in forward
_35 = (_4).forward(_34, )
_36 = (_2).forward((_3).forward(_35, ), _29, )
_37 = (_0).forward(_33, _35, (_1).forward(_36, ), )
~~~~~~~~~~~ <--- HERE
_38, _39, _40, _41, = _37
return (_41, [_38, _39, _40])
File "code/__torch__/models/yolo.py", line 75, in forward
_52 = torch.sub(_51, CONSTANTS.c3, alpha=1)
_53 = torch.to(CONSTANTS.c4, dtype=6, layout=0, device=torch.device("cpu"), pin_memory=None, non_blocking=False, copy=False, memory_format=None)
_54 = torch.mul(torch.add(_52, _53, alpha=1), torch.select(CONSTANTS.c5, 0, 0))
~~~~~~~~~ <--- HERE
_55 = torch.slice(y, 4, 0, 2, 1)
_56 = torch.expand(torch.view(_54, [3, 80, 80, 2]), [1, 3, 80, 80, 2], implicit=True)
Traceback of TorchScript, original code (most recent call last):
./models/yolo.py(57): forward
/home/yasin/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
/home/yasin/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
./models/yolo.py(137): forward_once
./models/yolo.py(121): forward
/home/yasin/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
/home/yasin/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
/home/yasin/yolo/lib/python3.8/site-packages/torch/jit/_trace.py(934): trace_module
/home/yasin/yolo/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
models/export.py(57): <module>
RuntimeError: The size of tensor a (48) must match the size of tensor b (80) at non-singleton dimension 2
Can anyone help me?
I already exported the yolov5l model to ONNX with dynamic batching. But when I run my ONNX model, it doesn't use the GPU, only the CPU. Can anyone help me solve this problem?
I have got some errors with this export.py script. Please guide me.
https://github.com/ultralytics/yolov5/issues/1554#issue-753020980
I already exported the yolov5l model to ONNX with dynamic batching. But when I run my ONNX model, it doesn't use the GPU, only the CPU. Can anyone help me solve this problem?
Use onnxruntime-gpu maybe?
@MarcoCBA I have tried onnxruntime-gpu many times. I think it is not easy; see #1559.
I get an error both on Ubuntu and OS X when exporting: CoreML export failure: unexpected number of inputs for node x.2 (_convolution): 13
What part of the code should I modify to make the CoreML export work?
@nobody-cheng, @luvwinnie, is dynamic batch size working for you? I tried doing the same: I used batch_size=16 while exporting, then tried to infer with batch_size=32, but I get the following error at inference time:
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'Reshape_931' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:43 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape&, std::vector<long int>&) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{32,3,64,64,2}, requested shape:{16,3,64,64,2}
For me personally it works:

def inference(devices, img_paths, index, batchsize):
    os.environ['CUDA_VISIBLE_DEVICES'] = str(devices)
    features = list()
    ort_sess = onnxruntime.InferenceSession(args.model_path)
    input_name = ort_sess.get_inputs()[0].name
    images = list()
    for img_path in img_paths:
        image = preprocess(img_path, args.height, args.width)
        if len(images) < batchsize:
            images.append(image)
            continue
        else:
            intput_image = np.concatenate(images)
            feat = ort_sess.run(None, {input_name: intput_image})[0]
            feat = normalize(feat, axis=1)
            for i in feat:
                features.append(i)
            images.clear()
That's really a FAKE batch inference; your model can only accept one specific batch size. If the batch size is 16 but you have only 1 image, you would have to append 15 images to the input tensor, which is not necessary. One method to make yolov5 support dynamic batch inference is through the autoshape function; with some modification it can also be exported to ONNX.
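For illustration, a minimal sketch of true dynamic-batch inference with ONNX Runtime, assuming the model was exported with a dynamic batch axis (e.g. dynamic_axes={'images': {0: 'batch'}}); the file name is illustrative:

```python
import numpy as np
import onnxruntime

sess = onnxruntime.InferenceSession('yolov5s.onnx')
input_name = sess.get_inputs()[0].name

# with a dynamic batch axis, any leading dimension is accepted as-is,
# so there is no need to pad the batch up to a fixed size
batch = np.random.rand(5, 3, 640, 640).astype(np.float32)
outputs = sess.run(None, {input_name: batch})
print(outputs[0].shape)  # leading dim follows the input batch (5)
```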
@agorskih any updates? I am facing a similar error.
I am getting this error while converting to ONNX (coremltools = 4.0, ONNX = 1.7.0):

Adding op '178' of type const
Converting op 179 : listconstruct
Adding op '179' of type const
Converting op x.2 : _convolution
Converting Frontend ==> MIL Ops:   2%|▏ | 23/932 [00:00<00:00, 983.84 ops/s]
CoreML export failure: unexpected number of inputs for node x.2 (_convolution): 13
Export complete (9.95s). Visualize with https://github.com/lutzroeder/netron.
(yolo) arslan@MacBook-Pro yolov5 %
@hfzarslan Please use torch 1.6.0, it works:

pip uninstall torch
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
Does this still work with Yolo v3 tiny?
I get an error both in Ubuntu and OS X when exporting:
CoreML export failure: unexpected number of inputs for node x.2 (_convolution): 13
What part of the code should I modify to make the CoreML export work?
There may be a problem with your version of torch.
The ONNX export passes, but onnx-sim gives a segmentation error.
Has anyone tried to support variable input size for the torchscript model? I'm currently trying to make variable input and output work for NVIDIA's Triton Inference Server torchscript model usage.
I see the code uses torch.jit.trace to trace the outputs. It seems we could use torch.jit.script for variable inputs/outputs in the torchscript model, however it shows the errors below. Can anyone help?
TorchScript export failure:
Tried to access nonexistent attribute or method '__add__' of type '__torch__.utils.activations.Hardswish'. Did you forget to initialize an attribute in __init__()?:
File "./utils/activations.py", line 17
def forward(x):
# return x * F.hardsigmoid(x) # for torchscript and CoreML
return x * F.hardtanh(x + 3, 0., 6.) / 6. # for torchscript, CoreML and ONNX
~~~~~ <--- HERE
Hi @luvwinnie, I've supported dynamic batch inference with torchscript (torch.jit.script) and onnxruntime in my own repo; maybe you could refer to this.
@zhiqwang Thank you so much! I would like to test with your repo on Triton Inference Server (TRT Server). However, I'm facing a fixed-size model problem with TRT Server; I have opened an issue on the triton-inference-server GitHub so that we can solve the deployment problem for yolov5!
Hi @luvwinnie, I didn't test it on TRT Server. My yolov5rt follows the structure of torchvision's faster-rcnn and retinanet; if you can deploy torchvision's models successfully, yolov5rt should also work. And I will track the issue you mention and check what I can do here.
@zhiqwang Thank you so much. Currently I'm trying to export a torchscript model with your code. Do you have an example for a custom model? I modified your yolov5s.yaml nc parameter to my model's nc, and changed your trace_model.py script to the following:
model = yolov5s(pretrained=False)
model.eval()
model = model.load_state_dict(torch.load("custom.pt", map_location="cpu"))
traced_model = torch.jit.script(model)
traced_model.save("./yolov5s.torchscript.pt")
However, it shows the following error:
AttributeError: Can't get attribute 'Model' on <module 'models.yolo' from '/Users/test_user/yolov5-rt-stack/models/yolo.py'>
@luvwinnie Sure, there are minor differences between ultralytics's yolov5 and my yolov5rt; here is a guide to convert ultralytics/yolov5 to yolov5rt.
@zhiqwang It seems I have some errors with your repo and my models. It seems it can't convert the weights properly.
RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at non-singleton dimension 0
@luvwinnie, would it be convenient for you to send me your model weights, so that I can reproduce this bug? My email address is me@zhiqwang.com.
CUDA: 10.2 CUDNN: 7.6.5 Triton Inference Server(Docker): nvcr.io/nvidia/tritonserver:20.12-py3
The model file must be named model.onnx and the config file must be named config.pbtxt:

models/
└── model_onnx
    ├── 1
    │   └── model.onnx
    └── config.pbtxt
Run it with the following command:
$ docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v`pwd`/models:/models nvcr.io/nvidia/tritonserver:20.12-py3 tritonserver --model-repository=/models --strict-model-config=false --log-verbose 5
torch==1.6.0
torchvision==0.7.0
onnx==1.7.0
onnxruntime-gpu==1.4.0
tritonclient==2.6.0
The Triton inference client package can be installed with the following command:

$ pip install tritonclient[all]
For people who want to export an ONNX model with variable input length and variable outputs for Triton Inference Server: you can export a GPU FP16 ONNX model by changing the following lines in export.py.
# export.py
...
model.model[-1].export = True # set Detect() layer export=True
model.cuda()
model.half()
y = model(img.cuda().half()) # dry run
# y = model(img) # dry run
# [print(x.shape) for x in y]
# TorchScript export
...
# ONNX export
try:
import onnx
print('\nStarting ONNX export with onnx %s...' % onnx.__version__)
f = opt.weights.replace('.pt', '.onnx') # filename
torch.onnx.export(model, img.cuda().half(), f, verbose=False, opset_version=12, input_names=['input'],
output_names=['head1', 'head2',"head3"],dynamic_axes={
'input':{0:"batch_size",2:"height",3:"width"},
'head1':{0:"batch_size",2:"a",3:"b",4:"c"},
'head2':{0:"batch_size",2:"a",3:"b",4:"c"},
'head3':{0:"batch_size",2:"a",3:"b",4:"c"}
})
...
And you can use the model on Triton Inference Server with this config file. For more about config.pbtxt, check the Triton Inference Server documentation.

# config.pbtxt
name: "model_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 8
input {
name: "input"
data_type: TYPE_FP16
format: FORMAT_NCHW
dims: [3,-1,-1]
}
output [
{name: "head1"
data_type: TYPE_FP16
dims: [3,-1,-1,-1]
},
{name: "head2"
data_type: TYPE_FP16
dims: [3,-1,-1,-1]
},
{name: "head3"
data_type: TYPE_FP16
dims: [3,-1,-1,-1]
}
]
instance_group [
{
count: 1
kind: KIND_GPU
}
]
An example Triton Inference Server client, which needs the NMS post-processing, is in this issue's demo.zip, which includes demo_onnx.py. demo_onnx.py ships with the wrong anchors; you need to change the anchor order as follows. The exported ONNX model (variable inputs and variable outputs) has been tested with this demo_onnx.py and works as expected.
# demo_onnx.py
...
# anchors = [[116, 90, 156, 198, 373, 326], [30, 61, 62, 45, 59, 119], [10, 13, 16, 30, 33, 23]] # 5s <-- remove this
anchors = [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]] # modify to this.
...
# trt_inference_client.py
import numpy as np
import tritonclient.http as httpclient
from PIL import Image

# preprocess: letterbox, HWC -> CHW, add batch dim, normalize to [0, 1]
image_src = Image.open("0.png")
resized = letterbox_image(image_src, (img_size_w, img_size_h))  # letterbox_image from demo_onnx.py
img_in = np.transpose(resized, (2, 0, 1)).astype(np.float16)  # HWC -> CHW
img_in = np.expand_dims(img_in, axis=0)
img_in /= 255.0

triton_client = httpclient.InferenceServerClient(url="localhost:8000", verbose=False)
input_name = 'input'
model_name = "model_onnx"
model_version = "1"
model_metadata = triton_client.get_model_metadata(model_name=model_name, model_version=model_version)
model_config = triton_client.get_model_config(model_name=model_name, model_version=model_version)
print(model_config)

inputs = []
inputs.append(httpclient.InferInput(input_name, img_in.shape, 'FP16'))  # shape of the prepared tensor
inputs[0].set_data_from_numpy(img_in, binary_data=True)  # binary_data must be True when using FP16
outputs = []
outputs.append(httpclient.InferRequestedOutput("head1"))
outputs.append(httpclient.InferRequestedOutput("head2"))
outputs.append(httpclient.InferRequestedOutput("head3"))

response = triton_client.infer(model_name, inputs=inputs, outputs=outputs)
print(response.as_numpy("head1").shape)
print(response.as_numpy("head2").shape)
print(response.as_numpy("head3").shape)
If I have time, I would like to make a PR adding usage documentation for Triton Inference Server.
Thank you so much! I will deploy the ONNX model on mobile devices!

Hello,
Can I see the code for deploying the model converted to .onnx on a mobile device?
Did you use ONNX Runtime?
Thank you.
Hello everyone! For the ONNX model, it outputs the following shapes for a (320,320) image, for example. I'm trying to reproduce exactly the same result as detect.py. Can anyone help me understand how the Detect() layer does the postprocessing on these three heads?
out_shape: torch.Size([1, 3, 40, 40, 6])
out_shape: torch.Size([1, 3, 20, 20, 6])
out_shape: torch.Size([1, 3, 10, 10, 6])
Let's say that for an image of (1,192,320,3), it outputs a ([1, 3780, 6]) at the Detect() layer. How do I postprocess the three heads into the correct ([1, 3780, 6])?
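A minimal decode sketch, assuming the standard yolov5 strides/anchors and the head math quoted earlier in this thread (yolo.py). For a (192, 320) input the grids are 24x40, 12x20 and 6x10, so 3*(960 + 240 + 60) = 3780 rows:

```python
import torch

strides = [8, 16, 32]  # standard yolov5 strides, largest grid first
anchors = torch.tensor([[[10, 13], [16, 30], [33, 23]],
                        [[30, 61], [62, 45], [59, 119]],
                        [[116, 90], [156, 198], [373, 326]]], dtype=torch.float32)

def decode(heads):
    # heads: list of raw (1, 3, ny, nx, no) outputs, largest grid first
    z = []
    for i, x in enumerate(heads):
        bs, na, ny, nx, no = x.shape
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        grid = torch.stack((xv, yv), 2).view(1, 1, ny, nx, 2).float()
        y = x.sigmoid()
        xy = (y[..., 0:2] * 2. - 0.5 + grid) * strides[i]              # xy, as in yolo.py
        wh = (y[..., 2:4] * 2) ** 2 * anchors[i].view(1, na, 1, 1, 2)  # wh, as in yolo.py
        z.append(torch.cat((xy, wh, y[..., 4:]), -1).view(bs, -1, no))
    return torch.cat(z, 1)  # (1, 3780, 6) for a (192, 320) input; run NMS on this
```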
Is it working with onnx.js?

Hi @FahriBilici, it could be working with onnx.js, but I didn't find a good example :(
When I tried to run:

python models/export.py --weights yolov5s.pt --img 640 --batch 1  # export at 640x640 with batch size 1

I got the following message:

AttributeError: Can't get attribute 'SiLU' on <module 'torch.nn.modules.activation' from '/opt/anaconda3/lib/python3.8/site-packages/torch/nn/modules/activation.py'>

How do I solve it?
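For context, nn.SiLU was only added in torch 1.7, so checkpoints saved with a newer torch fail to unpickle on older versions; upgrading torch is the clean fix, but a hedged workaround sketch is to register a drop-in shim before loading:

```python
import torch
import torch.nn as nn

# shim for torch < 1.7, where nn.SiLU does not exist yet
if not hasattr(nn, 'SiLU'):
    class SiLU(nn.Module):
        def forward(self, x):
            return x * torch.sigmoid(x)
    nn.SiLU = SiLU  # must run before torch.load() unpickles the checkpoint
```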
Converting Frontend ==> MIL Ops:   3%|▎ | 21/620 [00:00<00:00, 796.96 ops/s]
CoreML export failure: unexpected number of inputs for node x.2 (_convolution): 13

Does anyone know how to solve this problem? I am using the following command to export the model:

python models/export.py --weights yolov5s.pt --img 640 --batch 1  # export at 640x640 with batch size 1
@glenn-jocher I have a question, can you help me explain it? Why does the model take twice as much storage after export?
Hi @Olalaye
The trained models are saved in half (FP16) precision by default, so their storage is half that of the exported (FP32) models.
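A quick way to verify, as a sketch (run from inside the yolov5 repo so the pickled Model class resolves; the file name is illustrative):

```python
import torch

ckpt = torch.load('yolov5s.pt', map_location='cpu')
print(next(ckpt['model'].parameters()).dtype)
# torch.float16: trained checkpoints are stored in half precision, while
# export rebuilds FP32 weights unless --half is passed
```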
📚 This guide explains how to export a trained YOLOv5 🚀 model from PyTorch to ONNX and TorchScript formats. UPDATED 8 December 2022.
Before You Start
Clone repo and install requirements.txt in a Python>=3.7.0 environment, including PyTorch>=1.7. Models and datasets download automatically from the latest YOLOv5 release.
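For example:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install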
For a TensorRT export example (requires GPU) see our Colab notebook appendix section.
Formats
YOLOv5 inference is officially supported in 11 formats:
💡 ProTip: Export to ONNX or OpenVINO for up to 3x CPU speedup. See CPU Benchmarks.
💡 ProTip: Export to TensorRT for up to 5x GPU speedup. See GPU Benchmarks.
| Format | `export.py --include` | Model |
| --- | --- | --- |
| PyTorch | - | yolov5s.pt |
| TorchScript | `torchscript` | yolov5s.torchscript |
| ONNX | `onnx` | yolov5s.onnx |
| OpenVINO | `openvino` | yolov5s_openvino_model/ |
| TensorRT | `engine` | yolov5s.engine |
| CoreML | `coreml` | yolov5s.mlmodel |
| TensorFlow SavedModel | `saved_model` | yolov5s_saved_model/ |
| TensorFlow GraphDef | `pb` | yolov5s.pb |
| TensorFlow Lite | `tflite` | yolov5s.tflite |
| TensorFlow Edge TPU | `edgetpu` | yolov5s_edgetpu.tflite |
| TensorFlow.js | `tfjs` | yolov5s_web_model/ |
| PaddlePaddle | `paddle` | yolov5s_paddle_model/ |
Benchmarks
Benchmarks below run on a Colab Pro with the YOLOv5 tutorial notebook. To reproduce:
Colab Pro V100 GPU
Colab Pro CPU
Export a Trained YOLOv5 Model
This command exports a pretrained YOLOv5s model to TorchScript and ONNX formats.
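For reference, the corresponding documented command is:

python export.py --weights yolov5s.pt --include torchscript onnx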
yolov5s.pt is the 'small' model, the second-smallest model available. Other options are yolov5n.pt, yolov5m.pt, yolov5l.pt and yolov5x.pt, along with their P6 counterparts, e.g. yolov5s6.pt, or your own custom training checkpoint, e.g. runs/exp/weights/best.pt. For details on all available models please see our README table.

💡 ProTip: Add --half to export models at FP16 half precision for smaller file sizes

Output:
The 3 exported models will be saved alongside the original PyTorch model:
Netron Viewer is recommended for visualizing exported models:
Exported Model Usage Examples
detect.py runs inference on exported models:

val.py runs validation on exported models:

Use PyTorch Hub with exported YOLOv5 models:
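A combined sketch of those three usages, following the repo's documented patterns (the model file name and test image URL are illustrative):

```python
# CLI usage on an exported model:
#   python detect.py --weights yolov5s.onnx   # inference
#   python val.py --weights yolov5s.onnx      # validation
# PyTorch Hub usage with an exported model:
import torch

model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.onnx')
results = model('https://ultralytics.com/images/zidane.jpg')  # run inference
results.print()
```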
OpenCV DNN inference
OpenCV inference with ONNX models:
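A sketch of the documented pattern; --dnn tells detect.py and val.py to use OpenCV DNN as the ONNX inference backend:

python export.py --weights yolov5s.pt --include onnx
python detect.py --weights yolov5s.onnx --dnn
python val.py --weights yolov5s.onnx --dnn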
C++ Inference
YOLOv5 OpenCV DNN C++ inference on exported ONNX model examples:
YOLOv5 OpenVINO C++ inference examples:
TensorFlow.js Web Browser Inference
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.