apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

mxnet fails to import onnx model #18923

Open photoszzt opened 4 years ago

photoszzt commented 4 years ago

Description

MXNet fails to import an ONNX model exported by PyTorch. The model is from https://github.com/biubug6/Pytorch_Retinaface

Error Message


  File "/disk/zhitingz/video-pipeline/sprocket/platform/lambda_program/onnxruntime/detection/detect.py", line 139, in mx_init
    sym, arg_params, aux_params = onnx_mxnet.import_model(mpath)
  File "/disk/zhitingz/mxnet-cpu/lib/python3.6/site-packages/mxnet/contrib/onnx/onnx2mx/import_model.py", line 59, in import_model
    sym, arg_params, aux_params = graph.from_onnx(model_proto.graph)
  File "/disk/zhitingz/mxnet-cpu/lib/python3.6/site-packages/mxnet/contrib/onnx/onnx2mx/import_onnx.py", line 115, in from_onnx
    inputs = [self._nodes[i] for i in node.input]
  File "/disk/zhitingz/mxnet-cpu/lib/python3.6/site-packages/mxnet/contrib/onnx/onnx2mx/import_onnx.py", line 115, in <listcomp>
    inputs = [self._nodes[i] for i in node.input]
KeyError: '1'

To Reproduce

Get the exported ONNX model. I forked the repo above at https://github.com/photoszzt/Pytorch_Retinaface and added helper scripts that make this issue easier to reproduce.

  1. Download the pretrained PyTorch model and export it:
    cd weights
    ./download_all_pretrain.sh
    cd -
    ./export_models.sh
  2. Try to load the model with the following code:
    from mxnet.contrib import onnx as onnx_mxnet
    sym, arg_params, aux_params = onnx_mxnet.import_model('./mnet.0.25.onnx')

What have you tried to solve it?

  1. I tried printing node.input; it is ['input0', '1'] (see the inspection sketch below).
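For reference, a hedged sketch of how the exported graph could be inspected with the `onnx` package to check whether the missing key '1' is a weight stored as a graph initializer rather than listed among the graph inputs (the model path is taken from the reproduction step above; this is not code from the original report):

    import onnx

    model = onnx.load('./mnet.0.25.onnx')
    graph = model.graph

    input_names = {i.name for i in graph.input}
    initializer_names = {init.name for init in graph.initializer}

    # '1' appearing as an initializer but not as a graph input would explain
    # the KeyError, since onnx2mx only registers graph inputs and the outputs
    # of previously converted nodes in self._nodes.
    print("'1' in graph inputs:      ", '1' in input_names)
    print("'1' in graph initializers:", '1' in initializer_names)

    # Show which nodes reference '1' as an input.
    for node in graph.node:
        if '1' in node.input:
            print(node.op_type, list(node.input))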

photoszzt commented 4 years ago

I updated the issue with my forked repo, which contains helper scripts to make this issue easier to reproduce.

szha commented 4 years ago

cc @josephevans and @ChaiBapchya

TristonC commented 3 years ago

@Zha0q1 Has MXNet 1.x made any progress on this issue?

Zha0q1 commented 3 years ago

No, recent ONNX development has focused on exporting rather than importing.

TristonC commented 3 years ago

Thanks, @Zha0q1. Import might be more important when a user wants to compare performance across frameworks. @szha @sandeep-krishnamurthy Do we have any project to improve ONNX model importing?

Joe2loft commented 3 years ago

@photoszzt @Zha0q1 You can add the argument keep_initializers_as_inputs=True when you call torch.onnx.export.
However, after that I hit another problem: MXNet does not have a Resize op. I found this is because RetinaFace uses torch.nn.functional.interpolate, which ONNX exports as a Resize op, and the MXNet importer has no mapping for Resize. I tried to add one, but I am not familiar with MXNet.
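
For illustration, a minimal, hedged sketch of that export workaround. The stub model, output file name, input shape, and opset below are placeholders, not taken from the Pytorch_Retinaface export script; keep_initializers_as_inputs and opset_version are standard torch.onnx.export parameters:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Stub(nn.Module):
        # Stand-in for the real RetinaFace network: the conv gives the graph an
        # initializer (weight), and the interpolate call is what ONNX turns into
        # the Resize op that the MXNet importer does not support.
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

        def forward(self, x):
            x = self.conv(x)
            return F.interpolate(x, scale_factor=2, mode='nearest')

    torch.onnx.export(
        Stub().eval(),
        torch.randn(1, 3, 640, 640),           # assumed input shape
        'model_with_initializer_inputs.onnx',  # hypothetical output path
        opset_version=11,                      # assumed opset
        keep_initializers_as_inputs=True,      # expose weights as graph inputs so
                                               # onnx2mx finds them in self._nodes
    )

Note that keep_initializers_as_inputs=True only addresses the KeyError above; the missing Resize mapping in MXNet's importer remains a separate problem, as described in this comment.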