chainer / onnx-chainer

Add-on package for ONNX format support in Chainer
MIT License

NNVM example fails with `NNVMError` #152

Open hyabe opened 5 years ago

hyabe commented 5 years ago

When I run the NNVM example with TVM 0.5 (the latest release), ChainerCV 0.12.0, and Chainer 6.0.0rc1, it fails as follows:

Traceback (most recent call last):
  File "export.py", line 76, in <module>
    main()
  File "export.py", line 65, in main
    save_as_onnx_then_import_from_nnvm(model, 'vgg16.onnx')
  File "export.py", line 30, in save_as_onnx_then_import_from_nnvm
    sym, params = nnvm.frontend.from_onnx(model_onnx)
  File "/home/xxxxxxxx/anaconda3/envs/work/lib/python3.6/site-packages/nnvm/frontend/onnx.py", line 984, in from_onnx
    sym, params = g.from_onnx(graph, opset)
  File "/home/xxxxxxxx/anaconda3/envs/work/lib/python3.6/site-packages/nnvm/frontend/onnx.py", line 839, in from_onnx
    op = self._convert_operator(op_name, inputs, attr, opset)
  File "/home/xxxxxxxx/anaconda3/envs/work/lib/python3.6/site-packages/nnvm/frontend/onnx.py", line 940, in _convert_operator
    sym = convert_map[op_name](inputs, attrs, self._params)
  File "/home/xxxxxxxx/anaconda3/envs/work/lib/python3.6/site-packages/nnvm/frontend/onnx.py", line 102, in _impl_v1
    custom_check=dimension_constraint())(inputs, attr, params)
  File "/home/xxxxxxxx/anaconda3/envs/work/lib/python3.6/site-packages/nnvm/frontend/common.py", line 139, in __call__
    return get_nnvm_op(op_name)(*inputs, **new_attrs)
  File "/home/xxxxxxxx/anaconda3/envs/work/lib/python3.6/site-packages/nnvm/_ctypes/symbol.py", line 197, in creator
    ctypes.byref(sym_handle)))
  File "/home/xxxxxxxx/anaconda3/envs/work/lib/python3.6/site-packages/nnvm/_base.py", line 91, in check_call
    raise NNVMError(py_str(_LIB.NNGetLastError()))
nnvm._base.NNVMError: Cannot find argument 'storage_order', Possible Arguments:
----------------
pool_size : , required
    Size of the pooling windows..
strides : , optional, default=[1,1]
    Specifies the strides of the convolution.
padding : , optional, default=[0,0]
    If padding is non-zero, then the input is implicitly zero-paddedPadding support both symmetric and asymmetric asone int : same padding used on all sidestwo int : bottom, right will use same padding as top, leftfour int : padding width in the order of (top, left, bottom, right)
layout : string, optional, default='NCHW'
    Dimension ordering of data and weight. Can be 'NCHW', 'NHWC', etc.'N', 'C', 'H', 'W' stands for batch, channel, height, and widthdimensions respectively. Convolution is applied on the 'H' and'W' dimensions.
ceil_mode : boolean, optional, default=0
    When true, will use ceil instead of floor to compute the output shape.
, in operator max_pool2d(name="", ceil_mode="False", strides="(2, 2)", storage_order="0", padding="(0, 0)", pool_size="(2, 2)")

There seem to be several interrelated causes for this:

  1. NNVM lacks support for opsets newer than 1: ONNX-Chainer emits the storage_order attribute on the MaxPool operator, which was introduced in opset 8 and is unknown to NNVM.
  2. Passing opset_version=1 to onnx_chainer.export() avoids the error above, but it puts arithmetic operators (e.g. Mul) into non-broadcasting mode. ONNX-Chainer generates a broadcasting Mul for the last pooling layer of ResNet-50, which requires an explicit broadcast=1 in opset 1.
  3. NNVM's broadcasting arithmetic operators seem buggy: even when the Mul operator has broadcast=1, NNVM raises another exception for a tensor-by-scalar operation. (I am asking the TVM community about this problem.)
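To check whether an exported file is affected by the first cause, the graph can be scanned for MaxPool nodes that carry storage_order. The following is a minimal sketch, not part of ONNX-Chainer or NNVM: the helper names `find_nodes_with_attribute` and `declared_opsets` are hypothetical, and the code assumes only the standard ONNX protobuf fields (`graph.node`, `op_type`, `attribute`, `opset_import`).

```python
def find_nodes_with_attribute(model, op_type, attr_name):
    """Return the names of nodes of `op_type` that carry `attr_name`.

    `model` is expected to look like an ONNX ModelProto. Unnamed nodes are
    reported by their position in the graph, e.g. "#3".
    """
    hits = []
    for index, node in enumerate(model.graph.node):
        if node.op_type != op_type:
            continue
        if any(attr.name == attr_name for attr in node.attribute):
            hits.append(node.name or f"#{index}")
    return hits


def declared_opsets(model):
    """Map each operator-set domain to its declared version.

    The empty string is the default ONNX domain.
    """
    return {entry.domain: entry.version for entry in model.opset_import}
```

With the onnx package installed, `find_nodes_with_attribute(onnx.load('vgg16.onnx'), 'MaxPool', 'storage_order')` should list the nodes that trigger the error, and `declared_opsets(...)` shows which opset the exporter actually used.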
disktnk commented 5 years ago

Thank you for the report, and sorry for the late reply.

I could reproduce the same error. Setting the opset_version=7 option in the export call resolves it for the VGG example. Please add this option when exporting for the TVM runtime.
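As a rough illustration of that workaround, the export call could be wrapped as below. This is a sketch only: `export_for_nnvm` and `dummy_input` are hypothetical names, and it assumes an onnx-chainer version whose `export()` accepts `filename` and `opset_version` keyword arguments.

```python
def export_for_nnvm(model, dummy_input, filename, opset_version=7):
    """Export a Chainer model to ONNX with a pinned opset version.

    opset 7 predates the MaxPool `storage_order` attribute (added in
    opset 8), so the exported graph should not contain it.
    """
    # Imported lazily so the helper can be defined without onnx-chainer installed.
    import onnx_chainer

    return onnx_chainer.export(
        model, dummy_input, filename=filename, opset_version=opset_version)
```

In the example script this would replace the plain export call, e.g. `export_for_nnvm(model, x, 'vgg16.onnx')`, where `x` is the dummy input array already passed to the exporter.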

However, ResNet50 still fails with a different error:

Traceback (most recent call last):
  File "export.py", line 77, in <module>
    main()
  File "export.py", line 73, in main
    save_as_onnx_then_import_from_nnvm(model, 'resnet50.onnx')
  File "export.py", line 50, in save_as_onnx_then_import_from_nnvm
    module.set_input(**params)
  File "/root/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/contrib/graph_runtime.py", line 139, in set_input
    self._get_input(k).copyfrom(params[k])
  File "/root/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/ndarray.py", line 212, in copyfrom
    source_array.copyto(self)
  File "/root/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/ndarray.py", line 279, in copyto
    self.handle, target.handle, None))
  File "/root/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/base.py", line 72, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: [08:54:24] /usr/tvm/src/runtime/ndarray.cc:150: Check failed: from_size == to_size (4 vs. 8192) TVMArrayCopyFromTo: The size must exactly match
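The sizes in the last line (4 vs. 8192) can be read as byte counts. The arithmetic below is only a sketch of one possible interpretation, assuming float32 data (4 bytes per element): a scalar parameter being copied into an input that expects 2048 elements, which happens to match the channel count of ResNet-50's final feature map. The helper `nbytes` is hypothetical and not a TVM API.

```python
from math import prod


def nbytes(shape, itemsize=4):
    """Size in bytes of a dense array with `shape`; itemsize=4 assumes float32."""
    return prod(shape) * itemsize


scalar_bytes = nbytes(())        # 4: a single float32 value
vector_bytes = nbytes((2048,))   # 8192: 2048 float32 values
```

If this reading is right, the mismatch would be consistent with the scalar operand of the broadcasting Mul mentioned earlier in the thread, but the traceback alone does not confirm which parameter is involved.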