Closed a550461053 closed 4 years ago
Hi @a550461053 , MobulaOP uses the MXNet Python API to create operators (https://github.com/wkcn/MobulaOP/blob/master/mobula/glue/mx.py#L133) and uses TVMBridge to register asynchronous functions.
- TVM PackedFunc: It is simple to use TVMBridge to register asynchronous functions. To address the issue of ABI compatibility, I moved tvm_bridge.h into the 3rdparty directory and call the API MXEnginePushSyncND. This method doesn't need to rebuild MXNet.
- NNVM API: Registering operators with NNVM requires rebuilding MXNet.
- Performance: The overheads of MobulaOP are in the Python code and in the implementation of mx.sym.CustomOp, which uses multiple threads to execute each registered op. Since MobulaOP enables asynchronous computation, the time spent in Python code is hidden by the computation time.

There is a better approach to register a custom op now: https://github.com/apache/incubator-mxnet/tree/master/example/extensions/lib_custom_op
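The point that asynchronous dispatch hides the Python-side overhead can be illustrated with a toy model (pure Python, not MXNet's actual engine or MobulaOP's real code): the caller enqueues work and returns immediately, so Python bookkeeping overlaps with the kernel running on a worker thread.

```python
# Toy model (not MXNet's real engine) of asynchronous op dispatch:
# the Python thread pushes work and returns immediately, so Python-side
# overhead overlaps with the computation running on the worker thread.
import queue
import threading

class ToyEngine:
    def __init__(self):
        self._q = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            fn, args, done = self._q.get()
            fn(*args)          # the "kernel" runs here, off the caller's thread
            done.set()

    def push_async(self, fn, *args):
        # analogous in spirit to MXEnginePushSyncND: enqueue and return at once
        done = threading.Event()
        self._q.put((fn, args, done))
        return done            # caller blocks only when it needs the result

engine = ToyEngine()
out = []
done = engine.push_async(lambda x: out.append(x * x), 7)
# ... Python-side work (shape checks, op lookup) could happen here ...
done.wait()                    # synchronize only when the result is needed
print(out)  # [49]
```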
Thank you very much! I updated my MXNet to 1.6.0b20191102 and tried the approach (https://github.com/apache/incubator-mxnet/tree/master/example/extensions/lib_custom_op), but it returns the error:
MXNet version 10500 supported
--------start ndarray compute---------
Traceback (most recent call last):
File "test_gemm.py", line 41, in <module>
print(mx.nd.my_gemm(a,b))
AttributeError: module 'mxnet.ndarray' has no attribute 'my_gemm'
I also see objdump: /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so: File format not recognized. I will file an issue with MXNet.
But when I use MobulaOP, it works fine. Can you explain why this happens, and what's different between MobulaOP and MXNet's new approach of registering a custom op without rebuilding MXNet?
- MXNet's new approach of registering Custom Op: The new approach was introduced on Dec. 6, 2019, so MXNet 1.6.0b20191102 does not support it. It works for MXNet versions >= 1.6.0b20191207.
- objdump error: Sorry, I couldn't reproduce the issue.
- What's different between MobulaOP and MXNet's new approach of registering custom op: MobulaOP was written before MXNet's new approach. MobulaOP uses the MXNet API MXEnginePushSyncND and the Python API mx.sym.CustomOp to register operators. The overhead is in the Python code of MobulaOP and MXNet Python CustomOp; in addition, MXNet Python CustomOp uses multiple threads to execute each op. The benefit is that it is easier to write code and to call other MXNet built-in ops, thanks to mx.sym.CustomOp. MXNet's new approach provides a C API to register custom operators, and the whole procedure is written in C++. It is suitable for deployment and faster than MobulaOP.

These two approaches don't need to rebuild MXNet, and there is no ABI compatibility problem.
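The version requirement above can be checked mechanically. A small sketch, assuming the nightly tag format 1.6.0bYYYYMMDD (the build date embedded after the 'b'):

```python
# Nightly MXNet builds encode the build date after the 'b': 1.6.0bYYYYMMDD.
# Comparing those dates shows why 1.6.0b20191102 predates the lib_custom_op
# support that first shipped in 1.6.0b20191207.
def nightly_build_date(version):
    """Extract the YYYYMMDD build date from a tag like '1.6.0b20191102'."""
    return int(version.rsplit('b', 1)[1])

installed = '1.6.0b20191102'
required = '1.6.0b20191207'
print(nightly_build_date(installed) >= nightly_build_date(required))  # False
```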
Thanks. So MobulaOP uses C++ to define all kernel functions, the Python code's overhead is negligible thanks to MXEnginePushSyncND, and the only overhead is the MXNet Python CustomOp. Is that right?
Yes. The overhead includes MXNet Python CustomOp, plus some preprocessing in MobulaOP (e.g. checking the input types and finding the custom function).
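As a rough illustration of that preprocessing cost, here is a toy dispatcher (hypothetical, not MobulaOP's actual code): each call pays for a type check and a registry lookup in Python before the kernel itself runs.

```python
# Toy sketch (not MobulaOP's real implementation) of the per-call Python
# preprocessing mentioned above: validating input types and looking up the
# registered kernel before the real computation is dispatched.
_registry = {}

def register(name):
    def deco(fn):
        _registry[name] = fn
        return fn
    return deco

@register('my_gemm')
def my_gemm(a, b):
    # stand-in for the C++ kernel: naive matrix multiply on nested lists
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def invoke(name, *args):
    # the "overhead": type check + registry lookup, all in Python
    if name not in _registry:
        raise KeyError(name)
    if not all(isinstance(a, list) for a in args):
        raise TypeError('inputs must be lists')
    return _registry[name](*args)

print(invoke('my_gemm', [[1, 0], [0, 1]], [[2, 3], [4, 5]]))  # [[2, 3], [4, 5]]
```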
Thank you very much!
May I ask which implementation of creating operators is used in MobulaOP: