apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Implementing new operators in an external module #9547

Open analog-cbarber opened 6 years ago

analog-cbarber commented 6 years ago

Currently it is possible to implement new operators in C++/CUDA, but doing so requires you to have your own copy of the source tree and to build the operators into the standard mxnet library.

The only way to share such operators with others is either to share your fork of the MXNet repo or to contribute the operator back to the mainline MXNet project.

This is not scalable. It makes it very difficult to use operators from more than one fork, and it requires developers to build the entire source tree rather than just the source for their operators. It also requires developers of custom operators to manually merge from the parent MXNet project every time they want to pick up the latest features and bug fixes unrelated to their own operators.

Ideally, there should be a way to define new operators in a separate project that is built against header files and shared library from a released version of MXNet, and then be able to load them from various MXNet front-end languages without having to replace all of MXNet with a custom version.

This may already be possible if you are coding directly in C++, but if you are using Python the mxnet shared library is installed in the Python library directory and is loaded in local mode (dlopen without RTLD_GLOBAL), so other libraries cannot link against its symbols.

Instead, the mxnet shared libraries could be installed in a global location, independent of the front end being used, and loaded in global mode so that external modules could link against them. Each front-end language should provide a way to load an external module and verify that it is compatible with the currently installed copy of mxnet.
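A minimal sketch of the proposed loading scheme on the Python side, using ctypes (the library paths here are hypothetical):

```python
import ctypes

# Hypothetical install location: libmxnet.so lives in a global library
# directory rather than inside the Python site-packages tree.
LIBMXNET_PATH = "/usr/local/lib/libmxnet.so"

# RTLD_GLOBAL exposes libmxnet's symbols to libraries loaded later,
# so an external operator module can resolve them at load time.
libmxnet = ctypes.CDLL(LIBMXNET_PATH, mode=ctypes.RTLD_GLOBAL)

# A separately built operator library can now be loaded without
# bundling its own copy of MXNet.
libmyops = ctypes.CDLL("/usr/local/lib/libmyops.so")
```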

zhanghang1989 commented 6 years ago

Agree. A feature similar to PyTorch FFI is desired.
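(For reference: the old torch.utils.ffi interface has since been superseded by torch.utils.cpp_extension, which serves the same purpose. A minimal sketch, assuming a hypothetical my_ops.cpp that exports its functions via pybind11:)

```python
from torch.utils.cpp_extension import load

# JIT-compiles my_ops.cpp against the installed PyTorch headers and
# loads the resulting shared library into the running process, with
# no fork or rebuild of PyTorch itself.
my_ops = load(name="my_ops", sources=["my_ops.cpp"], verbose=True)
```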

analog-cbarber commented 6 years ago

TensorFlow supports this. See https://www.tensorflow.org/api_docs/python/tf/load_op_library
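A minimal usage sketch, based on the zero_out example from the TensorFlow docs (zero_out.so is compiled separately against the TensorFlow headers):

```python
import tensorflow as tf

# Load an operator library that was compiled outside the TensorFlow
# source tree, e.g. with:
#   g++ -shared zero_out.cc -o zero_out.so -fPIC ...
zero_out_module = tf.load_op_library("./zero_out.so")

# Ops defined in the library become callable Python functions.
result = zero_out_module.zero_out([[1, 2], [3, 4]])
```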

DuinoDu commented 6 years ago

https://github.com/DuinoDu/load_op.mxnet

wkcn commented 6 years ago

I'm writing a similar project named MobulaOP. It makes it possible to implement new operators in Python/C++/C/CUDA without rebuilding the source of the deep learning framework.

It aims to let you write the code for a new operator once and run it on different deep learning frameworks and different devices.

For example, I wrote a ROIAlign implementation once; the project generates the related CUDA code automatically, and the resulting ROIAlign operator supports CPU and GPU across MXNet, NumPy, PyTorch, etc.

eric-haibin-lin commented 6 years ago

Bump. Extensibility is very important.

chenkelmann commented 3 years ago

It looks like this has been implemented in 1.7: https://github.com/apache/incubator-mxnet/releases/tag/1.7.0. The release notes say:

"Adds support for extending MXNet with custom operators, partitioning strategies, and graph passes. All implemented in a library easily compiled separately from the MXNet codebase, and dynamically loaded at runtime into any prebuilt installation of MXNet."

So it looks like this issue could be closed. However, I could not find any tutorial on how to do such a build. This is a very thorough tutorial on how the actual code needs to look, but it seems to be missing information on how to do this with an external library: https://mxnet.apache.org/versions/1.8.0/api/faq/add_op_in_backend

Any idea where I can find an example for an external custom operator?

chenkelmann commented 3 years ago

To answer my own question: there is an example here: https://github.com/apache/incubator-mxnet/tree/master/example/extensions/lib_custom_op
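Building on that example, a minimal loading sketch (the library name and the my_gemm operator are taken from the lib_custom_op example; substitute whatever your own build produces):

```python
import mxnet as mx
import numpy as np

# Load the externally compiled operator library into the running
# MXNet process; no custom MXNet build is required.
mx.library.load("./libcustomop_lib.so")

# Once loaded, the custom operator is registered like any built-in op.
a = mx.nd.array(np.random.rand(2, 3))
b = mx.nd.array(np.random.rand(3, 2))
c = mx.nd.my_gemm(a, b)
print(c)
```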