Closed wangkuiyi closed 7 years ago
boost.python is simpler than SWIG, but it seems to support only Python, while SWIG supports wrappers for more languages.
For sure, considering time and workload, we may support only Python for a long time, but some trade-offs need to be considered:
With boost.python, it is hard to support multiple language wrappers.
boost.python is a more native wrapper for Python than SWIG, so if we use boost.python, it is easier to make a Python-first system, in other words, better Python APIs (something like defining customized ops/layers in Python?) than TF/caffe2.
In short, boost.python is for creating better Python APIs like PyTorch, while SWIG is for multiple language wrappers.
Boost.Python. This is a set of C++ templates that simplifies manual wrapping. We no longer need to write a C wrapper function for each C++ function. Only a few extra lines in addition to the original C++ code are required to build a .so that can be called from Python.
Tensors and Ops on the pserver side to do remote parameter optimization.
So I think we can't avoid making a C wrapper in the end; once we have it, implementing a Python extension is simple.
What is supposed to be included in the Go API?
In my mind, the Go API differs significantly from the Python API. In particular, we don't need Go packages like paddle.op, paddle.layer, paddle.scope, paddle.variable. Is this correct? @typhoonzero
There is another library, https://github.com/pybind/pybind11, that works just like Boost.Python but is much more lightweight. If we consider Boost.Python, we can also have a look at this library.
What is supposed to be included in the Go API?
At least paddle.variable or paddle.tensor, I think. The current Go pserver and optimizer use an independent tensor implementation at paddle/optimizer/tensor.h; it would be better to use the new implementation so we don't have two "tensor"s in the code base.
Indeed, we don't need Go packages like paddle.op, paddle.layer, paddle.scope, paddle.variable. Making the parameter server an "op" as TensorFlow does isn't what we intend.
Consider that we just write operators in C++ and generate them in Python. We cannot find a proper language binding generator library or generation technique. Maybe it is too hard to generate ops for Go at present.
I think we can just build a core system in C++ with a strong binding to Python. Other languages invoke functions through the C-API binding.
Agree with @jacquesqiao. According to pybind11's doc, pybind11 is similar to Boost.Python, but with a lean header-only implementation for C++11-capable compilers.
It may be a better choice if we only consider the Python/C++ binding.
ctypes is not a good choice. Not only do we need to declare every function again on the Python side, it also makes the Python binding tedious to maintain and upgrade. For example, mxnet chose ctypes at the very beginning, but nowadays they maintain a separate implementation in cython.
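To make the "write every function again" point concrete, here is a minimal sketch that calls `sin` from the system C math library through ctypes. It is not Paddle code; it assumes a POSIX system where `ctypes.util.find_library("m")` can locate the math library (or where the symbol is already visible in the running process). The point is that the C signature `double sin(double)` has to be restated by hand in Python.

```python
import ctypes
import ctypes.util
import math

# Locate and load the C math library. Assumption: a POSIX system; if the
# lookup returns None, CDLL(None) falls back to symbols already loaded in
# the current process.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# ctypes treats arguments and return values as C int unless told otherwise,
# so the signature `double sin(double)` must be declared again here.
libm.sin.argtypes = [ctypes.c_double]
libm.sin.restype = ctypes.c_double

result = libm.sin(1.0)

# Compare against Python's own math.sin as a sanity check.
print(result, math.sin(1.0))
```

Every exported function needs its own `argtypes`/`restype` pair like this, and those declarations must be kept in sync with the C header manually, which is the maintenance cost described above.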
Whether to use a C-API depends on whether any other languages need to invoke the Paddle C++ core.
I am not sure whether a Python API alone is enough. At least there are several needs for us to provide a C-API.
Also, I think pybind11 is better than boost::Python because Paddle is written in C++11.
But if we have a C-API for Paddle, wrapping that C-API for Python is extremely easy with Cython.
    cdef extern from "math.h":
        double sin(double x)
Also, pybind11 and boost::Python have a major defect: they require the Python version used at compile time and the one used at run time to be EXACTLY the same. This means that if Paddle is compiled with Python 2.7.2 but run with Python 2.7.3, an error will be raised.
See video here
@wangkuiyi and all,
I wrote two demos, one using pybind11 and the other using Cython + C-API. They are:
The conclusion is:
If we don't want a C-API, PyBind11 is simpler. Personally, I think PyBind11 has a great interface design. I barely had to look at its documentation to develop this demo. But one thing needs care: the ownership of C++ objects.
If we have a C-API, using Cython is extremely easy. Just include the header and write some interface code. Cython has a great interface, too (see here). However, developing a C-API is tedious.
Fixed by #2793
I read and followed this article, http://intermediate-and-advanced-software-carpentry.readthedocs.io/en/latest/c++-wrapping.html, which compares the following interfacing technologies:
manual wrapping. I followed the official Python document https://docs.python.org/2/extending/extending.html for more details and some example programs. It involves some complex boilerplate code: parsing arguments in each C function, building and returning a Python object in each C function, and the method list.
SWIG. It seems to be a general method that can generate bindings for various client languages, but it is not "native" enough. Also, it takes some time to learn the interface language (*.i files).
ctypes. This requires us to specify the return type and other metadata of each C function again on the Python side.
SIP. This is the Qt community's version of SWIG. We also need to learn an interfacing language.
Boost.Python. This is a set of C++ templates that simplifies manual wrapping. We no longer need to write a C wrapper function for each C++ function. Only a few extra lines in addition to the original C++ code are required to build a .so that can be called from Python.
I personally prefer Boost.Python. Here is an example for your reference:
Suppose that we already have C++ functions like:
only the following few lines are required to build the Python-callable .so file: