Open VoVAllen opened 5 years ago
I would say the major pro of pybind11 is composability with any package that uses pybind11, including custom-developed packages. This essentially allows users to develop a C routine and directly "plug in" to DGL without compiling the full DGL source (like PyTorch-scatter plugging into PyTorch).
I think this is particularly important for custom neighborhood sampler implementations, since it would make it easy to integrate third-party C neighborhood samplers.
It should be good for any kind of third-party C sampler (neighborhood, negative, etc.).
Changing the FFI requires quite a bit of effort, so if it does not block any usability feature, I would turn it down at the moment. Could anyone describe the user experience of writing a custom C routine if DGL were to use pybind11? E.g., what are the dependencies? How does it plug into DGL?
pybind11 has a registration API similar to the current one:
#include <pybind11/pybind11.h>

int add(int i, int j) {
    return i + j;
}

PYBIND11_MODULE(example, m) {
    m.doc() = "pybind11 example plugin"; // optional module docstring
    m.def("add", &add, "A function which adds two numbers");
}
and can be called from Python like this:
import example
example.add(1, 2)
It's a header-only library and can easily be added to the current CMake build. One main reason for this, as mentioned by @BarclayII, is that we hope to enable users to write their own C++ sampling algorithms when needed, without compiling all of the DGL code, as PyTorch does for C++ custom ops.
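To make the idea concrete, here is a minimal sketch of what a third-party sampler could look like as its own pybind11 extension. The module name `my_sampler`, the function `sample_neighbors`, the CSR argument layout, and the `MY_SAMPLER_WITH_PYBIND11` build flag are all hypothetical and not part of DGL. The sketch only shows the extension side; how such a module would exchange graph data with DGL is the open question in this thread.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical kernel: pick up to `fanout` neighbors of `node` from a CSR
// graph (indptr/indices), without replacement. Pure C++, no Python needed.
// Assumes indptr/indices form a valid CSR structure.
std::vector<std::int64_t> sample_neighbors(
    const std::vector<std::int64_t>& indptr,
    const std::vector<std::int64_t>& indices,
    std::int64_t node, std::int64_t fanout, std::uint64_t seed) {
  if (node < 0 || node + 1 >= static_cast<std::int64_t>(indptr.size()))
    throw std::out_of_range("node id out of range");
  if (fanout < 0) throw std::invalid_argument("fanout must be non-negative");

  // Copy this node's neighbor slice out of the CSR arrays.
  std::vector<std::int64_t> nbrs(indices.begin() + indptr[node],
                                 indices.begin() + indptr[node + 1]);
  if (static_cast<std::int64_t>(nbrs.size()) <= fanout) return nbrs;

  // Shuffle, then keep the first `fanout` entries.
  std::mt19937_64 rng(seed);
  std::shuffle(nbrs.begin(), nbrs.end(), rng);
  nbrs.resize(static_cast<std::size_t>(fanout));
  return nbrs;
}

// The binding layer is compiled only when the (hypothetical) build flag is
// set, so the kernel above can be built and tested without Python headers.
#ifdef MY_SAMPLER_WITH_PYBIND11
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>  // std::vector <-> Python list conversion (copies)

namespace py = pybind11;

PYBIND11_MODULE(my_sampler, m) {
  m.doc() = "hypothetical third-party neighbor sampler";
  m.def("sample_neighbors", &sample_neighbors,
        py::arg("indptr"), py::arg("indices"), py::arg("node"),
        py::arg("fanout"), py::arg("seed") = 0);
}
#endif
```

Once built as an extension module, this would be called from Python as `my_sampler.sample_neighbors(indptr, indices, 0, 2)`. Note that `pybind11/stl.h` copies the lists on every call, so a real sampler would more likely accept array buffers instead.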
I think our code base is small enough that we can afford this transition. PyTorch and TensorFlow had much bigger code bases and they still decided to do so. One thing I think needs investigation is how to carry out the transition gradually, which would make it easier to debug and to ensure correctness.
"is that we hope to enable users to write their own C++ sampling algorithms when needed, without compiling all of the DGL code, as PyTorch does for C++ custom ops."
Could you give a step-by-step example? For example, does it only require a DGL header library? Does it require linking against the DGL library during compilation? If DGL is installed by conda/pip, how does it work?
I guess a more appropriate example worth inspecting would be how, and whether it is worthwhile (in terms of performance overhead, maybe), to replace a current TVM-style binding (e.g. _CAPI_DGLGraphHasEdgesBetween) with pybind11.
I found this non-trivial at first glance, as it at least requires thinking about (1) how to expose Graph objects and (2) how to deal with NDArray objects.
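As a rough illustration of those two points, the sketch below exposes a class with `py::class_` and passes array data through `py::array_t`. `ToyGraph`, the module name `toy_graph`, and the `TOY_GRAPH_WITH_PYBIND11` build flag are hypothetical stand-ins. This is not DGL's Graph class, and NumPy arrays are used here only as an assumption; it does not settle how DGL's own NDArray type would be handled.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a graph type; this is NOT DGL's Graph class.
class ToyGraph {
 public:
  ToyGraph(std::int64_t num_nodes,
           const std::vector<std::pair<std::int64_t, std::int64_t>>& edges) {
    if (num_nodes < 0)
      throw std::invalid_argument("num_nodes must be non-negative");
    adj_.resize(static_cast<std::size_t>(num_nodes));
    for (const auto& e : edges) {
      if (!IsValid(e.first) || !IsValid(e.second))
        throw std::out_of_range("edge endpoint out of range");
      adj_[static_cast<std::size_t>(e.first)].push_back(e.second);
    }
  }

  // True if a directed edge src -> dst exists; false for invalid node ids.
  bool HasEdgeBetween(std::int64_t src, std::int64_t dst) const {
    if (!IsValid(src) || !IsValid(dst)) return false;
    const auto& out = adj_[static_cast<std::size_t>(src)];
    return std::find(out.begin(), out.end(), dst) != out.end();
  }

 private:
  bool IsValid(std::int64_t v) const {
    return v >= 0 && v < static_cast<std::int64_t>(adj_.size());
  }
  std::vector<std::vector<std::int64_t>> adj_;  // out-neighbors per node
};

// Binding layer, compiled only when the (hypothetical) build flag is set.
#ifdef TOY_GRAPH_WITH_PYBIND11
#include <pybind11/numpy.h>
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

namespace py = pybind11;

PYBIND11_MODULE(toy_graph, m) {
  py::class_<ToyGraph>(m, "ToyGraph")
      .def(py::init<std::int64_t,
                    const std::vector<std::pair<std::int64_t, std::int64_t>>&>())
      .def("has_edge_between", &ToyGraph::HasEdgeBetween)
      // Batched query over two 1-D integer arrays of equal length.
      .def("has_edges_between",
           [](const ToyGraph& g, py::array_t<std::int64_t> src,
              py::array_t<std::int64_t> dst) {
             auto s = src.unchecked<1>();  // throws if not 1-D
             auto d = dst.unchecked<1>();
             if (s.shape(0) != d.shape(0))
               throw std::invalid_argument("src and dst lengths differ");
             py::array_t<bool> out(s.shape(0));
             auto o = out.mutable_unchecked<1>();
             for (py::ssize_t i = 0; i < s.shape(0); ++i)
               o(i) = g.HasEdgeBetween(s(i), d(i));
             return out;
           });
}
#endif
```

The class binding itself is short; the harder part raised above is the array side, since a real replacement would need a conversion between pybind11's array types and the NDArray objects the existing C API uses.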
Proposal
pybind11 is a lightweight header-only library that exposes C++ types in Python and vice versa, mainly to create Python bindings of existing C++ code.
Many projects now use this, including:
Pros of pybind11 over current ffi:
py::list has the same API as a Python list, which is more powerful than the current List
Cons of pybind11 over current ffi:
Generally speaking, pybind11 is easier to use and could possibly bring us better performance, both in how it deals with the GIL and by bypassing the multiple Python calls that the current implementation makes.
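The proposal does not spell out the GIL issue, so the following is only one possible reading. pybind11 lets a binding release the GIL for the duration of a C++ call via `py::call_guard<py::gil_scoped_release>()`, which allows other Python threads to run while a long routine executes. The routine `sum_of_squares`, the module name `gil_demo`, and the `GIL_DEMO_WITH_PYBIND11` build flag are hypothetical, and no performance comparison against the current FFI is implied.

```cpp
#include <cstdint>

// Hypothetical CPU-bound routine. It never touches Python objects, which is
// what makes it safe to run with the GIL released.
std::uint64_t sum_of_squares(std::uint64_t n) {
  std::uint64_t total = 0;
  for (std::uint64_t i = 1; i <= n; ++i) total += i * i;
  return total;
}

// Binding layer, compiled only when the (hypothetical) build flag is set.
#ifdef GIL_DEMO_WITH_PYBIND11
#include <pybind11/pybind11.h>

namespace py = pybind11;

PYBIND11_MODULE(gil_demo, m) {
  // call_guard releases the GIL before the call and reacquires it afterwards.
  m.def("sum_of_squares", &sum_of_squares,
        py::call_guard<py::gil_scoped_release>(),
        "Sum of i*i for i in 1..n, computed with the GIL released");
}
#endif
```

The guard is only appropriate for functions that do not use Python objects while running; any code that does must hold the GIL.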