Open hongchaodeng opened 4 months ago
The pybind11 overhead is not negligible in Ray. How about considering https://github.com/wjakob/nanobind, which comes from the author of pybind11 and claims to deliver 10x less overhead? That way we benefit from an almost identical syntax with only minor overhead. @rynewang @hongchaodeng
Some background https://nanobind.readthedocs.io/en/latest/why.html#why-another-binding-library
cc @dentiny
From a quick glance, this nanobind is similar to pybind11 in architecture, just with some optimizations?
Exactly. It's the same author; nanobind dropped some historical technical debt, so it is more performant. The author also suggests using nanobind instead of pybind11 unless pybind11 is absolutely needed.
Can I understand it this way: the biggest gain is leveraging modern C++, since Cython targets C while pybind11 targets C++?
Some bugs will only get caught in the second compilation pass, after Cython has generated thousands of lines of hard-to-decipher code.
On this point, I'm curious how pybind11 helps here. When wrapping C++ code via pybind11, we can rely on a single compilation pass to check it, but the errors are hard to decipher as well, since pybind11 relies heavily on templates. When wrapping (or, say, loading) Python objects via pybind11, everything is checked at runtime.
I have had some bad experiences with Python/C++ FFI overall; for example, gdb and core dumps show the CPython internal call stack, and so do flame graphs.
Just curious, do you think it's possible to leverage the localhost network instead of FFI? I definitely understand that a network call is one order of magnitude slower than FFI, since it involves more copies, but
pybind11::bytearray-like data structure, which takes a copy: https://github.com/pybind/pybind11/blob/f46f5be4fa4d24c4e5382d0251315f361ce97424/include/pybind11/pytypes.h#L1766-L1791
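To make the localhost-versus-in-process question concrete, here is a rough sketch in plain Python. It is not Ray code and it does not measure pybind11 or Cython. `add_one`, `recv_exact`, `serve`, and `compare` are hypothetical names, and the in-process call is only a stand-in for an FFI call. It sends the same integer requests through a direct function call and through a TCP connection on 127.0.0.1, so the two elapsed times can be compared on your own machine.

```python
import socket
import struct
import threading
import time


def add_one(x: int) -> int:
    """Hypothetical stand-in for a function normally reached through FFI."""
    return x + 1


def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def serve(server_sock: socket.socket, n_requests: int) -> None:
    """Accept one client and answer n_requests 8-byte integer requests."""
    conn, _ = server_sock.accept()
    with conn:
        for _ in range(n_requests):
            (x,) = struct.unpack("!q", recv_exact(conn, 8))
            conn.sendall(struct.pack("!q", add_one(x)))


def compare(n_requests: int = 2000):
    """Return (direct results, network results, direct seconds, network seconds)."""
    # Path 1: plain in-process calls.
    start = time.perf_counter()
    direct = [add_one(i) for i in range(n_requests)]
    direct_s = time.perf_counter() - start

    # Path 2: the same requests over a localhost TCP connection.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server_sock.listen(1)
    worker = threading.Thread(target=serve, args=(server_sock, n_requests), daemon=True)
    worker.start()

    remote = []
    with socket.create_connection(server_sock.getsockname()) as client:
        # Disable Nagle's algorithm so small requests are sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for i in range(n_requests):
            client.sendall(struct.pack("!q", i))
            (y,) = struct.unpack("!q", recv_exact(client, 8))
            remote.append(y)
        network_s = time.perf_counter() - start

    worker.join()
    server_sock.close()
    return direct, remote, direct_s, network_s


if __name__ == "__main__":
    direct, remote, direct_s, network_s = compare()
    assert direct == remote
    print(f"in-process: {direct_s:.6f}s, localhost TCP: {network_s:.6f}s")
```

The absolute numbers depend on the machine and say nothing about real binding overhead; the sketch only shows the per-request serialization and socket round trip that a localhost design would add to every call.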
Description
Problem
Currently we use Cython as the glue layer between Python and C++ (core worker) code. This has several problems.
Cython is better suited to creating simple wrappers around C code. In the current architecture, however, the code is complex and some use cases go beyond what Cython was designed for.
Here are some pain points:
Proposal
The proposal is to use pybind11 to replace the Cython bindings. It has the following benefits:
We can do this incrementally.
Use case
No response