boostorg / python

Boost.org python module
http://boostorg.github.io/python
Boost Software License 1.0
468 stars 201 forks source link

WIP: GSoC21 - python eventloop using Boost.Asio strand #365

Open philoinovsky opened 3 years ago

philoinovsky commented 3 years ago

Proposal: https://github.com/philoinovsky/GSoC21/blob/main/5126362551025664_1617947181_Pan_Yue_GSoC21_Proposal.pdf

stefanseefeld commented 3 years ago

Do you have some document where you describe the design behind this change ? That would help me review the code.

philoinovsky commented 3 years ago

@stefanseefeld Sure, this is my gsoc proposal! https://github.com/philoinovsky/GSoC21/blob/main/5126362551025664_1617947181_Pan_Yue_GSoC21_Proposal.pdf

stefanseefeld commented 3 years ago

Thanks, that document provides a high-level overview of what the goal of your project is. What I'm looking for is something that summarizes the goal of this PR. I realize your PR is marked as Work In Progress, so it might in fact not be meant to be merged, or even reviewed. (You comment out a bunch of lines, which of course isn't acceptable for a real PR.) Just trying to understand the intent of this submission.

vinipsmaker commented 3 years ago

Hi @stefanseefeld,

could you enable CI tests on PRs so Pan can use the Boost infra to test this work?

This is a WIP and not yet ready to merge, but CI could help on the development. Hence the PR.

I'll be mentoring Pan and reviewing everything that pertains to Boost.Asio usage so you can assume this part will be covered by the time the WIP title/label is removed from the PR. Also we'll be pinging you periodically when certain design decisions show up as to make sure little friction appears when the PR is ready for merge. In the meantime there's little reason to invest time studying this early work (unless you want to).

stefanseefeld commented 3 years ago

Hi Vinicius,

On 2021-06-12 10:31 a.m., Vinícius dos Santos Oliveira wrote:

Hi @stefanseefeld https://github.com/stefanseefeld,

could you enable CI tests on PRs so Pan can use the Boost infra to test this work?

This is a WIP and not yet ready to merge, but CI could help on the development. Hence the PR.

I'll be mentoring Pan and reviewing everything that pertains to Boost.Asio usage so you can assume this part will be covered by the time the WIP title is removed from the PR. Also we'll be pinging you periodically when certain design decisions show up as to make sure little friction appears when the PR is ready for merge. In the meantime there's little reason to invest time studying this early work (unless you want to).

I have to admit that I'm a bit surprised that you are working on a project that you expect ultimately to be merged into the Boost.Python repo without involving me right from the start of the project. Don't you think this would have been appropriate ?

Anyhow, yes, I appreciate you including me in design review discussions.

Meanwhile, I have approved workflow / CI checks, so you should see the first test results soon.

Stefan --

   ...ich hab' noch einen Koffer in Berlin...
vinipsmaker commented 3 years ago

Don't you think this would have been appropriate ?

Yes. My bad.

without involving me right from the start of the project

So, what we're looking at here for the end goal is a simple way to enable Boost.Asio-based programs to reuse Python 3's asyncio-based libs on the same event loop. Python has a huge ecosystem (e.g. web crawlers) and it can help to bootstrap new C++ projects.

Asyncio is only a Python 3 thing, so Python 2 is out of the question. Also the scenario is a custom C++ program acting as the Python host so you have all the consequences that follow from that (e.g. do not generate an external C/C++ lib to be imported in a Python program). I have already looked at asyncio's event loop and I have a good idea on how it can be integrated on Boost.Asio. I'm sure Pan is a capable programmer for this task.

And another thing is the ability to use multiple Python interpreters in the same C++ host in the same vein as subinterpreters for Python. This feature would be a perfect fit for Boost.Asio's strands. It's not a magical solution and it won't work against every Python library, but that's expected. If the user wants to use multiple Python VMs in his program, that's on him (he'll just have to exclude dangerous Python libs from the pool of choices).

Again, the end goal is just an easy on/off switch that we can enable to magically setup a Python VM to use the implementation of our own asyncio implementation that defers all work to Boost.Asio. So ideally there shouldn't be many API additions. Ideally it should be just one function call that we expose in Boost.Python (and it can only be used against Python 3 obviously).

stefanseefeld commented 3 years ago

Thanks for the intro. Yes, I know Python's asyncio module, and in fact I'm using it heavily in my own faber tool (the one you are using to build this project right now :-) ) I understand the appeal to use Boost.Asio as backend for Python's async / await logic. In fact, at some point it may even be possible to use this as backend for C++'s own runtime support for C++20 coroutines.

As far as Boost.Python integration is concerned, it seems to me this should be a separate library, as it's orthogonal to language bindings. If I understand correctly, the new functionality you are proposing can be used as drop-in replacement for Python's own event loop (independent of whether C++ extension modules are being used), and likewise, normal Boost.Python users don't have to know what event loop implementation the Python runtime is using to power async / await support. Am I understanding correctly ?

vinipsmaker commented 3 years ago

As far as Boost.Python integration is concerned, it seems to me this should be a separate library, as it's orthogonal to language bindings. If I understand correctly, the new functionality you are proposing can be used as drop-in replacement for Python's own event loop (independent of whether C++ extension modules are being used), and likewise, normal Boost.Python users don't have to know what event loop implementation the Python runtime is using to power async / await support. Am I understanding correctly ?

Both approaches are valid, but it's ultimately Boost.Python's maintainer that has a say whether the new module should be a subdirectory under Boost.Python or a separate library entirely.

What's your say here? I'll trust your judgment no matter the answer.

In case you think a separate library would be more appropriate, would you be willing to act as a review manager when this library reaches the stage where it can be considered mature? I think it'll help tons judging by my experience when I last gauged interest on a new library that made use of pre-existing Boost libraries.

Mailing list members may not even look at the code if it has the same domain as a pre-existing library already in Boost. And yes, they'll just suggest to look for integration on the pre-existing library as they trust Boost maintainers that endured the review process already.

If I understand correctly, the new functionality you are proposing can be used as drop-in replacement for Python's own event loop

That's correct.

It has one small restriction, but that's minor and expected: If Python is a guest on a foreign event loop, then Python code shouldn't call run_*(). This function will throw an exception if called from Python as it's really the C++ host that will be calling the implementation for this function.

However this restriction shouldn't affect any library as libraries don't call run_*().

(independent of whether C++ extension modules are being used)

Correct.

Therefore I at least tend to agree that it's orthogonal to Boost.Python.

and likewise, normal Boost.Python users don't have to know what event loop implementation the Python runtime is using to power async / await support

Also correct.

stefanseefeld commented 3 years ago

On 2021-06-12 6:12 p.m., Vinícius dos Santos Oliveira wrote:

Both approaches are valid, but it's ultimately Boost.Python's maintainer that has a say whether the new module should be a subdirectory under Boost.Python or a separate library entirely.

What's your say here? I'll trust your judgment no matter the answer.

For avoidance of doubt: by "separate library" I didn't mean "separate project". Boost.Python right now already contains two libraries: boost_python and boost_numpy. Users who don't need NumPy support only need to use the former.

Similarly, I can imagine the new event loop backend to be compiled into a distinct library (still part of Boost.Python !), so users may only link to the libraries they actually need.

I can still imagine good reasons to keep the new functionality in Boost.Python, for example if some C++ extension modules make direct calls into the new API. We'll see whether that makes sense as the project matures and we gather experience prototyping with it.

vinipsmaker commented 3 years ago

Similarly, I can imagine the new event loop backend to be compiled into a distinct library (still part of Boost.Python !), so users may only link to the libraries they actually need.

Okay, let's roll with that for now then.

I'm not sure there'll be a library to link to though. Boost.Asio's flexibility really demands a header-only library for most of the time. At least strand will be a concept that can have multiple implementations (sometimes user-provided).

vinipsmaker commented 3 years ago

@stefanseefeld would it be okay to have a requirement for C++14 or later on this sublibrary? The integration code with Boost.Asio is already becoming too spaghetty for my taste and proper lambda expressions + move semantics would help tons on the implementation. Other Boost.Asio-based libraries usually have higher requirements anyway (e.g. Boost.Beast requires C++11), so we won't be really removing a significant audience here.

stefanseefeld commented 3 years ago

@vinipsmaker yes, that's definitely fine. (One more reason to have this new functionality in its own library, as that makes it easier to constrain a new library than an existing one !)

vinipsmaker commented 3 years ago

@stefanseefeld, we've been using Boost.Python's own abstractions to the C API where possible so far. However we'll be migrating to Py_NewInterpreter() calls soon (each subinterpreter will be protected by its own Boost.Asio strand) and I have a hunch that we'll be forced to migrate to the low-level C API.

So that's just a heads-up. We might migrate to the low-level C API internally in future interactions of the code.

stefanseefeld commented 2 years ago

hi @vinipsmaker , I hear the GSoC 2021 projects have completed successfully. What is the state of this PR ? Are you planning to prepare it for review / merge ?

vinipsmaker commented 2 years ago

hi @stefanseefeld

What is the state of this PR?

Well, most of the GSoC was spent on research, so it was successful. As for the implementation, it's not ready.

First of all, Python subinterpreters still share the same GIL, so the most exciting idea had to be abandoned. Supposedly this issue would be fixed by Python 3.10, but only time will tell. The implementation just abandoned the idea of subinterpreters and gone back to plain single global interpreter.

There are still concurrency issues to fix, and lots of features to add before the PR is useful. I wanted to keep mentoring this work after GSoC ended, but I'm very busy and I won't have the time this year. I hope I can mentor this project again next year and then we'll see.

@philoinovsky also learned a lot during the project, so maybe you'll want to hear one word or two from him.