awestlake87 / pyo3-asyncio

Other
312 stars 48 forks source link

Initialization fixes #30

Closed awestlake87 closed 3 years ago

awestlake87 commented 3 years ago

Ongoing work for #29

These changes unfortunately add more nuance to the conversions. It's more important to be aware of whether a conversion is taking place on a Python asyncio-aware thread, an async Rust task, or a plain Rust thread.

I do think these changes make the library more correct, but I also think that it's important to help existing users transition. There was some uncertainty in the discussions for #29 about whether or not a compatibility layer was warranted, but after implementing some of these changes I'm even more inclined to say that it's necessary.

The compat layer should make it much easier for downstream developers to transition to the newer, more correct, API. Ideally upgrading to 0.14 should not break existing users, but still encourage them to start using the more correct API .

ShadowJonathan commented 3 years ago

(can you maybe add me as reviewer to this PR? thanks for the work already though! I just cant judge this with one glance, its a bit of work)

awestlake87 commented 3 years ago

Yeah, the changes are a bit complex and they're currently in-progress, so feel free to take your time or ask questions. I might push a few more changes later today as well, but it's nowhere near ready to merge yet.

I assigned you to the PR, but I have to admit I'm not entirely sure how to make you a reviewer. Do you need to be added as a maintainer of the project before I can add you as a reviewer?

Edit: It's looking like I can't add you as a reviewer without making you a collaborator on the project. I'm kinda hesitant to do so since there's not much granularity on the permissions for personal projects. I'm not entirely sure about the best practices for this either (I'm pretty new to public projects in general).

I believe you should be able to comment on specific lines and make changes in a fork. Is there anything else that needs reviewer permissions?

ShadowJonathan commented 3 years ago

It's looking like I can't add you as a reviewer without making you a collaborator on the project.

Weird, no need then, I just wanted to request it if possible as a todo item. No need if it turns out to be a bit difficult/uncomfortable.

Assigning works, thanks! I'll do an in-depth look at this later.

awestlake87 commented 3 years ago

Ok, I think I'm done making changes for today. From here on out it should mostly just be documentation work, bikeshedding names, figuring out the best way to go about deprecating 0.13, and any easy optimizations we can take advantage of.

I'm not super concerned about performance for this release since it's more of a bugfix, so I don't necessarily think we need to add the benchmarks just yet. I would imagine any optimizations we do will probably not have much effect on the public facing API, but let me know if you think I'm wrong about this.

I'll try to resume work on it sometime next week since I have a busy weekend planned. I should still have time to answer any questions on here or gitter if you have them.

Bengreen commented 3 years ago

I tried out this branch against a pytest based test with async support.

from setuptools_rust_starter import sleep_for, foo
import asyncio

async def test_python_sleep(loop):
    await asyncio.sleep(1)

async def test_rust_sleep_foo(loop):
    await foo()

async def test_rust_sleep(loop):
    await sleep_for(1)

After updating the rust syntax to match the new structs for building async libs this worked very well.

Would this be a useful test and example to incorporate into the library as I think using pytest would be a common proof that the library (rust based) is working.

ShadowJonathan commented 3 years ago

We should definitely have some tests that try to do "weird stuff", like the start-an-event-loop-on-a-different-branch as i mentioned earlier. Python has failure modes for those, and expectations of how those things work, maybe we should try to match them.

ShadowJonathan commented 3 years ago

I saw into_coroutine was going to be renamed to future_into_py, i'd recommend against this, as "coroutine" is a distinctly Python name, while future can apply to both runtimes. Keeping "future" a distinctive "rust" name and "coroutine" a distinctive "python" name would help with confusion, as into_future looks very name-similar to future_into_py at first glance, but they do entirely different things.

awestlake87 commented 3 years ago

Awesome @Bengreen, thanks for trying these changes out! Glad to hear they're working for you.

I think pytest is probably running asyncio.run for each test case (although I should double check that), so we should write up a test that does something similar to ensure we keep that functionality.

awestlake87 commented 3 years ago

We should definitely have some tests that try to do "weird stuff", like the start-an-event-loop-on-a-different-branch as i mentioned earlier. Python has failure modes for those, and expectations of how those things work, maybe we should try to match them.

I agree, we could use some more tests for these changes. Could you elaborate on this a bit? I couldn't find the comment you mentioned.

I saw into_coroutine was going to be renamed to future_into_py, i'd recommend against this, as "coroutine" is a distinctly Python name, while future can apply to both runtimes. Keeping "future" a distinctive "rust" name and "coroutine" a distinctive "python" name would help with confusion, as into_future looks very name-similar to future_into_py at first glance, but they do entirely different things.

I agree it's kinda weird, but the current naming doesn't reflect the impl that well which is why there was a long discussion on it in #4. That might help inform the reasoning for the naming changes. Let me know if you have any comments or disagreements about it on that thread. Nothing serious has been released yet that makes these names final (local_future_into_py has but I don't think many people are using it currently so changing it would not be the end of the world).

Here's my proposal;

  • Have every task-local reference to "their" event loop be an OnceCell<Arc<PyObject>> that holds a strong count to that object
  • Somewhere, there's a Vec of Weak<PyObject>s which keep track of seen event loops in the program, when all tasks exit, the weak reference cannot be brought up again.
    • Getting a value from this vec would be a three-state enum; None, One(Arc<PyObject>), Multiple(&[Arc<PyObject>]), forcing the caller to make a choice on how to handle the first and the last state.
  • Have a (Weak<PyObject>, PyObject) somewhere which holds a reference to the (Main OS thread? First?) Event loop, to (when the Weak becomes inaccessible) perform extra "health checks" on the event loop, to make sure it's still running, before "reviving" the weak reference (somehow?) and returning the main thread/first event loop to the caller.

These scenarios would handle asyncio weirdness eloquently, while it's most likely not needed for 95% of the uses, I think it's worth it to be exhaustive and correct for the 5% here.

I think the current impl does the first bullet point. I'm not sure I follow on the reasoning behind the next two. Are these potential performance optimizations or are they meant to address some other issue with the impl?

ShadowJonathan commented 3 years ago

I'm sorry for holding up these changes, I'll try to make time for this today.

awestlake87 commented 3 years ago

Sorry for the late response on this. I appreciate the review @ShadowJonathan! It's always good to get a second set of eyes on this. Feel free to open up that follow-up whenever!

I'm gonna bump the pyo3 version to 0.14 to help @sansyrox and make sure there are no issues, then start working toward marking off the documentation and testing items in the coming days. Hopefully I can get this wrapped up and released fairly soon!

sansyrox commented 3 years ago

Thank you @awestlake87 for this! :smile:

awestlake87 commented 3 years ago

Just a heads up @sansyrox, I just merged in #33 which may break your code if you're using tokio. #33 makes the tokio initialization lazy, so as long as you're using multithreaded tokio, the only thing you need to do is get rid of the call to pyo3_asyncio::tokio::init_multithread() and things should work fine.

ShadowJonathan commented 3 years ago

Quick glance through documentation looks good 👍

awestlake87 commented 3 years ago

Thanks! Unfortunately it's not done yet. That was just me integrating the existing pyo3 async/await guide in and making the necessary updates to it. I still need to add a migration guide and improve the recommendations for native modules / alternate event loops / caveats etc.

Not sure if I'll be able to get to all of that today, but after I write that up it'd be awesome to get a review on that too!

awestlake87 commented 3 years ago

Alright, I'm gonna take a break for now for some family stuff. All that's left to do is provide a small migration guide and open a PR on pyo3 to update the guide (which will probably just be most, if not all, of the primer). I'll try to wrap this up later tonight.

I think the Primer in the README should be a good enough justification for the 0.14 changes so the migration guide shouldn't take too long. I think we can probably get this all wrapped up and released by tomorrow (let me know if you need some more time to review it though @ShadowJonathan).

ShadowJonathan commented 3 years ago

I think we can probably get this all wrapped up and released by tomorrow (let me know if you need some more time to review it though @ShadowJonathan).

Just reviewed it, I think this is good enough for release, I wouldn't call this 1.0 ready (my brain demands that all edge-cases are solved first), but this is certainly far better than the assumptions made before this change, and thus a lot more resilient to more common "weird" use-cases.

awestlake87 commented 3 years ago

Thanks! I appreciate it.

I think this is a major improvement over what we had before, although I'm a bit unhappy that it's going to be such a disruptive change. Not much we can do about it though. It's best to rip the band-aid off early on IMO.

I wouldn't call this 1.0 ready (my brain demands that all edge-cases are solved first), but this is certainly far better than the assumptions made before this change, and thus a lot more resilient to more common "weird" use-cases.

For sure, I think there's still room for improvement. I'm not entirely confident we can cover those edge cases practically, but I'm open to ideas on that. I think it helps at least for now that we provide the *_with_loop variants for those special cases. It's at least possible for users to work around these issues if it comes up.

awestlake87 commented 3 years ago

Planning on doing the full doc review sometime tomorrow.

I think this PR is ready to merge, but I'm going to hold off for just a little while longer in case anyone else has any last-minute comments or changes they want to make.

awestlake87 commented 3 years ago

Alright, I've got to run an errand real quick, but I'll get back to this in a couple hours.

I think what's left to do is:

The README is pretty monolithic now, which is not great. I think the "Event Loop References and Thread-Awareness" section especially should go somewhere else (probably API docs).

@davidhewitt was talking about shortening the guide a bit and linking back to this repo for details, which I can get on board with. I like having the quickstart in the pyo3 guide to show examples for both applications and native extensions, but maybe some of the other details such as testing can stay in this README (or just link to the testing module the API docs).

awestlake87 commented 3 years ago

Alright, looks like things have calmed down on this so I think I'm going to go ahead and merge. I'll try my best to review the docs, check links etc and hopefully release 0.14 sometime this evening!