tomerfiliba / plumbum

Plumbum: Shell Combinators
https://plumbum.readthedocs.io
MIT License
2.81k stars 183 forks source link

Plumbum should have asyncio support #530

Open Callek opened 3 years ago

Callek commented 3 years ago

I feel that being able to use await/etc syntax with plumbum would be helpful in a lot of areas, especially for remote machine use. I may be able to work on implementing this in my work time but would love insight on if this would be accepted by the maintainer and any thoughts on how this should be implemented.

Callek commented 3 years ago

@henryiii - What do you think?

henryiii commented 3 years ago

I've thought about this before, but haven't had the time or enough familiarity with async (seems like I never get past making simple examples and demos, it hasn't quite been enough for ). Ideally, I assume this would be useful either as a base or a guide: https://docs.python.org/3/library/asyncio-subprocess.html.

I have a few long term plans (but not enough time to do them): I'd like to break off at least plumbum.cli and plumbum.colorlib into sub packages that are available separately, I'd like to see if paths can be refactored to use pathlib (with remote paths being a subclass), and somewhere in this, after 1.7, we could drop Python 2 support as long as python_requires is handled correctly (I'd be okay with 3.6+). I could see async, pathlib, typing, and refactoring of commands (I think we recreate things that might be able to use high level constructs in modern Python) as a nice 2.0 project.

Callek commented 3 years ago

If I'm reading this right you are considering the async support replacing the sync support, for a 2.0 after the 1.7 stuff is done?

henryiii commented 3 years ago

It would be in addition to it, not a replacement - in fact, it seems possible to detect if something is being called in await, so it might be possible to be pretty elegant in allowing both. I was referring more to the fact we could possibly keep the backends from diverging too badly by updating the sync backend to look like the async backend (assuming we can use the higher level abstractions now).

henryiii commented 3 years ago

If you want to draft a basic proposal and/or proof of concept PR, I'd be happy to review and discuss.

Callek commented 3 years ago

Skimming the code so far, it looks like some of the issues with implementing this is going to be almost a rewrite.

For example, we should ideally allow await Foo()... for any call that would utilize a network request, that said even ShellSession does a connection and a run from __init__ and its used directly from RemoteMachine's init.

Do you have any suggestions on what the base areas of code would be that I should work on updating or changing here?

Callek commented 3 years ago

@henryiii

So my basic proposal would be for async support to have init without side affects as part of the design. This would be a breaking change even for sync users...

Though we can move the side affects to many common uses, e.g. contextmanager could do the connection init does.

I would likely want to help with the earlier refactoring bits ahead of async, such as the plumbum.colorlib and plumbum.cli stuff you mentioned.

Would you like that in its own repo or side by side in this repo?

henryiii commented 3 years ago

How about we develop them in plumbum.experimental...? Then we don't even have to worry about them being importable in Python 2 and can do proper type hints and such. ;)

henryiii commented 3 years ago

I'm going to work on #531 and maybe merge #513, then release a minor update. Next we can start working on pulling out colorlib and cli, and that should go into 1.9. Then 2.0 could be the big update with Py 3+ required, etc. But stuff can live in .experimental until then?

Callek commented 3 years ago

@henryiii so I think before getting too deep on the new async support, it would benefit me to have some guidance on what API designs you consider essential/must have. Where you consider the async boundaries should be, and how we can implement it to cope.

For example I think a lot of plumbums use of foo['command'] type syntax is going to get thrown out, as is a lot of implicit connections that exist....

That said, I have started on a bit of a mockup, barely into it pending a chance to discuss the above with you, or find a good sync place/time to chat.

So far this is lacking a bunch:

henryiii commented 3 years ago

Most important for starting out is the API. What will it look like in use? Most of foo['command'] is lazy; nothing happens until you use & FG or () or something like that.

I'll try to push 1.7.0 soon.

henryiii commented 3 years ago

Personally, I like writing a bit of docs-like (or sometimes tests-like) material first, something like this:

from plumbum import cmd

async with cmd.ls:
    pass

I'd start with what would be absolutely ideal, and then it can be made a bit more practical as the implementation comes together. For example, in the above expression, maybe it turns out that we need an explicit method to return the async context manager for foreground/background, so we'd eventually need async with cmd.ls.afg() or something. Etc.

I'd pick a couple of simi-real world examples, one local and one remote.

Feel free to write nicer test files for new tests, we don't have to keep the old class structure; that was due to a conversion from unittest style testing. Simpler tests + fixtures are much nicer. :)

henryiii commented 2 years ago

FYI, Python 2 has been removed, and I'm working on cleaning up the code a bit. Static types are starting to show up, etc. :)

yajo commented 3 months ago

IMHO a better interface would be:

from plumbum import cmd
result = await cmd.ls

I'm really hungry for this feature. Any ETA or roadmap or whatever? 🙏🏼 🥺

yajo commented 3 months ago

FWIW another possibility would be this:

from plumbum import cmd
result = await cmd.ls.async_subprocess()

That method could call asyncio.create_subprocess_shell or asyncio.create_subprocess_exec behind the scenes and just return the awaitable object, so we can await for it later.

This way you don't have to think too much about how current non-async code works, as it'd go a different path. Later, we could provide a nicer interface. It would be similar to what cmd.ls.popen() does, but for the async world.

Callek commented 3 months ago

I'm really hungry for this feature. Any ETA or roadmap or whatever? 🙏🏼 🥺

I'm no longer a plumbum user as I've changed orgs with very different requirements, so it's no longer on my plate to do this. (But maybe the actual maintainer is interested)

henryiii commented 3 months ago

It's much easier now that we are 3.8+. I think something explicit (like the second example) would be better; while you can detect an async context, it's not very friendly; we could always add it later, as you mentioned. I can help with a contribution, but am not planning on doing this myself, at least in the near future.