dabeaz / curio

Good Curio!

Add support for sniffio #278

Closed by smurfix 5 years ago

njsmith commented 5 years ago

Context: https://sniffio.readthedocs.io/en/latest/

This should probably have a test...

dabeaz commented 5 years ago

This feels wrong to me. Why does Curio need to add an import dependency for this? Wouldn't a library trying to "sniff out" something like this be able to simply look at the environment and figure it out?

njsmith commented 5 years ago

Architecturally, it's the natural way to do it because there are lots of io loops – potentially an unbounded set – and only one sniffio. So adding support for a new io library should be something under the control of the io library, not something that has to go through sniffio as a bottleneck.

It's also faster (some users might be calling sniffio on every entry to their API), and more accurate.

dabeaz commented 5 years ago

If there's only one "sniffio", then it should be part of the standard library.

njsmith commented 5 years ago

I agree!

But this would require the standard library to admit that there are asyncio alternatives and that some people might use them, which, well, good luck...

dabeaz commented 5 years ago

Aren't you a core-dev? Commit first. Ask questions later ;-).

Is there really no other way to do this? To me, this sort of feels like that thing where you've got some giant inheritance hierarchy and all of a sudden some 7th generation twice-removed child class needs a weird new method added to it and that method ends up infecting the whole code base, eventually making its way up to the top-level base class.

CreatCodeBuild commented 5 years ago

Why do you even want to switch the async io lib in the first place? If you are a lib author, don't you want to only use async/await syntax while not importing any specific libs at all? If you are an application author, don't you want to decide on only one io lib?

dabeaz commented 5 years ago

I think part of my hesitation here is that Python has a long history of defining standardized APIs for things. For example, the database API or WSGI. If you've got code that's been written to conform to WSGI, you're not really supposed to care what WSGI framework it runs on.

This idea of async applications querying the underlying async library and then adapting its code base accordingly just feels weirdly off to me. If interoperability is a goal, then standardize an API layer for whatever is going on and publish it as a standard. Someone can then choose to make a Curio adapter for it if it's appropriate to do so.

njsmith commented 5 years ago

"If you are a lib author, don't you want to only use async/await syntax while not importing any specific libs at all?"

That would be nice, but the reason there are multiple async libs is that they disagree about how to express things like concurrency, cancellation, and IO. And it turns out that there are very few interesting libraries that want to use async/await but that don't need any concurrency, cancellation or IO :-).

"If interoperability is a goal, then standardize an API layer for whatever is going on and publish it as a standard. Someone can then choose to make a Curio adapter for it if it's appropriate to do so."

So let's say we have a standard API layer, and it has several adapters, including one for Curio. How does it know which adapter to use? If only there were a simple way to figure out which async library was in use. That would be the key enabler to let people start designing standard API layers...

CreatCodeBuild commented 5 years ago

@njsmith Do you have a real use case at the moment? Or are you only having a general discussion here? A concrete example will help.

You are right, concurrency and cancellation of different IO libs are quite incompatible. But, I feel it will stay true even if Curio has sniffio support.

If you are a lib author and you really want to decouple from the io lib, the best bet for now might be defining your own concurrency interfaces in your lib and writing adapters in the application to bridge Curio or whatever io lib and your lib. Dependency injection, if you will.

smurfix commented 5 years ago

There is at least one real and non-trivial use case for sniffio, which is trio-asyncio. It needs to actively set the sniffio contextvar when it emulates asyncio – auto-detecting trio (which cannot be avoided) while you're really in asyncio context would be Bad.

Sure, we can add sniffing logic for curio into sniffio (we do that for asyncio because it's in the core and predates sniffio, so we have to), but that hard-codes implementation details of curio into it. I don't think anybody wants that either.
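
A minimal sketch of the explicit-marker idea discussed above, using only the standard library. The names `current_library`, `running_as`, and `demo` are hypothetical and are not sniffio's or trio-asyncio's API; the sketch only illustrates how an emulation layer could override a context variable while the emulated code runs and restore it afterwards.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical shared slot naming the async library that is "in charge"
# of the current context. None means no library has announced itself.
current_library = contextvars.ContextVar("current_library", default=None)


@contextmanager
def running_as(name):
    """Mark the enclosed code as running under `name`, then restore."""
    token = current_library.set(name)
    try:
        yield
    finally:
        # reset() puts back whatever value was active before set().
        current_library.reset(token)


def demo():
    seen = []
    with running_as("trio"):                 # the real outer loop
        seen.append(current_library.get())
        with running_as("asyncio"):          # emulated inner environment
            seen.append(current_library.get())
        seen.append(current_library.get())   # back in the outer loop
    seen.append(current_library.get())       # outside any loop
    return seen
```

Because the value is set explicitly by whoever starts (or emulates) the loop, nothing has to inspect a loop's internals to guess the answer.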

imrn commented 5 years ago

From sniffio repository: https://github.com/python-trio/sniffio

sniffio: Sniff out which async library your code is running under

You're writing a library. You've decided to be ambitious, and support multiple async I/O packages, like Trio, and asyncio, and ... You've written a bunch of CLEVER code to handle all the differences. But... how do you know which piece of clever code to run?

It is clever trio again.

njsmith commented 5 years ago

@creatcodebuild Use cases on the consumer side include

imrn commented 5 years ago

Sounds like trio is trying to be viral like systemd.

"HyperIO is a asynchronous compatibility API that allows applications and libraries written against it to run unmodified on asyncio, curio and trio." : How does it do this?

dabeaz commented 5 years ago

If you were to ask me the question "what is programming?", I'd answer by saying that it's basically the problem of managing abstraction. Programming languages provide various features for abstraction such as naming (variables), functions, and classes. You also have type checking, various design patterns, and so forth. One of the things that makes abstraction work is the ability for all of the features to be used together as a coherent whole. This is largely enabled by having all of these features operate within a unified model of program evaluation.

A challenge with async/await is that it doesn't operate within the standard evaluation rules. Async functions can't compose with normal functions. Async objects can't easily interoperate with synchronous objects. The "What Color is Your Function" (http://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/) post covers the challenges of this situation pretty well. However, even classic computer science texts such as SICP have extensive discussion about different program evaluation models and the challenges that arise if you try to mix them together (e.g., trying to combine applicative order evaluation with lazy evaluation in the same program).

One of the major conceptual barriers to understanding asyncio is that it mixes two evaluation models together in the same library. Parts of asyncio involve normal synchronous functions (e.g., protocols, callbacks, etc.) whereas other parts involve coroutines. The challenges of working in this environment are typified by all of the functions related to carrying the event loop around (although great strides have been made in eliminating that requirement in user applications).

Curio was created with a specific design goal of putting all async functionality into a unified evaluation model (although it's mostly described in terms of "environments" in the Curio docs). By doing this, it was envisioned that the standard techniques of abstraction could be applied in this new environment. In a nutshell, async/await is the interface. You build all of your abstractions solely around that. As long as everyone agrees upon this evaluation model, it becomes possible to employ all of the usual software abstraction techniques (layering, adapters, etc.). This was the vision I was trying to explore in my PyOhio 2016 talk (https://www.youtube.com/watch?v=Bm96RqNGbGo). It's the kind of thing that would more easily allow library code to adapt to different async libraries if they choose to do so. In some sense, the choice of backend could become largely immaterial.

And yet, here we are---people now having to figure out how to sniff out the I/O library that they're using for some reason. It feels no different than having to carry the event loop around as an object everywhere. The benefits of having all async code exist within a common evaluation environment, unrealized.

smurfix commented 5 years ago

@imrn It does this by selecting a subset of abstractions common to all async I/O models (sleep, read/write, open TCP socket, …), figuring out which backend is currently in use (right now without employing sniffio, an approach that doesn't work in all circumstances) and then dispatching to the concrete implementation.
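
The three steps above (common subset, detect backend, dispatch) can be sketched roughly as follows. This is not HyperIO's code: `AsyncioBackend`, `BACKENDS`, `detect_backend`, and the module-level `sleep` are hypothetical names, only an asyncio backend is shown, and the detection is deliberately crude (it only asks whether an asyncio loop is running in this thread), which is the kind of guess that can be wrong under an emulation layer.

```python
import asyncio


class AsyncioBackend:
    """One backend implementing the shared surface (here only sleep)."""
    name = "asyncio"

    @staticmethod
    async def sleep(seconds):
        await asyncio.sleep(seconds)


# Hypothetical table of backends; other libraries would add entries here.
BACKENDS = {AsyncioBackend.name: AsyncioBackend}


def detect_backend():
    """Crude detection: is an asyncio event loop running in this thread?"""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        raise LookupError("no supported async library is running") from None
    return "asyncio"


async def sleep(seconds):
    """Library-neutral entry point: detect the backend, then dispatch."""
    backend = BACKENDS[detect_backend()]
    await backend.sleep(seconds)
    return backend.name  # returned only so the dispatch is observable
```

Calling `asyncio.run(sleep(0))` goes through the asyncio backend; calling `detect_backend()` outside any loop raises `LookupError`.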

smurfix commented 5 years ago

@dabeaz That unifying promise was never realized. Asyncio has heaps of sync callbacks. Curio has an incompatible mainloop. Trio has nurseries with their special, not-really-asyncio-compatible challenges. At least hyperio tries to bridge the gaps by implementing a least-common-denominator interface on top of all three of them, and trio-asyncio tries to run one on top of the other as well as possible.

Don't forget that asyncio has a history, stemming from the fun idea that "yield from" can be co-opted to be a poor man's coroutine management system. Rich systems have been built on top of that, but nobody (before curio and trio, anyway) had the guts to replace the poor man's threadbare trousers …

Asyncio seems to move towards a more unified set of abstractions, as it will implement task groups and the ability to run a fully-async protocol stack in 3.8, but in the meantime we'll have to deal with the fact that there are three semi-compatible systems which libraries (or shims like hyperio) need to adapt to.

dabeaz commented 5 years ago

asyncio gets a pass for its complexity. It was the first library and there is a whole learning process involved. I think there have been some nice developments for asyncio in Python 3.7 and its vision for 3.8 looks promising.

With respect to Curio/Trio, I don't think the benefits of encapsulating the async functionality in Curio were ever fully understood or appreciated. Instead, Trio went off exploring on some tangent involving cancellation points. Now its resulting API is a complex mix of async and sync operations that makes higher level abstractions much more complicated. When one talks about the difficulty of targeting both Curio and Trio, it's directly related to this. It's not something I intend to change so that difficulty may just be a fact of life for those two libraries.

dabeaz commented 5 years ago

I've thought about this further and I don't want to introduce a third-party import dependency to Curio just to report that Curio is running. This can already be determined from current Curio functionality:

from curio.meta import curio_running

if curio_running():
    # Do some curio thing
    ...

If the Python standard library grows some kind of functionality where one registers a running event loop, I'll revisit this.
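
A library that wants to use the check above without requiring Curio to be installed could guard the import. This is only a sketch: `running_under_curio` is a hypothetical helper name, and the only Curio call used is `curio.meta.curio_running` from the snippet above.

```python
def running_under_curio():
    """Return True only if Curio is installed and currently running here."""
    try:
        # Imported lazily so that Curio stays an optional dependency.
        from curio.meta import curio_running
    except ImportError:
        return False
    return bool(curio_running())
```

Outside of a Curio kernel, or when Curio is not installed at all, the helper reports `False`.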

dabeaz commented 5 years ago

I wanted to make one followup comment about something that's been bothering me here...

Although Curio and Trio have different points of view, it's really not necessary to make snippy comments about either library on pull requests. So, I'd kindly ask people not to do that.

My main objection to this is specifically that I don't like introducing third-party dependencies on libraries. I will consider alternatives that don't involve that.