sqlalchemy / dogpile.cache

dogpile.cache is a Python caching API which provides a generic interface to caching backends of any variety
https://dogpilecache.sqlalchemy.org
MIT License

AsyncCacheRegion #199

Open zzzeek opened 3 years ago

zzzeek commented 3 years ago

adapt the approach taken by SQLAlchemy in https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/util/_concurrency_py3k.py to provide for an AsyncCacheRegion frontend.

backends will as always present a "sync" interface that uses greenlets to adapt to the async backend. backends to start with:

CaselIT commented 3 years ago

should we create a package with sqlalchemy async adapted implementation? Or ctrl-c ctrl-v is better in this case?

zzzeek commented 3 years ago

yeah...I think for now we would vendor it (the latter option). dogpile.cache is python 3 only now so it can be more succinct. if we put out "sqlalchemy/greenlet_async" then we have to support that separately, would rather not go there yet.

antont commented 2 years ago

Hi - am using async SQLAlchemy with FastAPI happily, and was using dogpile.cache before I ported the rest of the system to async. Have been struggling to port dogpile over to async, gotten some things to work partially by hacking around in the internals of SA ORM Session, Dogpile Region etc.

One problem is that event.listen is not implemented for async sessions, but am trying to hook it via the sync session etc.

Could you maybe hint at the way to go, what would AsyncCacheRegion do? I'd have ok time to work on this in the coming days, and am pretty well versed with async Python in general, just not familiar with SA code from before so it's been a lot to digest.

zzzeek commented 2 years ago

> Hi - am using async SQLAlchemy with FastAPI happily, and was using dogpile.cache before I ported the rest of the system to async. Have been struggling to port dogpile over to async, gotten some things to work partially by hacking around in the internals of SA ORM Session, Dogpile Region etc.
>
> One problem is that event.listen is not implemented for async sessions, but am trying to hook it via the sync session etc.

this is specific to SQLAlchemy, so when using event.listen with the asyncsession, follow the guidelines at https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html#using-events-with-the-asyncio-extension

then, when you are inside the event handler, suppose you are using aiomemcache or something like that. no problem, you can call out to asyncio methods inside the event handler using either the connection-bound run_async() method, documented at https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html#using-awaitable-only-driver-methods-in-connection-pool-and-other-events, or more directly, and we haven't documented this yet, you can run async defs using await_only, just like:

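from sqlalchemy import event
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
from sqlalchemy.util import await_only

engine = create_async_engine(...)
session = AsyncSession(engine)

# event handlers are plain (sync) functions, registered on the sync_session
@event.listens_for(session.sync_session, "before_flush")
def evt(sess, context, objects):
    # my_async_cache is a placeholder for an asyncio cache client;
    # await_only() waits on its coroutine from inside the sync handler
    cache_data = await_only(my_async_cache.get("some key"))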

> Could you maybe hint at the way to go, what would AsyncCacheRegion do? I'd have ok time to work on this in the coming days, and am pretty well versed with async Python in general, just not familiar with SA code from before so it's been a lot to digest.

So this is a much bigger job and for anyone, requires a lot of cognitive work, like more than I have for the time being for sure :). But the general idea is that it would look a lot like AsyncSession. That is, it has all the methods that Region has, with all the "awaitable" ones set up as "async". then the body of each method calls out to the "sync" method on Region, making use of greenlet_spawn for all IO blocking calls. see https://github.com/sqlalchemy/sqlalchemy/blob/3b4d62f4f72e8dfad7f38db192a6a90a8551608c/lib/sqlalchemy/ext/asyncio/session.py#L188 for an example.
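A minimal sketch of that facade shape, under loud assumptions: `SyncRegionStandIn` and `AsyncRegionSketch` are hypothetical names, the stand-in is a dict-backed class rather than dogpile's real region, and `asyncio.to_thread` is swapped in for `greenlet_spawn` so the example needs only the standard library (Python 3.9+). It only illustrates "async methods whose bodies call the sync methods"; it is not the greenlet-based approach itself and not a proposed API.

```python
import asyncio
import threading
from typing import Any, Callable, Dict, Optional


class SyncRegionStandIn:
    """Hypothetical stand-in for a synchronous cache region.

    It only mimics a get/set/get_or_create shape; it has no expiration
    and no dogpile lock, and is not dogpile.cache's actual region class.
    """

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[Any]:
        with self._lock:
            return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value

    def get_or_create(self, key: str, creator: Callable[[], Any]) -> Any:
        # Run the creator only when the key is missing.
        with self._lock:
            if key not in self._data:
                self._data[key] = creator()
            return self._data[key]


class AsyncRegionSketch:
    """Async facade: same method names, each delegating to the sync region."""

    def __init__(self, sync_region: SyncRegionStandIn) -> None:
        self.sync_region = sync_region

    async def get(self, key: str) -> Optional[Any]:
        # asyncio.to_thread stands in for greenlet_spawn in this sketch.
        return await asyncio.to_thread(self.sync_region.get, key)

    async def set(self, key: str, value: Any) -> None:
        await asyncio.to_thread(self.sync_region.set, key, value)

    async def get_or_create(self, key: str, creator: Callable[[], Any]) -> Any:
        return await asyncio.to_thread(
            self.sync_region.get_or_create, key, creator
        )


async def demo():
    region = AsyncRegionSketch(SyncRegionStandIn())
    calls = []

    def creator():
        calls.append(1)
        return "expensive value"

    first = await region.get_or_create("some key", creator)
    second = await region.get_or_create("some key", creator)
    return first, second, len(calls)
```

Running `asyncio.run(demo())` calls the creator once and serves the second lookup from the stand-in's dict. With the greenlet approach, a sync-facing backend could await an asyncio client directly instead of blocking a worker thread, which the thread-based swap here cannot do.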

antont commented 2 years ago

Thanks a lot - sorry for being unclear, I should have reported what I had learned so far; quick comments below.

Also, the solution I was using for sync, and have tried to port to async now, is the CachingQuery from https://docs.sqlalchemy.org/en/14/_modules/examples/dogpile_caching/caching_query.html

> this is specific to SQLAlchemy, so when using event.listen with the asyncsession, follow the guidelines at https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html#using-events-with-the-asyncio-extension

Understood, I actually got some success hooking _do_orm_execute to asyncsession.sync_session.

> then, when you are inside the event handler, suppose you are using aiomemcache or something like that. no problem, you can call out to asyncio methods inside the event handler using either the connection-bound run_async() method, documented at https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html#using-awaitable-only-driver-methods-in-connection-pool-and-other-events, or more directly, and we haven't documented this yet, you can run async defs using

Right, the connection binding in ORM Session is where I got lost, at _connection_for_bind, I think because it tried to open a sync connection even though I was using AsyncSession elsewhere. It seemed that ORM Session would have async support somehow nowadays, but wasn't sure whether I was on the right track there at all.

> So this is a much bigger job and for anyone, requires a lot of cognitive work, like more than I have for the time being for sure :). But the general idea is that it would look a lot like [AsyncSession]

Makes sense - I'll give it another shot in a few days; am doing some other tasks first but will return to this later.

CaselIT commented 2 years ago

@zzzeek what's your appetite on publishing a sqlalchemy_async_thingy package with the async and proxy stuff so that they can be reused in dogpile (and I guess 3rd party packages)?

zzzeek commented 2 years ago

> @zzzeek what's your appetite on publishing a sqlalchemy_async_thingy package with the async and proxy stuff so that they can be reused in dogpile (and I guess 3rd party packages)?

my appetite for what I think we're talking about here, would be that we add some additional async examples into https://docs.sqlalchemy.org/en/14/orm/examples.html#module-examples.dogpile_caching and that would be it.

CaselIT commented 2 years ago

> we're talking about here,

I think it would mainly be module sqlalchemy.util._concurrency_py3k and maybe the proxy thing we are changing here https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3771

but it's probably less work to just copy that file in dogpile, since it's not something that changes a lot.

zzzeek commented 2 years ago

the important thing about the dogpile example is that it's an example. people who use it are compelled to read the source code and take on at least some degree of responsibility for it. if we add new API then we have to maintain it.

a76yyyy commented 3 weeks ago

I'm interested in dogpile.cache, are there any examples of using dogpile.cache in an asynchronous way?

CaselIT commented 2 weeks ago

> yeah...I think for now we would vendor it (the latter option). dogpile.cache is python 3 only now so it can be more succinct. if we put out "sqlalchemy/greenlet_async" then we have to support that separately, would rather not go there yet.

Since there is now such a package, maybe dogpile could just use it