openwallet-foundation / acapy

ACA-Py is a foundation for building decentralized identity applications and services running in non-mobile environments.
https://aca-py.org
Apache License 2.0
421 stars 515 forks source link

DID Rotate: handling out of order messages received after rotation #2839

Open dbluhm opened 8 months ago

dbluhm commented 8 months ago

Notes from #2816:

An ACA-Py specific concern I discovered in this process that probably deserves some discussion: invalidating the connection target cache when the DID Rotation is committed. As currently structured, we never really expect the connection targets for a connection to change after the protocol establishing it concludes. DID Rotation shakes that expectation up.

Right now, in my changes, I've added a clear_connection_targets_cache helper method to partially address this. It's partial because it will only work for ACA-Py when run as a single instance with the default in memory cache or if you're using Indicio's redis cache plugin for whole cluster shared caching.

I think it would be better for ACA-Py to use a value including both DIDs of the connection in the cache key as a more complete solution. The problem with this and why I've not done it on a first pass is that there would need to be some restructuring of how we use the Responder classes so that the send methods expect a whole ConnRecord and not just a connection_id (or individually a their_did and my_did, I suppose) so we can extract the DIDs from it. 9 times out of 10, I would say that connection record is already available to the caller of the send method and my hypothesis is that in the remaining 1 time out of 10, it will not dramatically impact performance of critical operations (it is not common for us to know the connection_id and not also have a connection record -- maybe only a few places in the Admin API might be like this).

We should investigate our options and determine if the proposed caching strategy would work as intended or if there are deeper solutions we'd need to look into.

swcurran commented 8 months ago

Related issue -- what are the pros/cons of making ALL deployments of ACA-Py multi-instance aware? E.g. that something like redis becomes core vs. an extension. How useful is it to have a "simple" version of ACA-Py in the real world?

dbluhm commented 8 months ago

Good question. I think there is value in the concept of an "agent in a box," to borrow a term @TelegramSam has used in the past. Having something that is easy to set up (minimal dependencies on other infrastructure you have to run yourself) and then expand upon over time has been a common on ramp for people, I think. Introducing redis doesn't necessarily break that; the demo we have in the repo already orchestrates more than one container -- what's one more? It would be nice to retain the option to not have an external dependency though.