Closed dom111 closed 1 week ago
Hi @dom111!
Thanks for asking about this, and for your kind remarks about the library!
It's true that the library currently only supports one kind of authentication. But it would be good to support other kinds.
I don't fully understand how token-based authentication works with Postgres, so I don't really know what to suggest as a solution. Is there some documentation about this online?
If you would like to have a quick meeting to go over what you are looking to achieve, that would be fun.
All good wishes,
John
PS I found this (but it doesn't mention "token-based"): https://www.postgresql.org/docs/current/client-authentication.html
I also found this (which does mention "token-based" but it's about pgadmin): https://www.postgresql.org/about/news/pgadmin-4-v615-released-2531/
Hi @johnbywater, I am working with @dom111 on a project using pyeventsourcing. I second his comments about the library; it's been great to work with!
Our use case is to authenticate using AWS IAM. This essentially just involves using a short-lived token as the database password, meaning connections which take place after the token has expired will fail to authenticate.
What we ideally need is a mechanism to provide the password dynamically on each connection.
We have some options for working around this, such as extending `PostgresConnectionPool`, `PostgresDatastore`, `Factory`, etc., or creating custom `Recorder` classes which proactively check and refresh the token before any operation and update the password on the pool instance. None of it feels especially neat though.
However, if, for example, we could register a function with the connection pool which provides credentials on each connection, or specify the `ConnectionPool` class via the `Factory` in the same way you can with the aggregate recorder etc., it feels a bit cleaner.
So if there's a better way to do this that we've missed, it would be great to hear it! But if not, and you're open to a change in the library, we're happy to put together a PR, though we're keen to get some guidance on the best approach.
Thanks for your help!
Hi John, thanks for the prompt reply!
Apologies for the confusion around the token auth. Our specific use case is IAM auth with AWS, which provides an expiring token for given credentials that is passed in as the password (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.Connecting.html). The main concern we had was around using the pool and having a new connection be attempted with a cached password (since the password value from the environment is stored on class instantiation).
I feel like some sort of password provider mechanism that could be passed in might work, but I don't know if that's in keeping with the design goals of the lib. A solution where the Postgres* classes can be dynamically set to alternative implementations might work well (a pattern used elsewhere, but presumably not for the Postgres module because it's provided as a complete solution as-is).
I'm currently looking at a few solutions as mentioned above, as well as a quick and dirty approach of overwriting the `_create_connection` method to set `self.password` to a fresh token before calling `super()._create_connection()`, which appears to be working, although it is brittle.
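For the sake of discussion, that override can be sketched roughly as follows. `FakePool` here is a minimal stand-in for `PostgresConnectionPool` (which caches the password at instantiation), and `fetch_iam_token()` is a placeholder for the AWS SDK call:

```python
class FakePool:
    """Stand-in for PostgresConnectionPool: caches the password at init."""

    def __init__(self, password):
        self.password = password

    def _create_connection(self):
        return f"conn(password={self.password})"


def fetch_iam_token():
    # Placeholder for the AWS SDK call that generates a fresh IAM token.
    return "fresh-token"


class TokenRefreshingPool(FakePool):
    def _create_connection(self):
        # Refresh the cached password before every new connection attempt.
        self.password = fetch_iam_token()
        return super()._create_connection()


pool = TokenRefreshingPool(password="stale-token")
print(pool._create_connection())  # → conn(password=fresh-token)
```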
It's been great to have something robust and would be happy to help get something in place in the library to share with any future users.
Hope that doc helps clarify the use-case!
Thanks
Dom
@dom111 @krixon thanks for the replies!
So:

- From where do you get the new token?
- For how long does a token last?
- What happens when you try to connect using a token that has expired?
- What happens if you make a request after a token has expired, using a connection that was created with a valid token?
- Would it be better to always check that a token is valid before every request, or might a token expire after the check and before the request is made? So would it be better to detect when a request fails due to a token-expired error, and then reconnect with a new token and retry the request?

I'm guessing the last option would work best?
Regarding the scope of this module, the intention is to support what users need. Using IAM seems like a very valid use case. I guess we could add an environment variable, such as `POSTGRES_PASSWORD_TOKEN_TOPIC`, that is expected to resolve to a callable from which a token can be obtained. Setting this could then activate the error-detection-reconnect-retry behaviour? More generally speaking, more flexibility in overriding components is definitely possible. At the moment, I admit, it's a little bit awkward to introduce alternative components, such as alternative recorders, because you need to subclass the factory and define a new persistence module.
> From where do you get the new token?
We call out to AWS via an SDK to obtain new tokens, pretty much as in this linked example.
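For reference, a minimal sketch of that SDK call, assuming boto3 is available and AWS credentials are configured (the host, port, user, and region arguments are placeholders):

```python
def get_rds_token(host: str, port: int, user: str, region: str) -> str:
    """Generate a short-lived RDS IAM auth token via boto3."""
    import boto3  # imported lazily; actually calling this requires AWS credentials

    client = boto3.client("rds", region_name=region)
    # generate_db_auth_token signs a token that is valid for 15 minutes.
    return client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region
    )
```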
> For how long does a token last?
15 minutes.
> What happens when you try to connect using a token that has expired?
The connection is refused as if you had supplied an incorrect password.
> What happens if you make a request after a token has expired using a connection that was created with a valid token?
The token is only used to establish the connection, after which the connection can remain open beyond the token's lifetime. So essentially, every time `_create_connection` is called on the pool, we'd need to provide the current token.
> So would it be better to detect when a request fails due to a token-expired error and then reconnect with a new token and retry the request? I'm guessing the last option would work best?
Yes, this feels like the most useful approach. The alternative would be to proactively call the registered function to obtain the password before each connection (rather than waiting for it to fail), but that implies the function knows whether a new token is actually required, which may not always be the case. Trying to connect, failing, refreshing the password, trying once more, and finally giving up on a second failure seems simplest.
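That connect-fail-refresh-retry flow can be sketched with stand-ins; `AuthError`, `fake_connect`, and the token provider are all made up for illustration, and a real implementation would catch psycopg's authentication error instead:

```python
class AuthError(Exception):
    """Stand-in for the driver's authentication failure."""


def connect_with_refresh(connect, get_token, password):
    """Try once with the given password; on auth failure, refresh and retry once."""
    try:
        return connect(password)
    except AuthError:
        # Second attempt with a freshly fetched token; a second failure propagates.
        return connect(get_token())


def fake_connect(password):
    # Fake backend that only accepts the fresh token.
    if password != "fresh":
        raise AuthError("password authentication failed")
    return "connected"


print(connect_with_refresh(fake_connect, lambda: "fresh", "expired"))  # → connected
```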
I just added this:
> As an alternative to setting a fixed password in `POSTGRES_PASSWORD`, you can set `POSTGRES_GET_PASSWORD_TOPIC` to indicate the topic of a function that returns passwords. This function will be called each time a new database connection is created. This variable supports using database services authenticated with Identity and Access Management (IAM), sometimes referred to as token-based authentication, for which the password is a token that is changed, perhaps every 15 minutes. If `POSTGRES_GET_PASSWORD_TOPIC` is set, the `POSTGRES_PASSWORD` variable is not required and will be ignored. The value of this variable should be resolvable using `resolve_topic()` to a Python function that expects no arguments and returns a Python `str`.
https://eventsourcing.readthedocs.io/en/latest/topics/persistence.html#postgresql-module
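As a hypothetical sketch of wiring this up (the `myproject.tokens` module and `get_password()` function are made up for illustration; the `module:attribute` topic format follows the library's `resolve_topic()` convention):

```python
import os

# Hypothetical contents of myproject/tokens.py:
def get_password() -> str:
    # Would fetch a fresh IAM token here (e.g. via the AWS SDK).
    return "current-iam-token"

# Point the library at that function using the "module:attribute" topic format.
os.environ["POSTGRES_GET_PASSWORD_TOPIC"] = "myproject.tokens:get_password"
```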
I've pre-released these changes as https://pypi.org/project/eventsourcing/9.3.0a1/. It's a pre-release so you may need to mention this version explicitly in order to install it.
Please note, this release uses the new psycopg3 package. Not much else has changed, except that the package is built with Poetry. The `eventsourcing.postgres` module has been heavily modified to work closely with psycopg3, but you shouldn't notice any difference. However, if your project depends on `psycopg` separately, then have it depend on `psycopg[c]` for production use along with `eventsourcing`. Alternatively, you can depend on `eventsourcing[postgres]`, which will do the same. For development, to avoid longer build times at the cost of slightly worse performance, you can either depend on `psycopg[binary]` and `eventsourcing`, or on `eventsourcing[postgres_dev]`. These install options are documented here. The "extra" dependency options (`eventsourcing[postgres]` and `eventsourcing[postgres_dev]`) aren't new, but eventsourcing v9.3 depends on psycopg v3, and the install options for the `psycopg` package have changed. I won't bore you here with the details of what's new in psycopg3, as you probably already know about it. However, psycopg v3 is an almost total rewrite of the psycopg package, and we are using the pipelining functionality to append events, which improves performance in many cases by avoiding network round trips. This is new code and has been tested by at least one other person :-)
Please could you try this out and let me know your thoughts? I wrote some tests to check the configuration of `POSTGRES_GET_PASSWORD_TOPIC` and the associated functionality for using a "get password" function. Connection attempts will continue for `POSTGRES_CONNECT_TIMEOUT` (default is 5s), with each attempt calling the "get password" function. So if you don't want to call out to AWS via an SDK several times a second, for example when starting up, you could implement some kind of time-based cache in your "get password" function. I guess an expiry time that is less than the connect timeout value would be sensible here. Let me know what you think?
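A time-based cache along those lines might be sketched as follows; `fetch_token` stands in for the AWS SDK call, and the 4-second TTL is an arbitrary choice below the default 5s connect timeout:

```python
import time


def make_cached_get_password(fetch_token, ttl_seconds=4.0):
    """Wrap a token fetcher so repeated calls within the TTL reuse the token."""
    cached = {"token": None, "expires": 0.0}

    def get_password() -> str:
        now = time.monotonic()
        if cached["token"] is None or now >= cached["expires"]:
            cached["token"] = fetch_token()
            cached["expires"] = now + ttl_seconds
        return cached["token"]

    return get_password


# Demo with a counting stand-in for the AWS call:
calls = []
get_password = make_cached_get_password(lambda: calls.append(1) or f"token-{len(calls)}")
print(get_password(), get_password())  # same token both times, fetched only once
```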
@johnbywater This is great and looks to be exactly what we need. Thanks for working on this, and thanks for working on it so quickly! My focus is on another part of the project presently, but I'll get onto this ASAP and validate the mechanism, hopefully this week.
I've taken note of your comments so some sort of short-term caching should work perfectly.
Thanks again!
@dom111 @krixon did you have any further thoughts about this?
Hi @johnbywater, apologies for the delay, we've been focused on getting an MVP released so have been a bit snowed under the last couple of weeks! We're hoping to pick this up over the next few days so will report back ASAP. Thanks for your patience!
@johnbywater just an update that I'm looking into this presently. I'm hitting some weirdness after updating the lib, so I'm just looking into the details to make sure it's not just something we're doing wrong!
Hi @dom
> hitting some weirdness
Did you have any more details about this that you can share? :-)
Hey @johnbywater, I got to the bottom of this (it turned out to be some self-inflicted weirdness on my local dev setup with volumes in `docker-compose`), and I've changed most things (had to update our Alembic and SQLAlchemy config to use `postgresql+psycopg`, rather than omitting the `psycopg` part altogether or specifying `psycopg2`).
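For anyone following along, the dialect-name change looks like this in a SQLAlchemy/Alembic URL (credentials are placeholders): with SQLAlchemy, `postgresql+psycopg` selects psycopg 3, while a bare `postgresql://` or `postgresql+psycopg2` uses psycopg2.

```python
# Before: uses psycopg2 (also the default for plain "postgresql://").
old_url = "postgresql+psycopg2://user:pass@localhost:5432/app"

# After: explicitly selects the psycopg 3 driver.
new_url = "postgresql+psycopg://user:pass@localhost:5432/app"
```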
My last issue to work through is an end-to-end test where we can prove data is written in (preferably, at least) our staging environment.
I appreciate that we don't want to block you on any releases so we are keeping this on our board.
Thanks again for this work!
Thanks @dom111 that's very helpful.
Hey @johnbywater,
Hope you're doing well! Just to confirm that this appears to work perfectly in my tests. Thank you again for your patience in this and apologies that it's taken a while for us to get this checked.
Please let me know if you need anything further (or have any more versions that you'd like to test); otherwise, we're excited for 9.3's release and getting this change out to production!
Thanks,
Dom
Aww thanks @dom111! That's really very helpful. I really appreciate the efforts you have made to test this. Thanks for letting me know it's all working!
Hi there,
Firstly, thanks for the hard work of the team on this well-documented and richly-featured package!
We're currently utilising pyeventsourcing in a project and are investigating how to best implement token-based passwords for the Postgres persistence module.
Current options we've considered are:

- `eventsourcing.postgres`, overwriting the minimum set of classes via inheritance
- `eventsourcing.postgres`, overwriting the minimum set of classes using decorators and magic methods to proxy `__getitem__` requests
- `POSTGRES_PASSWORD` (won't work with short-lived tokens and reconnections)
- `do_connect` event to replace the password key (need a bit more investigation around this)

I'd be happy to contribute any fixes you'd find useful back to the project, so any guidance on alternatives, or preferences from the above, would be great.
Thanks!
Dom