Closed copdips closed 2 days ago
hi, thanks for raising this question. From an initial review, it looks to be working as intended.
The Snowflake authentication token is valid for 4 hours by default, but session policies let you customize this: by configuring a session policy you can define a different session lifetime, e.g. consider sessions invalid after 10 minutes, or 1 hour, if you wish.
`CLIENT_SESSION_KEEP_ALIVE` is `False` by default.
When is the keep-alive signal (we call it a heartbeat) sent: at connection creation time + `CLIENT_SESSION_KEEP_ALIVE_HEARTBEAT_FREQUENCY`, which is 1 hour by default. So if you connect at 12:00:00 UTC, you should expect the first heartbeat (a call to `/session/heartbeat` by the Python driver) at around 13:00:00 UTC. You can configure the frequency between 900s (15m) and 3600s (1h, the default).
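For reference, these settings map to the Python connector's connection parameters and can be passed through `create_engine`'s `connect_args`. This is only a sketch (parameter names follow the connector; the engine line is commented out since it needs real credentials):

```python
# Hedged sketch: connector session parameters controlling heartbeats.
connect_args = {
    "client_session_keep_alive": True,                     # default is False
    "client_session_keep_alive_heartbeat_frequency": 900,  # seconds, 900-3600
}
# engine = create_engine(URL(account=..., user=...), connect_args=connect_args)
```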
Hope this answers the questions.
@sfc-gh-dszmolka, thanks for your reply !
I still have a few questions about this topic:
Does this mean that, with all the default settings, a user must generate a new OAuth token after 4 hours and recreate the engine using `engine = create_engine(URL(..., token=oauth_token, authenticator="oauth"))`?
If yes, based on my tests, I was still able to query Snowflake with the initial `engine` object even after 5 hours. Furthermore, since `CLIENT_SESSION_KEEP_ALIVE` defaults to `False`, how was the query still able to execute? Could you clarify this behavior? Perhaps I made a mistake in my tests; I'll run additional tests in the coming days.
When exactly does the heartbeat start being sent? One of my use cases involves using this engine inside an API. I create the `engine` object during API startup, but the `engine.connect()` call happens in each API endpoint, which might occur 10 hours or more after the API startup. I'd like to understand when the heartbeat starts and stops being sent in this scenario. Does it begin right after `engine = create_engine(URL(...))` or after `engine.connect()`?
If we use a plain-text username and password during engine creation, does it also generate an authenticated token underneath, or does it send the original password with each query?
Do you have any best practices for refreshing the OAuth2 token?
Here’s a bit of background:
We're developing a Snowflake utility that uses an Azure Service Principal (client ID and client secret) to generate an OAuth2 token with a 65-minute lifetime. The credential object (`from azure.identity import ClientSecretCredential`) provides a `get_token()` method that renews the token if it has expired, or returns the current token if it is still valid. My goal is to integrate this functionality with Snowflake SQLAlchemy in such a way that users of our utility won't need to worry about token refreshes; the utility should handle everything internally.
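For what it's worth, the refresh-on-expiry behavior described above can be sketched independently of Azure. Here `StubCredential` is an illustrative stand-in for `ClientSecretCredential` (its tuple return mimics the `.token`/`.expires_on` attributes of the real `AccessToken`), and `TokenProvider` is a hypothetical helper, not a snowflake-sqlalchemy API:

```python
import time

class StubCredential:
    """Illustrative stand-in for azure.identity.ClientSecretCredential."""
    def __init__(self):
        self.calls = 0
    def get_token(self, scope):
        self.calls += 1
        # Mimics AccessToken's .token / .expires_on; 65-minute lifetime as in the thread.
        return (f"token-{self.calls}", time.time() + 65 * 60)

class TokenProvider:
    """Return a cached token, refreshing it shortly before expiry."""
    def __init__(self, credential, scope, refresh_margin=300):
        self._credential = credential
        self._scope = scope
        self._margin = refresh_margin  # refresh 5 minutes before expiry
        self._token, self._expires_on = None, 0.0
    def get(self):
        if time.time() >= self._expires_on - self._margin:
            self._token, self._expires_on = self._credential.get_token(self._scope)
        return self._token

cred = StubCredential()
provider = TokenProvider(cred, "https://example-scope/.default")
first, second = provider.get(), provider.get()
```

The second `get()` returns the cached token without calling the credential again, which is the property you'd rely on when wiring this into connection creation.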
Thank you for providing the background here. It looks like this issue is becoming a bit more than an actual bug or enhancement request filed against the snowflake-sqlalchemy library; as I see it, it doesn't even deal with this library, but is more about designing a solution based on OAuth. That's not a problem, but this may not be the best forum to get all the information. Still, let's see what I've got for you.
See more on how OAuth in Snowflake works:
This is not really related to this library.
Please check `python_connector.log` (in the same directory where the program is executed) and confirm whether `/session/heartbeat` requests were sent during these 5 hours of total inactivity.
If no queries are sent, no heartbeat is sent (`CLIENT_SESSION_KEEP_ALIVE=False`), and you did not configure a session policy that would allow a session to stay authenticated for 5 hours or more, then please file a new issue with the reproduction steps, because this is unexpected. (Or, if `CLIENT_SESSION_KEEP_ALIVE=False` yet the connector is still sending `/session/heartbeat` requests every hour, please also file a new issue, because that is unexpected too.)
Heartbeats start at `engine.connect()` + 1 hour (with a default or unspecified `CLIENT_SESSION_KEEP_ALIVE_HEARTBEAT_FREQUENCY`) (reference).
As mentioned previously, the authentication method (e.g. user/pass, keypair, SSO, OAuth, etc.) and the authentication token are two independent things (I had a typo in my previous comment: it's not an authenticatED token but an authenticatiON token).
Thus the 1-hour-valid OAuth token you mentioned is not recommended from a security best practices perspective, but it's really up to you.
When the Snowflake authentication token expires, you'll need to reauthenticate (somehow).
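One way that reauthentication can be structured, sketched with stand-ins (`FakeEngine`, `AuthError`, and `run_with_reauth` are all hypothetical helpers for illustration, not snowflake-sqlalchemy APIs): dispose the engine when an auth failure surfaces and rebuild it with a fresh token.

```python
class AuthError(Exception):
    """Stand-in for the DatabaseError the driver raises on an expired token."""

class FakeEngine:
    """Stand-in for a SQLAlchemy Engine; fails once to simulate token expiry."""
    def __init__(self, token):
        self.token, self.disposed, self._failed = token, False, False
    def execute(self, sql):
        if self.token == "expired" and not self._failed:
            self._failed = True
            raise AuthError("token expired")
        return f"ok:{sql}"
    def dispose(self):
        self.disposed = True

def run_with_reauth(engine, sql, fresh_token):
    """Run sql; on auth failure, dispose the engine and retry with a new one."""
    try:
        return engine.execute(sql), engine
    except AuthError:
        engine.dispose()                  # drop pooled connections
        engine = FakeEngine(fresh_token)  # real code: create_engine(URL(..., token=fresh_token))
        return engine.execute(sql), engine

old = FakeEngine("expired")
result, new = run_with_reauth(old, "select 1", "fresh")
```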
At this point, considering you are building a new solution, if your company (or the company that contracted you or your company) is a Snowflake customer, I highly recommend reaching out to your Snowflake Account Team and asking for a design and implementation consultation. This repository is for handling issues and enhancements related to the snowflake-sqlalchemy library, and what you need is, as you can see, well beyond that. I'm trying my best to give guidance, but it's by no means comprehensive and is more like general guidance.
That's why I recommend involving people from Snowflake who can provide closer and more customized guidance; your account team can help with that.
If you don't have any further questions, perhaps we can consider closing this Issue. If you do, let me know and I'll try my best to help.
Thank you for all the clarifications so far!
Before the issue is closed, I have one last question please 🙏 :
I'm not building a utility in parallel to snowflake-sqlalchemy but rather on top of it. My utility is a high-level wrapper around snowflake-sqlalchemy, providing interfaces like `get_marketing_data()`, `get_sales_data()`, etc., for internal company use, with snowflake-sqlalchemy working behind the scenes.
By default, once the Snowflake-specific authentication token expires after 4 hours, reauthentication is required. Assuming I have a new valid Azure OAuth token, what is the recommended way to provide this new token to snowflake-sqlalchemy? Should I:
- Recreate the `engine` object with the new OAuth token, like `engine = create_engine(URL(..., token=oauth_token, authenticator="oauth"))`? This is straightforward but requires recreating the engine from scratch.
- Or does the `engine` object have a method to accept the new OAuth token and handle the reauthentication process, maybe via `self._oauth_token`? Currently, I'm exploring a solution using SQLAlchemy's event listener. It seems feasible, but I'm still unsure how to reset the OAuth token when the relevant event is triggered.

After a brief consultation with my colleagues: we have seen users define a pooled engine which always has a valid connection and recreates said connection if it expires. The `engine` is created something like this, with the pool:
```python
example_engine = create_engine(
    URL(account=...),   # usual stuff
    pool_size=10,       # the size of the pool to maintain
    max_overflow=20,    # the number of connections to allow beyond pool_size
    pool_timeout=30,    # how long to wait before giving up on getting a connection from the pool
    pool_recycle=3600,  # how long to keep connections before recycling (in seconds)
    echo=True,          # enable logging of all SQL commands
)
```
then using the `do_connect` event to modify how connection parameters are created:

```python
from sqlalchemy import event

@event.listens_for(example_engine, "do_connect")
def receive_do_connect(dialect, conn_rec, cargs, cparams):
    # inject a freshly refreshed OAuth token into the connect parameters
    cparams["token"] = get_token()
```
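To see the `do_connect` hook fire end to end, here is a runnable sketch against an in-memory SQLite engine; with Snowflake you would set `cparams["token"]` instead of the harmless `timeout` parameter used here:

```python
from sqlalchemy import create_engine, event, text

engine = create_engine("sqlite://")
seen = []

@event.listens_for(engine, "do_connect")
def refresh_params(dialect, conn_rec, cargs, cparams):
    # Runs on every new DBAPI connection, including pool refills, so a
    # token set here is re-fetched whenever a connection is (re)created.
    cparams["timeout"] = 5.0  # stand-in for: cparams["token"] = get_token()
    seen.append(dict(cparams))

with engine.connect() as conn:
    assert conn.execute(text("select 1")).scalar() == 1
assert seen and seen[0]["timeout"] == 5.0
```

Combined with `pool_recycle`, this means every recycled connection picks up whatever token the listener supplies at that moment.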
Hope this helps provide some inspiration for the implementation.
Great, thanks a lot for the help! I think also using an event listener should be the final solution.
I'm closing this issue and will drop a comment if anything doesn't work.
What is the current behavior?
I have a one-hour-lifetime OAuth token, and I create the engine with `engine = create_engine(URL(..., token=oauth_token, authenticator="oauth"))`, where `URL` is `from snowflake.sqlalchemy import URL`.
It seems that as long as I don't dispose of the engine, I can keep querying Snowflake with the same `engine` object.
What is the desired behavior?
Snowflake should refuse to process queries after 1 hour.
How would this improve `snowflake-connector-python`?
First, could you please confirm whether it's `CLIENT_SESSION_KEEP_ALIVE` that prevents an OAuth token from expiring, or if the token does expire but the user session remains valid? Additionally, is `CLIENT_SESSION_KEEP_ALIVE` set to `True` by default, as I don't specify it during the engine creation? By the way, could you please also confirm whether the keep-alive signal is sent as soon as the engine is created by `engine = create_engine(URL(...))`, or only after calling `engine.connect()`, and when it stops sending the signal? Thanks.