trinodb / trino-python-client

Python client for Trino
Apache License 2.0

Custom cache for oauth2 tokens #223

Open JeevansSP opened 1 year ago

JeevansSP commented 1 year ago

Hello, I have a Flask app that uses the Trino Python client to query data from my Trino servers, and I have enabled OAuth2 authentication against my Azure Active Directory. Currently the token is cached per host; I need it cached per user who authenticates on the front end of my web application. I have implemented the custom cache below and it works for my use case, but I would prefer not to change any private properties, as that is not best practice.

from typing import Optional

from flask import session
from trino.auth import OAuth2Authentication, _OAuth2TokenCache
from trino.dbapi import connect


class _CustomCache(_OAuth2TokenCache):
    """
    In-memory token cache implementation. The token is stored per user.
    """

    def __init__(self):
        self._cache = {}

    def get_token_from_cache(self, host: str) -> Optional[str]:
        # Key on the logged-in Flask user instead of the Trino host.
        user_name = session['user']
        return self._cache.get(user_name)

    def store_token_to_cache(self, host: str, token: str) -> None:
        user_name = session['user']
        self._cache[user_name] = token


auth = OAuth2Authentication()
# Replaces a private attribute, which is what this issue would like to avoid.
auth._bearer._token_cache = _CustomCache()

conn = connect(
    host='******',
    port=443,
    auth=auth,
    http_scheme="https",
)

cursor = conn.cursor()

hashhar commented 1 year ago

This seems to be a side effect of the fact that we cache tokens for the entire host instead of per connection, i.e. once we stop sharing tokens across connections, this problem can be solved much more easily by providing two types of caches, MEMORY and MEMORY_SHARED.

That way, for applications where per-user caching is needed, you can create connections per user (which also has the benefit of keeping things like session properties and additional config separate) with the MEMORY caching mode and cache the connection object itself in your application per user.

For places where it doesn't matter which user identity is used, the MEMORY_SHARED cache can be used.
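A minimal sketch of the "cache the connection object per user" idea, in application code rather than in the client. `PerUserConnectionCache` and `make_connection` are hypothetical names used only for illustration; in a real application the factory would presumably call `trino.dbapi.connect(...)` with a fresh `OAuth2Authentication()` for each user, so that each connection carries its own token.

```python
from threading import Lock
from typing import Callable, Dict, Generic, TypeVar

C = TypeVar("C")


class PerUserConnectionCache(Generic[C]):
    """Hypothetical application-side cache: one connection object per user."""

    def __init__(self, factory: Callable[[str], C]) -> None:
        self._factory = factory
        self._connections: Dict[str, C] = {}
        self._lock = Lock()  # web apps typically serve users from several threads

    def get(self, user: str) -> C:
        # Create the connection on first use, then reuse it for that user.
        with self._lock:
            if user not in self._connections:
                self._connections[user] = self._factory(user)
            return self._connections[user]


def make_connection(user: str) -> dict:
    # Stand-in for a real connection; assumed to be replaced by a call that
    # builds a Trino connection with its own authentication object.
    return {"user": user}


connections = PerUserConnectionCache(make_connection)
alice_first = connections.get("alice")
alice_again = connections.get("alice")
bob = connections.get("bob")
```

Here `alice_first` and `alice_again` are the same object, while `bob` gets a separate one, so nothing is shared between users.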

cc: @lukasz-walkiewicz @s2lomon for your ideas from when working on the OAuth2 impl in the JDBC driver.

mdesmet commented 1 year ago

Both options are actually already implemented.

If you install keyring, the token is cached across connections but indeed per host, serving the use case of CLI applications like dbt (what you call MEMORY).

The default is per instance of trino.auth.OAuth2Authentication; as long as you don't share it across connections, it is per connection.

The only gap is that we currently don't expose the token. Note also that the token can change, and initially it won't be there until authentication is completed. So what we actually need is rather a subscription.

I think in the end it will come close to what the PR is providing: a hook to provide a token and to process a token change.
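To make the subscription idea concrete, here is a rough sketch of what such a hook could look like. It is not the client's API: `ObservableTokenStore`, `subscribe` and `set_token` are hypothetical names, and the only assumptions taken from the discussion are that the token starts out absent and may change later.

```python
from typing import Callable, List, Optional

TokenListener = Callable[[Optional[str]], None]


class ObservableTokenStore:
    """Hypothetical holder that notifies listeners whenever the token changes."""

    def __init__(self) -> None:
        self._token: Optional[str] = None  # absent until authentication completes
        self._listeners: List[TokenListener] = []

    @property
    def token(self) -> Optional[str]:
        return self._token

    def subscribe(self, listener: TokenListener) -> None:
        self._listeners.append(listener)

    def set_token(self, token: Optional[str]) -> None:
        # Only notify on an actual change.
        if token == self._token:
            return
        self._token = token
        for listener in list(self._listeners):
            listener(token)


seen: List[Optional[str]] = []
store = ObservableTokenStore()
store.subscribe(seen.append)   # e.g. persist the token in a per-user store
store.set_token("token-1")
store.set_token("token-1")     # unchanged, so no notification
store.set_token("token-2")     # e.g. after a refresh
print(seen)  # ['token-1', 'token-2']
```

An application could register a listener that writes the token into its own per-user storage, which would cover the original request without touching private attributes.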

mdesmet commented 1 year ago

Related PR in Superset: https://github.com/apache/superset/issues/20300

mdesmet commented 1 year ago

cc: @leniartek