yanyongyu / githubkit

The modern, all-batteries-included GitHub SDK for Python, including rest api, graphql, webhooks, like octokit!
https://yanyongyu.github.io/githubkit/
MIT License
177 stars 25 forks source link

Excessive Token Exchange Operations in OAuth Implementation #44

Closed mnixry closed 2 months ago

mnixry commented 1 year ago

Description

There is an issue in the OAuth-related login implementation. After using OAuth to authenticate users, a token exchange operation is performed for each request. This is mainly caused by the OAuthWebAuth class being instantiated before every request.

Problems

In theory, we can refer to the AppAuth implementation and cache the Access Token after a single exchange to avoid multiple authentications. However, in the OAuthWebAuthStrategy implementation, the only unique parameter that can identify the user is the code attribute. Since code is a temporary credential, it is not suitable for use as a cache key.

Additionally, obtaining the unique user ID as a cache key after each Exchange process in the Auth class is not feasible. The Auth class is instantiated each time, and there is no suitable storage location for the obtained unique user ID. Moreover, due to the immutability of the ...Strategy class, we cannot write the acquired user credentials into the Strategy class.

Proposed Solutions

At present, we have two possible solutions:

  1. Refactor the OAuthAppAuthStrategy's as_web_user method. Complete the exchange process directly within this method. Pros: This approach is very intuitive. Cons: It relies on the github object, which goes against the design purpose.
@dataclass
class OAuthAppAuthStrategy(BaseAuthStrategy):
    """OAuth App Authentication"""

    client_id: str
    client_secret: str

    def as_web_user(
        self, github: "GitHubCore", code: str, redirect_uri: Optional[str] = None
    ) -> "OAuthWebAuthStrategy":
        exchange_auth = _OAuthWebFlowExchangeAuth(
            github, self.client_id, self.client_secret, code, redirect_uri
        )
        with github.get_sync_client() as client:
            client.auth = exchange_auth
            response = client.get("/user")
            response.raise_for_status()
        user_id = response.json()["id"]
        access_token, expire_time = exchange_auth.access_token_info
        refresh_token, refresh_token_expire_time = exchange_auth.refresh_token_info
        assert access_token is not None
        return OAuthWebAuthStrategy(
            self.client_id,
            self.client_secret,
            user_id,
            access_token,
            refresh_token,
        )

    def get_auth_flow(self, github: "GitHubCore") -> httpx.Auth:
        return httpx.BasicAuth(self.client_id, self.client_secret)
  1. Establish a multi-level caching in ...Auth class: create a code-to-userid cache mapping, and then create a userid-to-token cache mapping. When the code does not exist in the cache, perform the exchange process first, and write the userid:token cache mapping. If it exists, obtain the userid and then get the token. Also, add an API that can directly obtain the token auth through the userid in the cache. Pros: This approach is compatible with the existing code. Cons: The implementation and usage are more complex and hard to understand.

Please feel free to comment about this issue.

yanyongyu commented 1 year ago

For the second solution, the code is one-time-use ticket. The application may not use the code twice to get the user session. Instead, we may need to expose a new method to allow us to get current user's access_token. This may be similar to what https://github.com/octokit/auth-oauth-user.js does.