jupyterhub / oauthenticator

OAuth + JupyterHub Authenticator = OAuthenticator
https://oauthenticator.readthedocs.io
BSD 3-Clause "New" or "Revised" License
414 stars 366 forks source link

Bitbucket auth broken #676

Closed huwf closed 1 year ago

huwf commented 1 year ago

Bug description

I think this might be a bug in Bitbucket rather than OAuthenticator, but I'm not sure, and it may be specific to my setup. The requests to get user information and workspace permissions get a 401 response, handled as a 500 in Jupyterhub when using BitbucketOAuthenticator.

This appears to be a problem with the way the Authorization header is being sent/handled, because including the token in the querystring appears to work. Setting OAUTH2_USERDATA_REQUEST_TYPE=url in the environment makes it work for token_to_user but _fetch_user_teams still breaks

Expected behaviour

Setting Authorization header should successfully authenticate with bitbucket cloud

Actual behaviour

401 status returned

How to reproduce

  1. Setup OAuthentiator as the setup below
  2. Try to login with Bitbucket

    Your personal set up

    jupyterhub 4.0.2 oauthenticator 16.0.5

c.JupyterHub.authenticator_class = "bitbucket"
c.OAuthenticator.oauth_callback_url = "http://localhost/hub/oauth_callback"
c.OAuthenticator.client_id = "CLIENT_ID"
c.OAuthenticator.client_secret = "SECRET"
c.OAuthenticator.allowed_users = ["user1", "user2"]
c.BitbucketOAuthenticator.allowed_teams = {"teamname"}
c.SubBitbucketAuthenticator.scope = ["email", "account", "repository", "workspace"]

Workaround

Currently I have setup OAUTH2_USERDATA_REQUEST_TYPE=url in the environment to authenticate on the user API, and am subclassing the _fetch_user_teams method as below to restrict the workspace.

I feel it might be better solved by putting userdata_token_method into httpfetch if appropriate and in subclasses setting self.userdata_token_method = url in the bitbucket class.

class SubBitbucketOAuthenticator(oauthenticator.BitbucketOAuthenticator):
    async def _fetch_user_teams(self, access_token, token_type):
        """
        Get user's team memberships via bitbucket's API.
        """
        try:
            app_log.debug("Trying usual bitbucket OAuth method...")
            return await super()._fetch_user_teams(access_token, token_type)
        except HTTPClientError as e:
            if e.code != 401:
                raise
        app_log.debug("Trying to authenticate with with access_token in the url")
        headers = self.build_userdata_request_headers(access_token, token_type)
        del headers["Authorization"]

        next_page = url_concat(
            "https://api.bitbucket.org/2.0/workspaces", {"access_token": access_token}
        )

        user_teams = set()
        while next_page:
            resp_json = await self.httpfetch(next_page, method="GET", headers=headers)
            next_page = resp_json.get('next', None)
            user_teams |= {entry["name"] for entry in resp_json["values"]}
        return user_teams
welcome[bot] commented 1 year ago

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other other community members to contribute more effectively. welcome You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

consideRatio commented 1 year ago

If you install oauthenticator 15.1.0, do you end up with issues then as well? I'm trying to conclude if this is a regression in 16.0.0 that did a lot of refactoring or a bug that has been around since before.

huwf commented 1 year ago

Using 15.1.0 I don't get an error until _check_membership_allowed_teams, where I get a 403 from JH. Not sure if this is correct, as far as I know I should be in the team I am checking for. This might indicate that both calls are giving errors, because (I think) 16.0.5 team membership is only checked for if user authentication fails. However, I'd expect a KeyError here so I don't know. Logs below

[I 2023-08-22 11:17:48.043 JupyterHub oauth2:102] OAuth redirect: 'http://localhost/hub/oauth_callback'
[D 2023-08-22 11:17:48.043 JupyterHub base:587] Setting cookie oauthenticator-state: {'httponly': True, 'expires_days': 1}
[W 2023-08-22 11:17:49.721 JupyterHub bitbucket:100] huwf not in team allowed list of users
[W 2023-08-22 11:17:49.722 JupyterHub base:843] Failed login for unknown user
[W 2023-08-22 11:17:49.722 JupyterHub web:1869] 403 GET /hub/oauth_callback?code=CODE&state=STATE: Sorry, you are not currently authorized to use this hub. Please contact the hub administrator.
[D 2023-08-22 11:17:49.722 JupyterHub base:1371] No template for 403
[W 2023-08-22 11:17:49.735 JupyterHub log:191] 403 GET /hub/oauth_callback?code=[secret]&state=[secret] 1098.42ms
consideRatio commented 1 year ago

EDIT: this comment is irrelevant, I looked at the wrong REST API, it should have been https://developer.atlassian.com/cloud/bitbucket/rest/api-group-workspaces/#api-workspaces-get.


Looking at the request about teams, I see it documented here.

https://developer.atlassian.com/cloud/bitbucket/rest/api-group-workspaces/#api-user-permissions-workspaces-get

Nowhere there is a mention of being able to pass a query parameter like role=member directly. That makes me think at least one issue relates to that.

https://github.com/jupyterhub/oauthenticator/blob/f38a2d45fc30ea8301084825b0e619959b447048/oauthenticator/bitbucket.py#L61-L61

In [1]: from tornado.httputil import url_concat

In [2]: url_concat("https://api.bitbucket.org/2.0/workspaces", {"role": "member"})
Out[2]: 'https://api.bitbucket.org/2.0/workspaces?role=member'
consideRatio commented 1 year ago

I didn't do a very thorough check about this, but I can't tell whats wrong.

In the logs below, things seems to work out correctly with authentication, but the user is simply denied by not being authorized by being a member of a team etc.

[I 2023-08-22 11:17:48.043 JupyterHub oauth2:102] OAuth redirect: 'http://localhost/hub/oauth_callback'
[D 2023-08-22 11:17:48.043 JupyterHub base:587] Setting cookie oauthenticator-state: {'httponly': True, 'expires_days': 1}
[W 2023-08-22 11:17:49.721 JupyterHub bitbucket:100] huwf not in team allowed list of users
[W 2023-08-22 11:17:49.722 JupyterHub base:843] Failed login for unknown user
[W 2023-08-22 11:17:49.722 JupyterHub web:1869] 403 GET /hub/oauth_callback?code=CODE&state=STATE: Sorry, you are not currently authorized to use this hub. Please contact the hub administrator.
[D 2023-08-22 11:17:49.722 JupyterHub base:1371] No template for 403
[W 2023-08-22 11:17:49.735 JupyterHub log:191] 403 GET /hub/oauth_callback?code=[secret]&state=[secret] 1098.42ms

Note that I think the team names may be case sensitive and such, I'm not sure if those names really should be used by this authenticator - the names must be unique and I'm not sure the teams are unique. What kind of name did you provide?

The name key is read, but there is uuid and slug also where uuid certainly will be unique, but is the key's slug and/or name?

Below is a snippet from the REST API Docs describing the response to https://api.bitbucket.org/2.0/workspaces.

      "uuid": "<string>",
      "name": "<string>",
      "slug": "<string>",
huwf commented 1 year ago

Interesting, I think I used the slug, which I guess could be why I got 403 here, will have to check again on my work PC tomorrow. I think the slug would have to be unique, as it's part of the URL and it's referred to as an ID, see https://support.atlassian.com/bitbucket-cloud/docs/change-a-workspace-id/ but the name seems to be more of a display name. Can test that tomorrow.

There is actually a different flow in 15.1.0 compared to 16.0.5. In 15.1.0, it looks like if allowed_teams is specified, then the user has to be a member of that team as well as be in the list of allowed users.

In 16.0.5, if the user authentication is ok then team membership is never checked as it returns the value of super().check_allowed. The docstring describes this, so I assume the code is deliberate but it is different.

I had initially thought that the flow of 15.1.0 was how it would work, and team membership could be extra redundancy but being a required username OR in a team makes just as much sense.

huwf commented 1 year ago

Yes, to confirm, I successfully logged in with 15.1.0 when I set the team name to display name, so 15.1.0 is working correctly.

However, the name does not appear to be unique which suggests you should not use it in the authenticator. On my personal account, I was able to create a workspace with the same name as my work company workspace. I'm not sure if it matters or not, because the access token should be tied to the account which has the particular team and they would presumably not want to create multiple workspaces with the same display name.

However, the slug is indeed part of the URL and I was not allowed to create a workspace with the same name, so I think that it would be reasonable to use that instead. I think using the slug is more intuitive anyway, but that might just be my preference.

consideRatio commented 1 year ago

Okay so there is a clear security related bug with bitbucket oauthenticator about using a name that isn't unique. That should be fixed - its not acceptable. If the non-unique "display name" is used, anyone can proove membership to such display name and get authorized access to your jupyterhub.

Besides that, is there another bug when using oauthenticator 16.0.x about making requests to check membership of workspaces etc?

huwf commented 1 year ago

Yes, in 16.0.x requests to both /2.0/user /2.0/workspaces gets an empty response with 401 code with default setup.

Workaround for both is to putting the access token in the URL, which can be done by env variable for /2.0/user but needed a subclass for /2.0/workspace

consideRatio commented 1 year ago

Okay so we have two bugs to fix:

  1. security bug, require uniquness of name specified in allowed_teams config
  2. functional bug, make requests to /2.0/user and /2.0/workspaces function

With regards to the securit bug, the fix is to switch to slug instead of name.

With regards to the functional bug, I'm not sure, but I don't want us to pass the access token in a query parameter. Also, this isn't recommended by bitbucket themselves who sais they prefer the header option.

Since they suggest they should support passing info in the header, and since 15.1.0 also passes it via the header, I don't understand what has broken in 16.0.x compared to 15.1.0.

If you can figure out whats wrong and patch it while still using a header in the GET request, that would be great and a PR I'd see myself merging with a calm mind.

consideRatio commented 1 year ago

Yes, in 16.0.x requests to both /2.0/user /2.0/workspaces gets an empty response with 401 code with default setup.

I think the key debugging of relevance could be to understand how these requests are made in 15.1.0 vs 16.0.x, both go to the same URL, and in my understanding both were using the same headers etc - so whats the difference? Is the requests different?

huwf commented 1 year ago

I will have a look at the details later on. The difference I see is that in 15.1.0, the headers hardcodes Bearer in "Authorization": "Bearer {}".format(access_token), whereas in 16.0 it is "Authorization": f"{token_type} {access_token}" so my bet is that bitbucket gives a different token_type.

Another thing to think about is what I said about the different flow between 15.1 and 16.0, that the team membership is not checked if the user is in the list of allowed users. If it is deliberate, I think that should be documented more specifically that if set a user may be an allowed user OR a member of a team/workspace.

consideRatio commented 1 year ago

I think maybe it already is documented to some degree at least, I tried documenting this in the 16.0.0 changelog at least as a breaking change.

consideRatio commented 1 year ago

It appears that bitbucket returns bearer as token type, but requires it to be Bearer when used.

manics commented 1 year ago

The RFC says it should be Bearer https://datatracker.ietf.org/doc/html/rfc6750#section-2.1

consideRatio commented 1 year ago

@manics apparently it should be case-insentivite anyhow, see the comment in https://jira.atlassian.com/browse/BCLOUD-14135?focusedCommentId=3315945&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-3315945

consideRatio commented 1 year ago

Anyhow, they won't fix it, so, we should make it exactly "Bearer".

consideRatio commented 1 year ago

A third UX bug is that its called allowed_teams, when its really allowed_workspaces. Bitbucket has no concept of "teams" it seems, and we compare against the /workspaces API.

And, we don't configure BitBucketOAuthenticator's default scope to something that makes things work out of the box either...

manics commented 1 year ago

Anyhow, they won't fix it, so, we should make it exactly "Bearer".

I agree. Is there a case where the type isn't bearer for BitBucket? If not I'd program defensively and throw an exception if token_type.lower() != "bearer".