python-gitlab / python-gitlab

A python wrapper for the GitLab API.
https://python-gitlab.readthedocs.io
GNU Lesser General Public License v3.0

netrc auth overrides OAuth bearer token header #2425

Closed MHLut closed 1 year ago

MHLut commented 1 year ago

Description of the problem, including code/CLI snippet

When using the OAuth integration while having a .netrc file on the filesystem, the .netrc authentication overrides the OAuth bearer token.

Use the following code to trigger an error:

import gitlab

# settings.GITLAB_BASE_URL comes from the reporter's own project configuration.
user_token = "long oauth token here"
client = gitlab.Gitlab(url=settings.GITLAB_BASE_URL, oauth_token=user_token)

# Trigger an HTTP 401 response (GitlabAuthenticationError):
client.auth()

(Any call that goes through the client's http_xxx() methods should fail; auth() is just the easiest one to try.)

Expected Behavior

Every request to the GitLab API via the Gitlab object should authenticate using the OAuth token provided by the user when initializing the client. It should use the Authorization header containing the OAuth token as a bearer.

Actual Behavior

Instead of using the bearer token from the passed headers kwarg, requests falls back to basic authentication using credentials from the netrc file.

Troubleshooting

So far, I've found out the following:

See also this relevant quote from Requests netrc documentation:

If no authentication method is given with the auth argument, Requests will attempt to get the authentication credentials for the URL’s hostname from the user’s netrc file. The netrc file overrides raw HTTP authentication headers set with headers=.

If credentials for the hostname are found, the request is sent with HTTP Basic Auth.

This might be fixed by explicitly passing an auth kwarg whenever python-gitlab makes a call through requests.
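A minimal sketch of that idea, not the library's implementation. `BearerAuth` is a hypothetical class, `gitlab.example.com` is a placeholder host, and the `NETRC` environment variable is pointed at a throwaway file so the behavior can be checked offline. The request is only prepared, never sent.

```python
import os
import tempfile

import requests
from requests.auth import AuthBase


class BearerAuth(AuthBase):
    """Hypothetical auth class: attaches an OAuth token as a bearer header."""

    def __init__(self, token: str) -> None:
        self.token = token

    def __call__(self, request: requests.PreparedRequest) -> requests.PreparedRequest:
        request.headers["Authorization"] = f"Bearer {self.token}"
        return request


# Throwaway netrc file holding credentials for the placeholder host.
with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as handle:
    handle.write("machine gitlab.example.com login someone password secret\n")
os.environ["NETRC"] = handle.name

session = requests.Session()
request = requests.Request(
    "GET",
    "https://gitlab.example.com/api/v4/user",
    auth=BearerAuth("example-token"),
)

# Build the request without sending it. Because auth= is set explicitly,
# requests skips the netrc lookup for this request.
prepared = session.prepare_request(request)
print(prepared.headers["Authorization"])
```

Since the header is written by the auth callable rather than passed through headers=, the netrc credentials never get a chance to replace it.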

Specifications

nejch commented 1 year ago

Hi @MHLut, thanks for the report. This behavior is specific to requests, in the same way that defining the https_proxy/no_proxy environment variables affects the requests being made. We're currently working towards a more backend-agnostic codebase so that we can support #1025, likely by switching from requests to httpx. So I'm a bit reluctant to add more low-level requests code, considering there is an easy bypass:

requests sessions take a trust_env param to avoid this, so currently, you should be able to bypass this with a custom Session or simply with the following:

import gitlab

user_token = "long oauth token here"
client = gitlab.Gitlab(url=settings.GITLAB_BASE_URL, oauth_token=user_token)
client.session.trust_env = False
client.auth()

So I'd say it might just be a case of documenting that along with our advanced use case docs or FAQ (we should definitely warn the user about this gotcha and the config required). Once we have our HTTP backend code, we could potentially just pass kwargs to the backend (trust_env in this case) and let the backend handle it.
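The effect of trust_env can be checked offline with plain requests. The sketch below is an illustration under stated assumptions: `authorization_header` is a hypothetical helper, `gitlab.example.com` is a placeholder host, and `NETRC` points at a throwaway file. It prepares the same request twice and compares the Authorization header that would be sent.

```python
import os
import tempfile

import requests


def authorization_header(trust_env: bool) -> str:
    """Return the Authorization header requests would send, without sending."""
    session = requests.Session()
    session.trust_env = trust_env
    request = requests.Request(
        "GET",
        "https://gitlab.example.com/api/v4/user",
        # The token is passed as a raw header, as in the report above.
        headers={"Authorization": "Bearer example-token"},
    )
    return session.prepare_request(request).headers["Authorization"]


# Throwaway netrc file holding credentials for the placeholder host.
with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as handle:
    handle.write("machine gitlab.example.com login someone password secret\n")
os.environ["NETRC"] = handle.name

with_env = authorization_header(trust_env=True)
without_env = authorization_header(trust_env=False)

# Print only the auth scheme of each header.
print(with_env.split()[0], without_env.split()[0])
```

Note that trust_env = False also makes the session ignore other environment-derived settings, such as the proxy variables mentioned above, so it is a broader switch than disabling netrc alone.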

MHLut commented 1 year ago

@nejch Thank you for your response! Using client.session.trust_env = False added the bearer token to the request.

I do get an HTTP 401 invalid_token error now, but that might be unrelated to this issue.

nejch commented 1 year ago

Thanks @MHLut, keep us updated and maybe use gl.enable_debug() to see what headers are being sent to the server.

nejch commented 1 year ago

I haven't heard anything back, @MHLut, so I assume this works well. I've added some docs that should clarify this.

MHLut commented 1 year ago

@nejch Apologies, the proposed fix didn't work back in December, and due to time constraints, I had to go with a workaround.

Since the documentation fix seems more elaborate than the one mentioned above, I'll have to try that still.

uda commented 1 year ago

I encountered this too, and it took me some time to discover that netrc was the cause. Since it happened in the final step of a CI job, I opted for removing ~/.netrc before that step, but there is a better solution.

According to the requests documentation, the only way to ignore netrc is to pass the auth param, which is quite simple, so instead of setting the headers value directly, we should have a few Auth classes based on the parameter passed just like we use the HTTPBasicAuth for username + password.

Update: I started working on a suggestion and would like input on how to structure the solution, since there is the _backends shim, yet we currently rely only on requests anyway and have direct references to it.
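One possible shape for "a few Auth classes based on the parameter passed", as a rough sketch rather than the structure the project settled on. `HeaderTokenAuth` and `select_auth` are hypothetical names; the parameter names mirror the ones discussed in this thread, and `PRIVATE-TOKEN` and `JOB-TOKEN` are the header names used by the GitLab API.

```python
from typing import Optional

import requests
from requests.auth import AuthBase, HTTPBasicAuth

URL = "https://gitlab.example.com/api/v4/user"  # placeholder host


class HeaderTokenAuth(AuthBase):
    """Hypothetical auth class: sets a single header to a fixed value."""

    def __init__(self, header: str, value: str) -> None:
        self.header = header
        self.value = value

    def __call__(self, request: requests.PreparedRequest) -> requests.PreparedRequest:
        request.headers[self.header] = self.value
        return request


def select_auth(
    private_token: Optional[str] = None,
    oauth_token: Optional[str] = None,
    job_token: Optional[str] = None,
    http_username: Optional[str] = None,
    http_password: Optional[str] = None,
) -> Optional[AuthBase]:
    """Pick an auth object from whichever credential was supplied."""
    if private_token:
        return HeaderTokenAuth("PRIVATE-TOKEN", private_token)
    if oauth_token:
        return HeaderTokenAuth("Authorization", f"Bearer {oauth_token}")
    if job_token:
        return HeaderTokenAuth("JOB-TOKEN", job_token)
    if http_username and http_password:
        return HTTPBasicAuth(http_username, http_password)
    return None


# Prepare (but do not send) a request that uses the selected auth object.
prepared = requests.Request(
    "GET", URL, auth=select_auth(oauth_token="example-token")
).prepare()
print(prepared.headers["Authorization"])
```

Routing every credential type through auth= keeps the choice in one place, which may make it easier to move behind a backend shim later.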