Labelbox / labelbox-python

The data factory for next gen AI
https://labelbox.com
Apache License 2.0
111 stars 58 forks source link

Labelbox Client usage of `requests` Library doesn't support the `verify` argument so it ignores `REQUESTS_CA_BUNDLE` Environment Variable #1809

Open tommypkeane-gehc opened 6 days ago

tommypkeane-gehc commented 6 days ago

Describe the bug

In the execute() method of the Client class, the low-level requests.Request class is instantiated to create a requests.PreparedRequest, but this bypasses the requests.merge_environment_settings() method.

This means that Client.execute() will not collect and use the REQUESTS_CA_BUNDLE environment variable if present, and will result in SSL errors.

To reproduce

  1. Use method client_obj = labelbox.Client(api_key="...")
  2. Pass parameters client_obj.execute(...)
  3. See error: labelbox.exceptions.NetworkError: HTTPSConnectionPool(host='api.labelbox.com', port=443): Max retries exceeded with url: /_gql (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1006)')))

Expected behavior

The REQUESTS_CA_BUNDLE if set properly should allow the SSL certificate chain to be verified through a proxy or other deferred networking path where a custom certificate chain is needed.

Screenshots / Videos

N/A

Environment (please complete the following information

Additional context

N/A

Proposed Fix

In the Client.execute(...) method at the lines:

https://github.com/Labelbox/labelbox-python/blob/45109d855c32796a309eae23a9dc53750c3ae9a5/libs/labelbox/src/labelbox/client.py#L223-L225

the update could be:

            prepped: requests.PreparedRequest = request.prepare()

            response = self._connection.send(
                prepped,
                timeout=timeout,
                verify=os.getenv("REQUESTS_CA_BUNDLE", None),
            )

However, a more robust fix could be like this, though this wouldn't allow direct pass-through setting of the values, it would only rely on the Environment Variables:

            prepped: requests.PreparedRequest = request.prepare()

            req_settings: dict = self._connection.merge_environment_settings(
                url=None,
                proxies=None,
                stream=None,
                verify=None,
                cert=None,
            )

            response = self._connection.send(
                prepped,
                timeout=timeout,
                **req_settings,
            )
Gabefire commented 6 days ago

Hey @tommypkeane-gehc I am taking a look at this issue this seems similar to your other issue

tommypkeane-gehc commented 5 days ago

Hey @tommypkeane-gehc I am taking a look at this issue this seems similar to your other issue

Awesome, thanks!

In case it helps, I made my suggested edit to my local install of the labelbox package and it did resolve my SSL issues, so I can confirm that as long as the verify parameter of a requests.Session is set appropriately either by being exposed explicitly or by relying on the standardized REQUESTS_CA_BUNDLE environment variable, then the issue is resolved.

For reference, I had tracked this down from the requests package per the method used here: https://github.com/psf/requests/blob/0e322af87745eff34caffe4df68456ebc20d9068/src/requests/sessions.py#L562-L591