In our Airflow worker pod, we set the environment variable REQUESTS_CA_BUNDLE. With it set, the SAS Studio Flow operator fails to honor the extra field of the Airflow Connection, {"ssl_certificate_verification": false}, which is supposed to skip certificate verification.
As the log below shows, the hook confirms that TLS verification is turned off and even obtains the access token from SAS Logon (Get oauth token), but the subsequent call to the SAS Studio REST endpoint fails.
[2024-06-26, 17:08:58 UTC] {sas.py:52} INFO - TLS verification is turned off
[2024-06-26, 17:08:58 UTC] {sas.py:62} INFO - Creating session for connection named sas_default to host https://d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com/
[2024-06-26, 17:08:58 UTC] {sas.py:82} INFO - Get oauth token (see README if this crashes)
[2024-06-26, 17:08:59 UTC] {sas_studioflow.py:90} INFO - Generate code for Studio Flow: /Users/miadmin/TestFlow.flw
[2024-06-26, 17:08:59 UTC] {logging_mixin.py:188} INFO - Code Generation for Studio Flow without Compute session
[2024-06-26, 17:08:59 UTC] {taskinstance.py:441} ▼ Post task execution logs
[2024-06-26, 17:08:59 UTC] {taskinstance.py:2905} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 404, in _make_request
self._validate_conn(conn)
File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1060, in _validate_conn
conn.connect()
File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connection.py", line 419, in connect
self.sock = ssl_wrap_socket(
File "/home/sas/.local/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "/home/sas/.local/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "/usr/lib64/python3.8/ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "/usr/lib64/python3.8/ssl.py", line 1040, in _create
self.do_handshake()
File "/usr/lib64/python3.8/ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sas/.local/lib/python3.8/site-packages/requests/adapters.py", line 564, in send
resp = conn.urlopen(
File "/home/sas/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 801, in urlopen
retries = retries.increment(
File "/home/sas/.local/lib/python3.8/site-packages/urllib3/util/retry.py", line 594, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com', port=443): Max retries exceeded with url: /studioDevelopment/code (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sas/.local/lib/python3.8/site-packages/sas_airflow_provider/operators/sas_studioflow.py", line 91, in execute
code = _generate_flow_code(
File "/home/sas/.local/lib/python3.8/site-packages/sas_airflow_provider/operators/sas_studioflow.py", line 199, in _generate_flow_code
response = session.post(uri, json=req)
File "/home/sas/.local/lib/python3.8/site-packages/sas_airflow_provider/hooks/sas.py", line 112, in <lambda>
session.post = lambda *args, **kwargs: requests.Session.post( # type: ignore
File "/home/sas/.local/lib/python3.8/site-packages/requests/sessions.py", line 637, in post
return self.request("POST", url, data=data, json=json, **kwargs)
File "/home/sas/.local/lib/python3.8/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/home/sas/.local/lib/python3.8/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/home/sas/.local/lib/python3.8/site-packages/requests/adapters.py", line 595, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com', port=443): Max retries exceeded with url: /studioDevelopment/code (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sas/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 465, in _execute_task
result = _execute_callable(context=context, **execute_callable_kwargs)
File "/home/sas/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 432, in _execute_callable
return execute_callable(context=context, **execute_callable_kwargs)
File "/home/sas/.local/lib/python3.8/site-packages/airflow/models/baseoperator.py", line 400, in wrapper
return func(self, *args, **kwargs)
File "/home/sas/.local/lib/python3.8/site-packages/sas_airflow_provider/operators/sas_studioflow.py", line 124, in execute
raise AirflowException(f"SASStudioFlowOperator error: {str(e)}")
airflow.exceptions.AirflowException: SASStudioFlowOperator error: HTTPSConnectionPool(host='d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com', port=443): Max retries exceeded with url: /studioDevelopment/code (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)')))
[2024-06-26, 17:08:59 UTC] {taskinstance.py:1206} INFO - Marking task as FAILED. dag_id=MySASStudioFlowOperatorDAG, task_id=sas_studio_test_flow, run_id=manual__2024-06-26T17:08:55.695486+00:00, execution_date=20240626T170855, start_date=20240626T170858, end_date=20240626T170859
[2024-06-26, 17:08:59 UTC] {standard_task_runner.py:110} ERROR - Failed to execute job 6 for task sas_studio_test_flow (SASStudioFlowOperator error: HTTPSConnectionPool(host='d21670.ingress-nginx.miadmin-01-m1.irm.sashq-d.openstack.sas.com', port=443): Max retries exceeded with url: /studioDevelopment/code (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)'))); 14161)
[2024-06-26, 17:08:59 UTC] {local_task_job_runner.py:240} INFO - Task exited with return code 1
[2024-06-26, 17:08:59 UTC] {taskinstance.py:3498} INFO - 0 downstream tasks scheduled from follow-on schedule check
[2024-06-26, 17:08:59 UTC] {local_task_job_runner.py:222} ▲▲▲ Log group end
Root Cause
In the 1st REST call (the OAuth token request), the hook explicitly passes the boolean verify value to the requests.post function. It works as expected.
https://github.com/sassoftware/sas-airflow-provider/blob/b5527629e4592b0ba85abf6b5f77da2f058b4d06/src/sas_airflow_provider/hooks/sas.py#L83-L89
In the 2nd REST call, it does not pass verify to the requests.* function but relies on Session.verify instead.
https://github.com/sassoftware/sas-airflow-provider/blob/b5527629e4592b0ba85abf6b5f77da2f058b4d06/src/sas_airflow_provider/hooks/sas.py#L103-L121
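The difference between the two calls can be illustrated with a simplified model of how requests appears to choose the effective verify value. This is a sketch written for this issue, not the library's code: resolve_verify is a hypothetical name, and the CA bundle path is made up. It assumes the precedence is per-call verify first, then REQUESTS_CA_BUNDLE / CURL_CA_BUNDLE (only when the per-call value is None or True and trust_env is on), then Session.verify.

```python
import os


def resolve_verify(request_verify, session_verify, trust_env=True, environ=None):
    """Simplified model (an assumption, not requests' source) of the
    effective `verify` value for one request sent through a Session."""
    environ = os.environ if environ is None else environ
    verify = request_verify

    # The environment is consulted only when the call itself did not
    # disable verification or supply its own CA bundle.
    if trust_env and (verify is True or verify is None):
        verify = (
            environ.get("REQUESTS_CA_BUNDLE")
            or environ.get("CURL_CA_BUNDLE")
            or verify
        )

    # Session.verify is only a fallback for a still-unset value.
    if verify is None:
        verify = session_verify
    return verify


env = {"REQUESTS_CA_BUNDLE": "/etc/ssl/ca.pem"}  # hypothetical path

# 1st call style: verify=False passed on the call, so it is honored.
print(resolve_verify(False, session_verify=False, environ=env))  # False

# 2nd call style: only Session.verify=False, so the env var wins.
print(resolve_verify(None, session_verify=False, environ=env))  # /etc/ssl/ca.pem
```

Under this model, the second call verifies the server against the bundle named by REQUESTS_CA_BUNDLE even though Session.verify is False, which matches the CERTIFICATE_VERIFY_FAILED error in the log.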
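This has been a known bug in the Python requests library for years: when REQUESTS_CA_BUNDLE is set and verify is not passed on the individual call, the environment variable overrides Session.verify. People still lose hours to this override. It would be better to fix it in our code, or at least to make the two REST calls behave consistently (either both fail or both succeed).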
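One possible way to make the two calls consistent is to inject verify into every request the session sends, so the per-call value is always explicit. The following is only a sketch: pin_verify is a hypothetical helper, not part of sas-airflow-provider, and it assumes the session's get/post/etc. helpers all route through session.request(method, url, **kwargs), as requests.Session.post does in the traceback above.

```python
def pin_verify(session, verify):
    """Make every request sent through `session` carry an explicit `verify`.

    `session` is expected to be a requests.Session, or any object with a
    compatible `request(method, url, **kwargs)` method.
    """
    original_request = session.request

    def request_with_verify(method, url, **kwargs):
        # Fill in `verify` only when the caller did not pass one, so an
        # explicit per-call value still takes precedence.
        kwargs.setdefault("verify", verify)
        return original_request(method, url, **kwargs)

    # Shadow the bound method on this instance only; other sessions
    # are left untouched.
    session.request = request_with_verify
    session.verify = verify  # keep the session attribute consistent
    return session
```

In the hook, this could be applied right after the session is created, for example pin_verify(session, False) when ssl_certificate_verification is false. Another option, if ignoring all environment settings is acceptable, is to set session.trust_env = False, although that also disables proxy settings taken from the environment.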