boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

Problem with SSL Validation #2686

Closed RobertP09 closed 3 years ago

RobertP09 commented 3 years ago

Describe the bug: After deploying to prod, I started receiving an "SSL validation failed" error. boto3 is pinned at 1.15.2.

Steps to reproduce: I don't have steps to reproduce. It only occurs when I push my repo to production.

Expected behavior: Previously, the device reached the endpoint https://data.iot.us-west-2.amazonaws.com/things/xxx/shadow and retrieved the device information.

RobertP09 commented 3 years ago

zenbath-chalice - ERROR - Caught exception for <function change_settings at 0x7fd9c460fca0>
Traceback (most recent call last):
  File "/var/task/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/var/task/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/var/task/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/var/task/urllib3/connection.py", line 411, in connect
    self.sock = ssl_wrap_socket(
  File "/var/task/urllib3/util/ssl.py", line 428, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/var/task/urllib3/util/ssl.py", line 472, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/var/lang/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/var/lang/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/var/lang/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1124)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/task/botocore/httpsession.py", line 254, in send
    urllib_response = conn.urlopen(
  File "/var/task/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/var/task/urllib3/util/retry.py", line 506, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/var/task/urllib3/packages/six.py", line 734, in reraise
    raise value.with_traceback(tb)
  File "/var/task/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/var/task/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/var/task/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/var/task/urllib3/connection.py", line 411, in connect
    self.sock = ssl_wrap_socket(
  File "/var/task/urllib3/util/ssl.py", line 428, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/var/task/urllib3/util/ssl.py", line 472, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/var/lang/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/var/lang/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/var/lang/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1124)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/task/chalice/app.py", line 1135, in _get_view_function_response
    response = view_function(function_args)
  File "/var/task/chalicelib/utils/auth.py", line 124, in func_wrapper
    res = func(kwargs)
  File "/var/task/chalicelib/utils/auth.py", line 48, in validate_invoke
    return func(*args, *kwargs)
  File "/var/task/chalicelib/controllers/device.py", line 227, in change_settings
    return device.change_shadow(device_blueprint.current_request.json_body)
  File "/var/task/shine/services/device.py", line 86, in change_shadow
    response = iot_data_plane.update_thing_shadow(
  File "/var/task/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/task/thundra/integrations/modules/botocore.py", line 27, in _wrapper
    return INTEGRATIONS['default'].run_and_trace(
  File "/var/task/thundra/integrations/base_integration.py", line 68, in run_and_trace
    raise exception
  File "/var/task/thundra/integrations/base_integration.py", line 40, in run_and_trace
    response = self.actual_call(wrapped, args, kwargs)
  File "/var/task/thundra/integrations/base_integration.py", line 73, in actual_call
    return wrapped(args, kwargs)
  File "/var/task/botocore/client.py", line 662, in _make_api_call
    http, parsed_response = self._make_request(
  File "/var/task/botocore/client.py", line 682, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/var/task/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/var/task/botocore/endpoint.py", line 136, in _send_request
    while self._needs_retry(attempts, operation_model, request_dict,
  File "/var/task/botocore/endpoint.py", line 253, in _needs_retry
    responses = self._event_emitter.emit(
  File "/var/task/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, kwargs)
  File "/var/task/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/var/task/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/var/task/botocore/retryhandler.py", line 183, in call
    if self._checker(attempts, response, caught_exception):
  File "/var/task/botocore/retryhandler.py", line 250, in call
    should_retry = self._should_retry(attempt_number, response,
  File "/var/task/botocore/retryhandler.py", line 277, in _should_retry
    return self._checker(attempt_number, response, caught_exception)
  File "/var/task/botocore/retryhandler.py", line 316, in call
    checker_response = checker(attempt_number, response,
  File "/var/task/botocore/retryhandler.py", line 222, in call
    return self._check_caught_exception(
  File "/var/task/botocore/retryhandler.py", line 359, in _check_caught_exception
    raise caught_exception
  File "/var/task/botocore/endpoint.py", line 200, in _do_get_response
    http_response = self._send(request)
  File "/var/task/botocore/endpoint.py", line 269, in _send
    return self.http_session.send(request)
  File "/var/task/botocore/httpsession.py", line 281, in send
    raise SSLError(endpoint_url=request.url, error=e)
botocore.exceptions.SSLError: SSL validation failed for https://data.iot.us-west-2.amazonaws.com/things/xxx/shadow [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1124)

rabryan commented 3 years ago

Having the same exact issue in us-east-2

swetashre commented 3 years ago

@robert9111 - Thank you for your post. I would need some more information in order to debug the issue.

dbasedow commented 3 years ago

I had the same issue in eu-central-1. AWS support advised switching to the ATS endpoint: https://aws.amazon.com/blogs/iot/aws-iot-core-ats-endpoints/ I think there is a different underlying reason this issue suddenly popped up, but for now this works.

rabryan commented 3 years ago

@dbasedow that works for me as well. For anyone else, to use the ATS endpoint you need to explicitly specify it when you create your iot-data client:

boto3.client('iot-data', endpoint_url=IOT_DATA_EP)

where IOT_DATA_EP is the output of this command (with https:// prepended)

aws iot describe-endpoint --endpoint-type iot:Data-ATS

RobertP09 commented 3 years ago

The above comment DID solve the issue. Appreciate it. I'm still unsure what was causing the issue.

mretallack commented 3 years ago

Hi, I have had this issue as well.

In our case, we also use Requests, which pulls in certifi.

https://github.com/certifi/python-certifi/commits/master

certifi was updated from 2020.11.08 to 2020.12.05, and in that update the "VeriSign Class 3 Public Primary Certification Authority - G5" root was removed. For us this means that the certificate for "data.iot.eu-west-2.amazonaws.com" no longer validates.

2020.11.08: https://github.com/certifi/python-certifi/blob/015cba9d2492a4cddaf5efe40666c18a2b259c93/certifi/cacert.pem

2020.12.05: https://github.com/certifi/python-certifi/blob/45a64658872a94a83c4b70fce02a96f0f29895e6/certifi/cacert.pem

So for now we have had to roll back python-certifi until a correct solution is found.

Not sure what the real/correct solution would be (other than using "endpoint_url=IOT_DATA_EP" as above).
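To check whether a given environment is affected by the removal described above, something like the following rough sketch could be used. It assumes that certifi's cacert.pem annotates each root with a "# Label:" comment line; the helper names bundle_has_root and installed_certifi_has_root are made up for illustration and are not part of certifi.

```python
# Root named in the comment above; the label text is assumed to match
# the "# Label:" comment lines found in certifi's cacert.pem.
VERISIGN_G5 = "VeriSign Class 3 Public Primary Certification Authority - G5"


def bundle_has_root(pem_text, label=VERISIGN_G5):
    """Return True if the CA bundle text has a '# Label:' line naming `label`."""
    for line in pem_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# Label:") and label in stripped:
            return True
    return False


def installed_certifi_has_root(label=VERISIGN_G5):
    """Check the certifi bundle installed in this environment (hypothetical helper)."""
    import certifi  # imported lazily so bundle_has_root works without certifi

    # certifi.where() returns the filesystem path of the bundled cacert.pem.
    with open(certifi.where(), encoding="utf-8") as handle:
        return bundle_has_root(handle.read(), label)
```

Calling installed_certifi_has_root() in the deployed environment would be expected to return False on certifi releases where that root is no longer shipped, which matches the "unable to get local issuer certificate" error when the legacy endpoint is used.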

johnp789 commented 3 years ago

This is one way to discover the ATS endpoint within an application, assuming IOT_REGION has been set.

import os
import boto3

def get_endpoint():
    iot_client = boto3.client("iot", region_name=os.getenv("IOT_REGION"))
    return iot_client.describe_endpoint(endpointType="iot:Data-ATS").get("endpointAddress")

# describe_endpoint returns a bare hostname, so prepend the scheme.
IOT_DATA_ENDPOINT = f"https://{get_endpoint()}"

Pass IOT_DATA_ENDPOINT in the endpoint_url parameter when creating an iot-data client.
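A minimal sketch of that last step, for anyone who wants it spelled out. The function names (ats_endpoint_url, read_shadow, build_iot_data_client) are hypothetical; the boto3 calls used are describe_endpoint on the iot client and get_thing_shadow on the iot-data client. The clients are passed in as arguments so the helpers can be exercised without AWS credentials.

```python
import json


def ats_endpoint_url(iot_client):
    """Return the https:// URL of the account's iot:Data-ATS endpoint."""
    response = iot_client.describe_endpoint(endpointType="iot:Data-ATS")
    return f"https://{response['endpointAddress']}"


def read_shadow(iot_data_client, thing_name):
    """Fetch a thing's shadow document and decode it from JSON."""
    response = iot_data_client.get_thing_shadow(thingName=thing_name)
    # The payload is a streaming body; read() yields the raw JSON bytes.
    return json.loads(response["payload"].read())


def build_iot_data_client(region_name):
    """Create an iot-data client pinned to the ATS endpoint (needs boto3)."""
    import boto3  # imported lazily so the helpers above work without boto3

    iot_client = boto3.client("iot", region_name=region_name)
    return boto3.client(
        "iot-data",
        region_name=region_name,
        endpoint_url=ats_endpoint_url(iot_client),
    )
```

With credentials configured, read_shadow(build_iot_data_client("us-west-2"), "xxx") would request the same shadow as in the original report, but through the ATS endpoint instead of the legacy one.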

swetashre commented 3 years ago

This error occurs because the endpoint is currently not sending a valid cert. It is therefore recommended to use the ATS data endpoint. The iot:Data endpoint is now legacy.

https://docs.aws.amazon.com/iot/latest/developerguide/iot-connect-devices.html

Every customer has an iot:Data-ATS and an iot:Data endpoint. Each endpoint uses an X.509 certificate to authenticate the client. We strongly recommend that customers use the newer iot:Data-ATS endpoint type to avoid issues related to the widespread distrust of Symantec certificate authorities. We provide the iot:Data endpoint for devices to retrieve data from old endpoints that use VeriSign certificates for backward compatibility.

gideon-maina commented 3 years ago

I also just hit this issue this morning for an IoT service in us-east-2. @rabryan's suggestion worked great. The strange thing is that everything had been working well previously: I just had boto3.client('iot-data') and all was good.

Has there been a good explanation as to how this issue would just pop up?

ignasbol commented 3 years ago

I only started getting this on Monday (25th Jan) in eu-west-2. Strange to see such a delay for this region. Setting the ATS endpoint worked.

@dbasedow @robert9111 @gideon-maina there is some explanation here: https://forums.aws.amazon.com/thread.jspa?messageID=967311

Seems like lack of timely communication from AWS.

sim1234 commented 3 years ago

This is still a problem. Does Amazon even care?

champaltaf commented 2 years ago

Can anyone help me with the above issue?

farzadpanahi commented 1 year ago

@swetashre why not fix boto3 to use the newer iot:Data-ATS endpoint by default to avoid all these issues, or at least update the documentation to mention that you need to specify endpoint_url?