aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0
3.84k stars 678 forks

Lack of `verify` input to customize SSL verify option limits smooth usage of the package modules #2836

Open nikfio opened 1 month ago

nikfio commented 1 month ago

Describe the bug

Hi guys,

While using the package, I am finding that the lack of a `verify` input for passing a custom SSL certificate is becoming an issue.

The following error occurs:

*** botocore.exceptions.SSLError: SSL validation failed for <AWS-RESOURCE-URI> 
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed:
unable to get local issuer certificate (_ssl.c:1007)

Of course, as explained in the previous issue #1157, the problem can be solved by patching the package module functions to add the `verify` input, which is then forwarded to the internal AWS client object instantiation.
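The patching approach can be sketched roughly as follows. This is a minimal illustration, not the code from #1157: `patch_client_verify` is a hypothetical helper name, the CA bundle path is a placeholder, and the sketch assumes clients are created through a `client(...)` method that accepts a `verify` keyword argument, as `boto3.Session.client` does.

```python
import functools


def patch_client_verify(session_cls, ca_bundle):
    """Wrap session_cls.client so that `verify` falls back to ca_bundle.

    Hypothetical helper: it is not part of awswrangler or boto3.
    Returns the original method so the patch can be undone.
    """
    original = session_cls.client

    @functools.wraps(original)
    def client(self, *args, **kwargs):
        # Fill in verify only when the caller left it unset (missing or None).
        # An explicit verify=False or an explicit path is left untouched.
        # Assumption: verify is passed by keyword, never positionally.
        if kwargs.get("verify") is None:
            kwargs["verify"] = ca_bundle
        return original(self, *args, **kwargs)

    session_cls.client = client
    return original


# Intended use (assumes boto3 is installed and the path is a placeholder):
#   import boto3
#   patch_client_verify(boto3.Session, "/path/to/ca-bundle.pem")
#   session = boto3.Session()  # clients created from it now default to the bundle
```

The patch is process-wide, so it affects every client created after it runs. Keeping the returned original method makes it possible to restore the default behaviour with `session_cls.client = original`.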

How to Reproduce

Place yourself behind a proxy (for example) or any other network entity that requires you to pass its specific SSL certificate when the AWS SDK client is instantiated.

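Run the following:

import boto3
import awswrangler as wr

# Plain session: there is no way to set the SSL verify option here
session = boto3.Session()

res_uri = 'AWS-RESOURCE-URI'

# The client is created internally, without a custom certificate
des = wr.s3.describe_objects(res_uri, boto3_session=session)

The call will likely fail with the following error: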

*** botocore.exceptions.SSLError: SSL validation failed for <AWS-RESOURCE-URI>
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed:
unable to get local issuer certificate (_ssl.c:1007)

Expected behavior

Forward the `verify` input of the AWS client object to the awswrangler calls wherever possible, so that an awswrangler user can always set the verify option.

OS

ALL, issue is OS independent

Python version

=3.10

AWS SDK for pandas version

3.7.2

Final comment

In any case, would you agree or not? Did you encounter similar limitations?

I could set up a fork and then propose a pull request.

Many thanks, Nick

jaidisido commented 1 month ago

The issue here is that awswrangler does not expose the boto3 client, only the session. Because it's not possible to pass verify via the session, patching the client was the solution advised in #1157
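A workaround that touches neither the session nor the client is botocore's `AWS_CA_BUNDLE` environment variable, which botocore consults when a client is created without an explicit `verify` value. It is not mentioned in this thread, so treat the sketch below as an assumption to check in your own environment: `use_ca_bundle` is a hypothetical helper and the path is a placeholder.

```python
import os

# Placeholder path: point this at the PEM bundle that contains the proxy's CA.
CA_BUNDLE_PATH = "/path/to/ca-bundle.pem"


def use_ca_bundle(path=CA_BUNDLE_PATH, env=os.environ):
    """Set AWS_CA_BUNDLE so that clients created afterwards can pick it up.

    Hypothetical helper: call it before any AWS client is created.
    """
    env["AWS_CA_BUNDLE"] = path
    return path
```

An explicit `verify` argument passed at client creation takes precedence over the environment variable, so whether this is sufficient depends on how the clients are created internally.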