Closed szhang63 closed 1 year ago
Thanks for reporting this @szhang63 !
When you had this working in dbt-redshift 1.4.0, are you using any of the sslmode
, sslcert
, sslkey
, and/or sslrootcert
parameters? If so, which ones?
@dbeatty10 I didn't specify any of those params, so whatever the default is. Here's the output of dbt debug
when I run using 1.4:
18:53:55 Running with dbt=1.4.6
dbt version: 1.4.6
python version: 3.9.16
python path: /Users/shirley.zhang/.conda/envs/dbt-1.4/bin/python
os info: macOS-10.16-x86_64-i386-64bit
Using profiles.yml file at /Users/shirley.zhang/.dbt/profiles.yml
Using dbt_project.yml file at /Users/shirley.zhang/repos/dbt/ds-dbt/dbt_project.yml
Configuration:
profiles.yml file [OK found and valid]
dbt_project.yml file [OK found and valid]
Required dependencies:
- git [OK found]
Connection:
host: localhost
port: 5539
user:
database: XXXX
schema: user__shirley
search_path: None
keepalives_idle: 0
sslmode: None
method: database
cluster_id: None
iam_profile: None
iam_duration_seconds: 900
Connection test: [OK connection ok]
All checks passed!
That debug out is very helpful -- thank you ❤️
I agree with your assessment that allowing passing a ssl
connection parameter in profiles.yml
could be a viable path forward here. Ideally, we wouldn't need to add anything extra though, so I'd like to explore other options first before returning to that one. It's very possible it's where we'll end up though!
Everything else that follows are questions for the Redshift team so they can help us choose a path forward.
Between dbt-redshift 1.4 and 1.5, the Python driver switched from psycopg2
vs. redshift_connector
.
I wonder why redshift_connector
has the ssl
parameter? I wonder if it could be inferred from the sslmode
?
Reading the code here, maybe this is a relevant translation from sslmode
to ssl
?
sslmode |
ssl |
---|---|
disable |
False |
allow |
Raise "unsupported sslmode" error |
prefer |
Raise "unsupported sslmode" error |
require |
True |
verify-ca |
True |
verify-full |
True |
The rationale behind raising the "unsupported sslmode" errors is because I couldn't see evidence that they would have any effect in redshift_connector
. Are they actually valid options that can have an effect?
From what I can tell, psycopg2 doesn't have the ssl
parameter and it just passes along keyword arguments like sslmode
to the connection.
sslmode
between Postgres and RedshiftBoth Redshift and Postgres had helpful docs related to sslmode
, so I combined them into a single table here:
sslmode |
Eavesdropping protection | MITM protection | Description | Statement |
---|---|---|---|---|
disable |
No | No | SSL is disabled and the connection is not encrypted. | I don't care about security, and I don't want to pay the overhead of encryption. |
allow |
Maybe | No | SSL is used if the server requires it. | I don't care about security, but I will pay the overhead of encryption if the server insists on it. |
prefer |
Maybe | No | SSL is used if the server supports it. Amazon Redshift supports SSL, so SSL is used when you set sslmode to prefer . |
I don't care about encryption, but I wish to pay the overhead of encryption if the server supports it. |
require |
Yes | No | SSL is required. | I want my data to be encrypted, and I accept the overhead. I trust that the network will make sure I always connect to the server I want. |
verify-ca |
Yes | Depends on CA policy | SSL must be used and the server certificate must be verified. | I want my data encrypted, and I accept the overhead. I want to be sure that I connect to a server that I trust. |
verify-full |
Yes | Yes | SSL must be used. The server certificate must be verified and the server hostname must match the hostname attribute on the certificate. | I want my data encrypted, and I accept the overhead. I want to be sure that I connect to a server I trust, and that it's the one I specify. |
Thanks for the detailed breakdown @dbeatty10! My reading of the redshift_connector
code is that the only supported sslmode
options are verify-ca
and verify-full
, with the former being the default. I think it'd be great if simply passing in sslmode='disabled'
would set ssl=False
, but that sadly doesn't seem to be the case, and would likely require a change to the redshift_connector
package.
>>> import redshift_connector
>>> c = redshift_connector.connect(user='', password='', host='localhost', database='XXXX', port=5539, sslmode='disable')
Traceback (most recent call last):
File "/Users/shirley.zhang/.conda/envs/dbt-1.5/lib/python3.9/site-packages/redshift_connector/core.py", line 619, in __init__
self._usock = ssl_context.wrap_socket(self._usock)
File "/Users/shirley.zhang/.conda/envs/dbt-1.5/lib/python3.9/ssl.py", line 501, in wrap_socket
return self.sslsocket_class._create(
File "/Users/shirley.zhang/.conda/envs/dbt-1.5/lib/python3.9/ssl.py", line 1041, in _create
self.do_handshake()
File "/Users/shirley.zhang/.conda/envs/dbt-1.5/lib/python3.9/ssl.py", line 1310, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/shirley.zhang/.conda/envs/dbt-1.5/lib/python3.9/site-packages/redshift_connector/__init__.py", line 340, in connect
return Connection(
File "/Users/shirley.zhang/.conda/envs/dbt-1.5/lib/python3.9/site-packages/redshift_connector/core.py", line 632, in __init__
raise InterfaceError("communication error", e)
redshift_connector.error.InterfaceError: ('communication error', SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)'))
Hoping to hear back from Amazon on the best path forward soon!
Thanks for investigating this issue @szhang63
What you've pointed out is exactly the issue. We will be working on adding support for the sslmode options that psycopg2 supports, which would require changes to be made in the redshift_connector package. In the meantime, we're working on translating all the sslmode options supported by psycopg2 to valid redshift_connector parameters.
Because psycopg2 only has sslmode, where redshift_connector has both ssl(boolean) and sslmode(either verify-ca or verify-full), we need to infer ssl and sslmode from sslmode provided by user.
Below are 2 viable options of such translation:
Customer facing parameter in dbt (sslmode ) |
How we can translate to redshift_connector | behavior in psycopg2 connector (sslmode) |
---|---|---|
disable | ssl=false | sslmode=disable |
allow | try with ssl=true, sslmode=verify-ca, if fails, customers has to disable ssl | first try with sslmode=disable, if fails, try with sslmode=verify-ca |
prefer | try with ssl=true, sslmode=verify-ca, if fails, customers has to disable ssl | first try with sslmode=verify-ca, if fails, try with sslmode=disable |
require | ssl=true, sslmode=verify-ca | ssl=true, sslmode=verify-ca |
verify-ca | ssl=true, sslmode=verify-ca | ssl=true, sslmode=verify-ca |
verify-full | ssl=true, sslmode=verify-full | ssl=true, sslmode=verify-full |
2.
Customer facing parameter in dbt (sslmode ) |
How we can translate to redshift_connector | behavior in psycopg2 connector (sslmode) |
---|---|---|
disable | ssl=false | sslmode=disable |
allow | try with ssl=false, if fails, ask customers to set sslmode as either verify-ca, or verify-full | first try with sslmode=disable, if fails, try with sslmode=verify-ca |
prefer | try with ssl=true, sslmode=verify-ca, if fails, customers has to disable ssl | first try with sslmode=verify-ca, if fails, try with sslmode=disable |
require | ssl=true, sslmode=verify-ca | ssl=true, sslmode=verify-ca |
verify-ca | ssl=true, sslmode=verify-ca | ssl=true, sslmode=verify-ca |
verify-full | ssl=true, sslmode=verify-full | ssl=true, sslmode=verify-full |
Note that the two options above are the same except for the behavior for when 'allow' is provided.
In parallel, we're working on supporting these sslmode options in redshift_connector. For the time being before changes are made in redshift_connector package, we're translating the unsupported sslmode options on the dbt-redshift side
Thanks for this nice write-up of options @jiezhen-chen! Both options 1 and 2 seem reasonable.
Option 1 is both a little more assertive from a security perspective and a little easier to explain, so let's go with it 👍
I agree. Will open up a PR for this!
fyi - here's a table with current behaviors of psycopg2 and redshift_connector, as well as the translation we decided to go forward with
Customer facing parameter in dbt (sslmode ) |
behavior in redshift connector (ssl, sslmode) | behavior in psycopg2 connector (sslmode) | Translating from psycopg2 to redshift_connector(parameters to pass to redshift_connector in dbt-redshift) |
---|---|---|---|
disable | ssl=defaulted to true, sslmode=verify-ca (sslmode of disable is not recognized by redshift_connector, therefore falling back to default of verify-ca) | sslmode=disable | set ssl=false |
allow | ssl=defaulted to true, sslmode=verify-ca | first try with sslmode=disableif fails, try with sslmode=verify-ca | ssl=true, sslmode=verify-ca, if fails, error out and ask customers to set sslmode as disable |
prefer | ssl=defaulted to true, sslmode=verify-ca | first try with sslmode=verify-caif fails, try with sslmode=disable | ssl=true, sslmode=verify-ca, if fails, error out and ask customers to set sslmode as disable |
require | ssl=defaulted to true, sslmode=verify-ca | ssl=true, sslmode=verify-ca | ssl=true, sslmode=verify-ca |
verify-ca | ssl=defaulted to true, sslmode=verify-ca | ssl=true, sslmode=verify-ca | ssl=true, sslmode=verify-ca |
verify-full | ssl=defaulted to true, sslmode=verify-full | ssl=true, sslmode=verify-full | ssl=true, sslmode=verify-full |
This should likely be documented as a breaking change in the 1.5 changelog. I previously had no ssl configuration at all and things were working, but now I need to install with the changes from https://github.com/dbt-labs/dbt-redshift/pull/439 and set sslmode
to disable
to get things to work. This is in addition to the breaking change that I need to set region
to be able to connect.
I'm also impacted by this. dbt 1.4 works fine.
Is this a regression in a recent version of dbt-redshift?
Current Behavior
After upgrading to dbt 1.5.1, we are unable to connect to Redshift due to an SSLCertVerificationError. Our Redshift connections are made through StrongDM, which may be affecting the certification? I've been able to connect with
redshift_connector
withssl=False
, so a quick fix would be to allow passing thessl
connection parameter in via the profile.yml.Expected/Previous Behavior
Connection to Redshift works as it did before.
Steps To Reproduce
Relevant log output
Environment
Additional Context
Using
redshift_connector
directly in python, full traceback for the connection errorUsing
ssl=False
works: