Open c0c0n3 opened 3 years ago
Ideally, we should add some QL/Timescale integration tests where we test SSL connections. Timescale container set up should be similar to what we already have in the timescale-container/test dir, see:
Here's some more debug info we can use later to concoct a fix.
First off, start Timescale w/ SSL using this docker compose:
Notice the server certificates are self-signed. Now if you start a Python interpreter in e.g. the QL image, you can see what's going on
>>> import certifi # see https://stackoverflow.com/questions/50236117
>>> import ssl
>>> import pg8000
Now the implementation of ssl must've changed since we tested. In fact, it looks like that's where all hell actually breaks loose, pg8000 just sits there peacefully.
>>> pg8000.paramstyle = "qmark"
>>> con = pg8000.connect(host='localhost', port=5432, ssl_context={}, database='quantumleap', user='quantumleap', password='*')
...
AttributeError: 'dict' object has no attribute 'wrap_socket'
as expected. Now the pg8000 implementation changed too since we tested and if you want to use a default SSL context, you should pass in True instead of {}:
Ah, so we can fix it! Or can we?
>>> con = pg8000.connect(host='localhost', port=5432, ssl_context=True, database='quantumleap', user='quantumleap', password='*')
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/Users/andrea/.local/share/virtualenvs/ngsi-timeseries-api-MeJ80LMF/lib/python3.8/site-packages/pg8000/__init__.py", line 56, in connect
return Connection(
File "/Users/andrea/.local/share/virtualenvs/ngsi-timeseries-api-MeJ80LMF/lib/python3.8/site-packages/pg8000/core.py", line 674, in __init__
self._usock = ssl_context.wrap_socket(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1040, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)
Computer says no. Surely this could be b/c Postgres got started w/ a self-signed cert. In fact,
>>> ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
>>> con = pg8000.connect(host='localhost', port=5432, ssl_context=ctx, database='quantumleap', user='quantumleap', password='*')
...
# same error as before
But here's a little surprise
>>> certs = ctx.load_default_certs()
>>> print(certs)
None
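A side note on the None above: as far as the standard library docs go, SSLContext.load_default_certs() loads certificates into the context as a side effect and returns None by design, so the printed value on its own doesn't say whether any CA certs were found. Here is a minimal sketch of one way to look at the trust store instead, using cert_store_stats(). The ctx name just mirrors the session above and nothing here talks to Postgres.

```python
import ssl

# Same kind of context as in the session above.
ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)

# load_default_certs() mutates the context in place and returns None.
result = ctx.load_default_certs()

# cert_store_stats() returns counts of the certs currently loaded in the context.
stats = ctx.cert_store_stats()

print(result)
print(stats)
```

Keep in mind that certs sitting in a capath directory are only loaded on demand, so a low count in the stats doesn't necessarily mean the system trust store is empty.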
So yah, even if we had a proper cert, I don't think we'd go too far. Could it be our version of certifi is too old? Anyhoo, if you don't care about server authentication, you could work around this
>>> ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
>>> con = pg8000.connect(host='localhost', port=5432, ssl_context=ctx, database='quantumleap', user='quantumleap', password='*')
>>> con.run("select count(*) from mtv.etdevice")
([10],)
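The CLIENT_AUTH context connects in the session above because, unlike the SERVER_AUTH one, it doesn't ask for the peer's certificate to be verified. If I read the ssl docs right, from Python 3.10 on a CLIENT_AUTH default context is server-side only, so this trick may stop working for outgoing connections. Below is a rough sketch of a more explicit alternative: a client-side context that either trusts a given cert file (e.g. the server's self-signed cert) or has verification switched off on purpose. The helper name make_pg_ssl_context and its parameters are made up for illustration; they are not part of QuantumLeap or pg8000.

```python
import ssl
from typing import Optional


def make_pg_ssl_context(ca_file: Optional[str] = None,
                        verify: bool = True) -> ssl.SSLContext:
    """Build a client-side SSL context for a Postgres connection.

    Hypothetical helper, not part of QuantumLeap or pg8000.
    """
    # Client-side context; verifies the server cert and host name by default.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if ca_file is not None:
        # Additionally trust the given PEM file, e.g. a self-signed server cert.
        ctx.load_verify_locations(cafile=ca_file)
    if not verify:
        # check_hostname has to be turned off before verify_mode can be relaxed.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


# Encrypted but unauthenticated, roughly what the workaround above gives you.
insecure_ctx = make_pg_ssl_context(verify=False)

# Assumed usage, mirroring the session above:
# con = pg8000.connect(host='localhost', port=5432, ssl_context=insecure_ctx,
#                      database='quantumleap', user='quantumleap', password='*')
```

Passing ca_file with verify left on keeps server authentication, but the host name in the certificate would then have to match the host you connect to, which a quick self-signed test cert may not do.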
Hi, what is the status on this? For a productive setup SSL is a must have. Are there any plans to fix this issue?
For anyone who wants to run Quantum Leap with TimescaleDB without an SSL connection, I can confirm that the following workaround is possible:
Allow non-SSL connections to the quantumleap database in your TimescaleDB pg_hba.conf by adding this line of configuration:
hostnossl quantumleap all all md5
Set POSTGRES_USE_SSL:f
Hi @Panzki
> what is the status on this?
Can you check if it's still broken in the latest QL release v0.8.3?
> For a productive setup SSL is a must have
Agree :-)
> Are there any plans to fix this issue?
Not a priority at the moment, but we'll try fixing this some time in the next couple of months if it's still broken...
Hi @c0c0n3! Yes, it's still broken in the v0.8.3 release.
@bkdkmd oh deary deary, bugs never sleep :-)
@pooja1pathak do you guys have dev cycles to look into this and give it high priority?
Is this bug still being worked on, or do you have to work without SSL for now? It is now 2023, the bug has been known since 2020 and it has persisted across versions. In the current version quantumleap:0.8.3, this now comes back to bite us when we want to use a ReplicaSet of the TimescaleDB. :-(
hello @FR-ADDIX :-)
This is still a bug unfortunately. I share your frustration as a developer, but am also sure you'll appreciate we don't have enough resources to work on Quantum Leap on a full-time basis at the moment, so it's sort of a best-effort approach for us, mainly driven by what our clients request.
This is open-source after all, so if you're willing to roll up your sleeves and contribute this fix to the community we'll gladly merge your PR!
Describe the bug
QuantumLeap bombs out when attempting to connect to Timescale over SSL. Here's the stack trace from the logs (irrelevant lines omitted):
To Reproduce
Configure QuantumLeap to use the Timescale backend for a tenant named "vecciano" and set the POSTGRES_USE_SSL env var to true. Then run:
You should get a nasty 500 back and if you look at the logs, you should be able to see a fat stack trace like the one above.
Expected behavior
QuantumLeap should be able to establish an SSL connection to Timescale.
Environment
0.7.6
11.4 with Timescale extension 1.3.2
12.4 with Timescale extension 1.7.2
Additional context
Issue cropped up in Orchestra prod when connecting QL to the old pg-patroni cluster and then to the new pgsql-patroni instance. If you get a Python shell on the QL pod, you can clearly see where things go awry: