trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Working with Trino and authentication inside a service mesh. #18387

Open tkeller-moxe opened 1 year ago

tkeller-moxe commented 1 year ago

I am trying to understand the best way to run Trino inside a Kubernetes cluster that uses Istio as a service mesh, with authentication enabled in Trino.

Problem:

My Hopes:

My Current solution:

Other notes:

Some additional comments on Istio

shantanu-dahiya commented 2 months ago

Did you come across a solution for this?

tkeller-moxe commented 2 months ago

If I'm remembering correctly, the solution here was to set the configuration flag http-server.process-forwarded=true and ensure Istio passes the X-Forwarded-Proto: https header, which Trino will then respect.

hashhar commented 2 months ago

Correct. If you want to proxy authentication or TLS, the solution is http-server.process-forwarded=true plus making sure the proxy passes the X-Forwarded-* headers.
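To make the mechanism concrete, here is a minimal Python sketch of the decision a server makes when it trusts forwarded headers. It is a simplified model for illustration only, not Trino's actual implementation; the function name `is_effectively_secure` and its parameters are hypothetical.

```python
from typing import Mapping


def is_effectively_secure(
    connection_is_tls: bool,
    headers: Mapping[str, str],
    process_forwarded: bool,
) -> bool:
    """Simplified model (not Trino's code) of treating a request as HTTPS.

    A request counts as secure if the socket itself is TLS, or if the server
    is configured to trust forwarded headers and the proxy declared that the
    original client connection used https.
    """
    if connection_is_tls:
        return True
    if not process_forwarded:
        # Forwarded headers are ignored unless explicitly trusted.
        return False
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    proto = normalized.get("x-forwarded-proto", "")
    return proto.strip().lower() == "https"
```

In this model, a plain-HTTP request from the sidecar carrying `X-Forwarded-Proto: https` is accepted for authentication only when forwarded-header processing is switched on, which is why both halves of the fix are needed.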

Closing, feel free to re-open if needed.

jinyangli34 commented 1 week ago

Hi @hashhar, another follow-up question on this topic:

How can we make the client connection use HTTP if Istio is used on the client side as well? Currently we have to set up client TLS on top of Istio TLS to connect the client to the proxy, and then use http-server.process-forwarded=true to drop the extra TLS layer and connect the proxy to the coordinator over HTTP.

Currently Trino enforces an HTTPS connection whenever authentication is used.

shantanu-dahiya commented 1 week ago

@jinyangli34 From my reading of the Trino JDBC driver code, it can't be done. You have to have SSL turned on at the JVM level on the client; there's no way to do auth without it.

tkeller-moxe commented 1 week ago

Sadly, if authentication is enabled, then regardless of Istio you MUST have TLS somewhere in the mix for Trino to be happy.

Where we ended up: we have an Istio TLS-terminating gateway serving access to Trino on an internal domain. HTTPS -> Gateway -> Istio TLS Termination Gateway + X-Forwarded-Proto -> Istio Sidecar -> Trino

The http-server.process-forwarded=true flag lets you NOT enable TLS on Trino directly, meaning you don't need an active TLS connection to the coordinator. Istio's TLS is therefore enough.

I hope that is clear!

jinyangli34 commented 1 week ago

Thanks @shantanu-dahiya and @tkeller-moxe for the comments. That also aligns with my understanding.

Unfortunately, the complicated part is not enabling TLS on the Trino coordinator, which is a one-time task. Setting up TLS on all the clients in different environments is what is complicated, especially as no one else needs to do so with Istio.

tkeller-moxe commented 1 week ago

The method we are using is meant to reduce the complexity here.

We have a gateway with a signed HTTPS certificate. We use Let's Encrypt as the provider. The Istio gateway uses this certificate. This is the only per-environment setup that has to happen. It's just a load balancer that terminates HTTPS to HTTP and forwards the traffic to Trino.

Trino DOES NOT and SHOULD NOT have TLS configured at all. It should have http-server.process-forwarded=true set so that it honors the X-Forwarded-Proto header; otherwise it will reject any request that tries to authenticate.

Once these are set up, you can connect to Trino by the domain of the gateway:

https://trino-staging.your-company.com and this will work for all clients. It's a standard HTTPS connection at this point :)

jinyangli34 commented 1 week ago

> We have a gateway with a signed HTTPS certificate. We use Let's Encrypt as the provider. The Istio gateway uses this certificate. This is the only per-environment setup that has to happen. It's just a load balancer that terminates HTTPS to HTTP and forwards the traffic to Trino.

I understand that a gateway proxy can terminate HTTPS and send HTTP to Trino. But all Trino clients still need TLS set up.

shantanu-dahiya commented 1 week ago

If the Trino client needs TLS, then surely we have to turn on TLS on Trino too, right? How else will Trino decrypt the TLS from the client's JVM?

tkeller-moxe commented 1 week ago

If you are using client certificates, then yes, but that is a very different problem, and Trino does not recommend using client certificates unless you HAVE TO by organization policy. JWTs and other access patterns are considered the current standard.

If you are using just HTTPS with JWT or password-based auth, then no, the Trino cluster does NOT need TLS enabled. Setting http-server.process-forwarded=true tells Trino to accept a request even if it's not TLS, as long as it has the X-Forwarded-Proto: https header set. So Trino can run without any TLS configuration.

For clients connecting over HTTPS: if you are using an HTTPS certificate signed by a public authority (Let's Encrypt, AWS, etc.), you do not need to do any additional setup, as the root certificates are distributed on most systems, just like in the normal web browser use case.
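As a rough illustration of the "no additional setup" point, the standard-library Python sketch below prepares an authenticated HTTPS request to the gateway domain used as an example in this thread. The helper name `build_statement_request` and the credentials are hypothetical; `/v1/statement` and `X-Trino-User` are Trino's client REST endpoint and user header. Nothing is sent over the network here.

```python
import base64
import ssl
import urllib.request

# Example gateway domain taken from this thread; replace with your own.
GATEWAY_URL = "https://trino-staging.your-company.com"


def build_statement_request(sql: str, user: str, password: str,
                            base_url: str = GATEWAY_URL) -> urllib.request.Request:
    """Prepare (but do not send) a password-authenticated query request."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/v1/statement",
        data=sql.encode("utf-8"),
        method="POST",
        headers={"Authorization": f"Basic {token}", "X-Trino-User": user},
    )


# The default context verifies the server certificate and hostname against
# the system trust store, so a publicly signed gateway certificate needs no
# client-side keystore or custom CA bundle.
default_context = ssl.create_default_context()

request = build_statement_request("SELECT 1", "alice", "example-password")
# To actually send it: urllib.request.urlopen(request, context=default_context)
```

The client only ever sees an ordinary HTTPS endpoint; whether Trino behind the gateway speaks TLS is invisible to it.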

shantanu-dahiya commented 1 week ago

Let me see if I understand this. So, the Trino JDBC driver requires TLS to be on in the client's JVM, to do authentication of any type. Now let's assume we have a service mesh too. So, the client encrypts using the JVM's keys, then the Istio sidecar encrypts it using mTLS. Trino's Istio sidecar decrypts the mTLS. Surely now Trino will need to be able to decrypt the JVM-level encryption, and therefore needs to have the appropriate certs to do so. This would lead to the conclusion that any authenticated connection between a client and Trino requires Trino to have TLS set up, even inside the service mesh. Where did I go wrong in this analysis?

tkeller-moxe commented 1 week ago

Let me ask a few more clarifying questions as well. What do you mean by the client's JVM? Trino, even with JDBC, just talks over HTTP under the hood. I think that is part of what we are hung up on :)

jinyangli34 commented 1 week ago

> Let me ask a few more clarifying questions as well. What do you mean by the client's JVM? Trino, even with JDBC, just talks over HTTP under the hood. I think that is part of what we are hung up on :)

The Trino client requires HTTPS if auth is used; otherwise it throws an error. So when using JDBC, the client JVM needs certificates/keys ready to establish a TLS connection.

Other systems, like MySQL, don't require TLS when using auth (such as Kerberos).

jinyangli34 commented 2 days ago

We found that setting the SSLVerification option on the Trino CLI/JDBC removes the requirement for a certificate on the client side. With PyHive's presto module, you can create a requests Session with verify=False and pass it in as requests_session to avoid using certs. For trino-python-client, we haven't found a good way to skip the client cert.
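For readers unsure what skipping verification means at the TLS level, here is a minimal standard-library Python sketch. It is the stdlib analogue of passing verify=False to a requests Session, not the PyHive or Trino client API itself, and the function name `insecure_client_context` is hypothetical.

```python
import ssl


def insecure_client_context() -> ssl.SSLContext:
    """Build a TLS client context that encrypts but does not verify the server.

    The connection is still TLS, but the server certificate and hostname are
    not checked, so no CA bundle or keystore is needed on the client.
    """
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

This trades away protection against impersonation of the server, so it is arguably only reasonable where another layer, such as the mesh's mTLS, already authenticates the peer.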