trinodb / trino-gateway

https://trinodb.github.io/trino-gateway/
Apache License 2.0

Sticky routing based on next-uri fields #446

Open shk3 opened 2 months ago

shk3 commented 2 months ago

Hi folks,

Have we considered rewriting next-uri and info-uri directly from the responses in order to achieve query-level sticky routing?

The idea is kinda similar to Trino Proxy, where Trino Gateway proxies all requests. Then we bind the URLs in the following ways:

With this approach, for query-level sticky routing, we don't need to track which backend each query id gets assigned to. Instead, such assignment is retained on the client side.

The caveat is that for the Trino UI, we would need to develop a way for users to run a combined query search across all backends and to see a summary of all backends' stats.

Has this approach been considered in the past? We could eliminate the dependency on the databases / caches. If cross-regional networking could be a concern, we could even change the URLs with different domains to avoid inter-regional proxying.

I know Trino Gateway's architecture is pretty much set, so it's not necessarily something we have to do now, but mostly a discussion just in case later on it's needed.

George

xkrogen commented 2 months ago

We talked a bit about making the GW more of a "full proxy" in one of the recent GW dev syncs. It potentially unlocks a lot of new capabilities.

I like the idea you've proposed here of embedding this state into the client instead of storing it on the GW side. Tracking when a query has finished, and thus its state can be cleaned up, is an annoying process. Right now we just have a periodic task, every 2 hours, to clear our query records older than a configurable time window (but that query may actually still be running!): https://github.com/trinodb/trino-gateway/blob/f50b09d5f81ccf5c72efce345a4235727c879c06/gateway-ha/src/main/java/io/trino/gateway/ha/persistence/JdbcConnectionManager.java#L72-L82

Moving it to the client is in line with Trino philosophy in general, IMO, like how we implement session properties and prepared statements on the client-side.

For the UI, I think as you said, we could do a fan-out that pulls query results from each backend ... That also has the benefit of not having two copies of the same data (query IDs / query history stored on both GW and Coordinator).
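A minimal sketch of what such a fan-out could look like, in Python for brevity rather than the gateway's Java. The `fetch` callable is hypothetical: it stands in for whatever call returns a backend's query list, and the `queryId` field name is an assumption about that payload. Nothing here is existing Trino Gateway code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fan_out_queries(
    backends: List[str],
    fetch: Callable[[str], List[Dict]],
) -> List[Dict]:
    """Ask every backend for its queries in parallel and merge the results.

    `fetch` is a hypothetical callable that takes a backend name and returns
    a list of dicts, each assumed to carry a "queryId" key.
    """

    def fetch_tagged(backend: str) -> List[Dict]:
        try:
            # Tag each record with its backend so the UI can link back to it.
            return [dict(query, backend=backend) for query in fetch(backend)]
        except Exception:
            # An unreachable backend contributes nothing instead of failing
            # the whole combined view.
            return []

    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        per_backend = list(pool.map(fetch_tagged, backends))

    merged = [query for queries in per_backend for query in queries]
    # Query IDs start with a timestamp, so a reverse string sort puts the
    # newest queries first.
    merged.sort(key=lambda query: query["queryId"], reverse=True)
    return merged
```

Because nothing is stored on the gateway, the combined view is only as fresh and as complete as the backends that answer at request time.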

Curious to hear what others think, but personally at first pass I like the idea. One thing we should consider is whether this would make it harder to implement other new functionality in the future.

shk3 commented 2 months ago

> One thing we should consider is whether this would make it harder to implement other new functionality in the future.

Yes! This is the exact concern I have too.

A while ago we evaluated Trino Gateway vs. running Envoy with a query ID cache vs. just a thin layer that rewrites headers for next-uri in combination with some cloud load balancers. It's great to see that Trino Gateway is now officially part of the Trino project and is collaborating with Trino!

We could actually achieve this next-uri design even today with the current Trino Gateway, if we tweak the X-Forwarded-* header rewriting logic and put the Trino coordinators on their own domains (e.g. trino-gw.mydomain, trino-1.mydomain, trino-2.mydomain). That way, Trino Gateway effectively acts as a query dispatcher, and subsequent calls won't go through Trino Gateway. However, I'm worried about creating yet another snowflake use case for Trino Gateway. So, let's see whether this idea could fit into Trino Gateway's bigger design in any way without breaking any functionality Trino Gateway wants to support.

oneonestar commented 2 months ago

I had been thinking about routing using the QueryID. When a Trino coordinator starts, it generates a random coordinatorId and embeds it as the last part of the QueryID. (ref)

If we can keep track of the coordinatorId for each cluster, we can route it to the corresponding cluster without any additional info.

For example, all query IDs from the same coordinator share the same suffix:

Cluster A (tr8tg):
20240801_040236_47295_tr8tg
20240801_040244_44562_tr8tg
20240801_040245_41234_tr8tg

Cluster B (fejs4):
20240801_040301_24461_fejs4
20240801_040302_21235_fejs4
20240801_040303_25678_fejs4
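A minimal sketch of that lookup in Python, assuming the gateway has somehow learned each cluster's coordinatorId. The backend names `cluster-a` and `cluster-b` and both function names are made up for illustration.

```python
from typing import Dict, Optional


def coordinator_id(query_id: str) -> str:
    """Return the last underscore-separated part of a query ID."""
    return query_id.rsplit("_", 1)[-1]


def backend_for(query_id: str, coordinators: Dict[str, str]) -> Optional[str]:
    """Map a query ID to a backend via its coordinatorId suffix.

    `coordinators` maps coordinatorId -> backend name. Returns None when the
    suffix is unknown, for example after a coordinator restart produced a new
    random ID.
    """
    return coordinators.get(coordinator_id(query_id))


# Hypothetical backend names for the two clusters in the example above.
known = {"tr8tg": "cluster-a", "fejs4": "cluster-b"}
print(backend_for("20240801_040236_47295_tr8tg", known))
print(backend_for("20240801_040301_24461_fejs4", known))
```

The mapping has to be refreshed whenever a coordinator restarts, since the ID is regenerated at startup.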

vishalya commented 2 months ago

Having a coordinator (or cluster) ID as part of the Trino protocol is a good idea. This could also solve issue #465.

oneonestar commented 1 month ago

Currently we can't obtain the coordinatorId through the Trino coordinator's API. It would be great if we could obtain this info from the API when doing the cluster health check.

mosabua commented 1 month ago

Ideally we chat about this with @wendigo and @electrum .. might be good to wrap some of this into the current work on client protocol.

wendigo commented 3 weeks ago

How can I help here? :)

mosabua commented 2 weeks ago

I think it would be great if you can chime in at https://github.com/trinodb/trino/pull/23910 and help there and also take this into account for the spooling protocol work @wendigo

oneonestar commented 2 weeks ago

Looks like we can modify the infoUri and nextUri using X-Forwarded-XX from https://github.com/trinodb/trino/pull/22227.

protocol://{X-Forwarded-Host}/{X-Forwarded-Prefix}/v1/statement/...

$ curl -XPOST -vvvv http://127.0.0.1:8080/v1/statement -d "SELECT 1" -H "X-Trino-User: star"
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> POST /v1/statement HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.7.1
> Accept: */*
> X-Trino-User: star
> Content-Length: 8
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 8 bytes
< HTTP/1.1 200 OK
< Date: Wed, 30 Oct 2024 15:04:09 GMT
< Vary: Accept-Encoding
< Content-Type: application/json
< X-Content-Type-Options: nosniff
< Content-Length: 595
<
{"id":"20241030_150409_00000_zhg55","infoUri":"http://127.0.0.1:8080/ui/query.html?20241030_150409_00000_zhg55","nextUri":"http://127.0.0.1:8080/v1/statement/queued/20241030_150409_00000_zhg55/y3ad3f7e909dd59fa86eaf43c51a1f45bed0357f0/1","stats":{"state":"QUEUED","queued":true,"scheduled":false,"nodes":0,"totalSplits":0,"queuedSplits":0,"runningSplits":0,"completedSplits":0,"cpuTimeMillis":0,"wallTimeMillis":0,"queuedTimeMillis":0,"elapsedTimeMillis":0,"processedRows":0,"processedBytes":0,"physicalInputBytes":0,"physicalWrittenBytes":0,"peakMemoryBytes":0,"spilledBytes":0},"warnings":[]}
* Connection #0 to host 127.0.0.1 left intact

$ curl -XPOST -vvvv http://127.0.0.1:8080/v1/statement -d "SELECT 1" -H "X-Trino-User: star" -H "X-Forwarded-Prefix: some-prefix"
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> POST /v1/statement HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.7.1
> Accept: */*
> X-Trino-User: star
> X-Forwarded-Prefix: some-prefix
> Content-Length: 8
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 8 bytes
< HTTP/1.1 200 OK
< Date: Wed, 30 Oct 2024 15:04:36 GMT
< Vary: Accept-Encoding
< Content-Type: application/json
< X-Content-Type-Options: nosniff
< Content-Length: 619
<
{"id":"20241030_150436_00001_zhg55","infoUri":"http://127.0.0.1:8080/some-prefix/ui/query.html?20241030_150436_00001_zhg55","nextUri":"http://127.0.0.1:8080/some-prefix/v1/statement/queued/20241030_150436_00001_zhg55/y6f4617b74af025647da20558223ecfbf0dc324ee/1","stats":{"state":"QUEUED","queued":true,"scheduled":false,"nodes":0,"totalSplits":0,"queuedSplits":0,"runningSplits":0,"completedSplits":0,"cpuTimeMillis":0,"wallTimeMillis":0,"queuedTimeMillis":0,"elapsedTimeMillis":0,"processedRows":0,"processedBytes":0,"physicalInputBytes":0,"physicalWrittenBytes":0,"peakMemoryBytes":0,"spilledBytes":0},"warnings":[]}
* Connection #0 to host 127.0.0.1 left intact

$ curl -XPOST -vvvv http://127.0.0.1:8080/v1/statement -d "SELECT 1" -H "X-Trino-User: star" -H "X-Forwarded-Prefix: some-prefix" -H "X-Forwarded-Host: some-host.com"
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> POST /v1/statement HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.7.1
> Accept: */*
> X-Trino-User: star
> X-Forwarded-Prefix: some-prefix
> X-Forwarded-Host: some-host.com
> Content-Length: 8
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 8 bytes
< HTTP/1.1 200 OK
< Date: Wed, 30 Oct 2024 15:05:00 GMT
< Vary: Accept-Encoding
< Content-Type: application/json
< X-Content-Type-Options: nosniff
< Content-Length: 617
<
{"id":"20241030_150500_00002_zhg55","infoUri":"http://some-host.com/some-prefix/ui/query.html?20241030_150500_00002_zhg55","nextUri":"http://some-host.com/some-prefix/v1/statement/queued/20241030_150500_00002_zhg55/y284ba1cdd09640c34cec06ca74afeb21acf10123/1","stats":{"state":"QUEUED","queued":true,"scheduled":false,"nodes":0,"totalSplits":0,"queuedSplits":0,"runningSplits":0,"completedSplits":0,"cpuTimeMillis":0,"wallTimeMillis":0,"queuedTimeMillis":0,"elapsedTimeMillis":0,"processedRows":0,"processedBytes":0,"physicalInputBytes":0,"physicalWrittenBytes":0,"peakMemoryBytes":0,"spilledBytes":0},"warnings":[]}
* Connection #0 to host 127.0.0.1 left intact
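The three responses above are consistent with a simple composition rule. The Python sketch below models only that observed behavior and is not Trino's actual implementation. `build_uri` and its parameters are hypothetical names.

```python
from typing import Optional


def build_uri(
    scheme: str,
    host_header: str,
    path: str,
    forwarded_host: Optional[str] = None,
    forwarded_prefix: Optional[str] = None,
) -> str:
    """Compose a URI the way the responses above appear to be composed.

    X-Forwarded-Host, when present, replaces the Host header value, and
    X-Forwarded-Prefix, when present, is inserted before the path.
    """
    host = forwarded_host or host_header
    prefix = ("/" + forwarded_prefix.strip("/")) if forwarded_prefix else ""
    return f"{scheme}://{host}{prefix}{path}"


path = "/ui/query.html?20241030_150500_00002_zhg55"
# No forwarded headers: the request's own host is used.
print(build_uri("http", "127.0.0.1:8080", path))
# Prefix only: same host, prefixed path.
print(build_uri("http", "127.0.0.1:8080", path, forwarded_prefix="some-prefix"))
# Host and prefix: both are taken from the forwarded headers.
print(build_uri("http", "127.0.0.1:8080", path, "some-host.com", "some-prefix"))
```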

One interesting thing is that the URL with the prefix won't work. The coordinator returns 404 for it.

$ curl -XPOST http://127.0.0.1:8080/v1/statement -d "SELECT 1" -H "X-Trino-User: star" -H "X-Forwarded-Prefix: some-prefix"
{"id":"20241030_153616_00003_z5drh","infoUri":"http://127.0.0.1:8080/some-prefix/ui/query.html?20241030_153616_00003_z5drh","nextUri":"http://127.0.0.1:8080/some-prefix/v1/statement/queued/20241030_153616_00003_z5drh/yb1171cfcc1af0a43017540da914037c349f8753c/1","stats":{"state":"QUEUED","queued":true,"scheduled":false,"nodes":0,"totalSplits":0,"queuedSplits":0,"runningSplits":0,"completedSplits":0,"cpuTimeMillis":0,"wallTimeMillis":0,"queuedTimeMillis":0,"elapsedTimeMillis":0,"processedRows":0,"processedBytes":0,"physicalInputBytes":0,"physicalWrittenBytes":0,"peakMemoryBytes":0,"spilledBytes":0},"warnings":[]}

$ curl http://127.0.0.1:8080/some-prefix/v1/statement/queued/20241030_153616_00003_z5drh/yb1171cfcc1af0a43017540da914037c349f8753c/1
Error 404 Not Found: HTTP 404 Not Found

shk3 commented 2 weeks ago

Yes! This X-Forwarded-xx is exactly what I was trying to propose! :)

wendigo commented 2 weeks ago

@oneonestar yeah, we plan to add support for it to the client as well, but for now the server-to-server should work just fine

shk3 commented 2 weeks ago

> One interesting thing is that the URL with the prefix won't work. The coordinator returns 404 for it.

@oneonestar Sorry, I missed this part.

What I had in mind is that we could use the X-Forwarded-xx headers to point the next-uri / info-uri at the configured external URL of each backend, so that follow-up requests no longer go through Trino Gateway. Say you have backend1.somehost.com/some-prefix publicly exposing the Trino backend1 coordinator through nginx or some other gateway. In Trino Gateway, we can use these headers to make the returned next-uri / info-uri point to the external URL directly, something like http://backend1.somehost.com/some-prefix/v1/statement/queued/20241030_153616_00003_z5drh/yb1171cfcc1af0a43017540da914037c349f8753c/1.
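A rough Python sketch of how the gateway could derive those headers from a backend's configured external URL. `forwarded_headers` is a hypothetical helper, and sending `X-Forwarded-Proto` is an assumption based on the `protocol://` part of the template quoted earlier in the thread, not something this thread verified.

```python
from typing import Dict
from urllib.parse import urlsplit


def forwarded_headers(external_url: str) -> Dict[str, str]:
    """Build the X-Forwarded-* headers that describe a backend's external URL."""
    parts = urlsplit(external_url)
    headers = {
        "X-Forwarded-Proto": parts.scheme,  # assumption, see lead-in
        "X-Forwarded-Host": parts.netloc,
    }
    # The curl examples earlier in the thread send the prefix without
    # surrounding slashes, so do the same here.
    prefix = parts.path.strip("/")
    if prefix:
        headers["X-Forwarded-Prefix"] = prefix
    return headers


print(forwarded_headers("http://backend1.somehost.com/some-prefix"))
```

The gateway would attach these headers only to the initial POST to /v1/statement; the client then follows the returned next-uri straight to the backend's external URL.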

Alternatively, if we want to put some-prefix into the next-uri / info-uri on the same host as Trino Gateway, we would need proxy rules that route requests to the proper cluster based on the prefix. When Trino Gateway sees the prefix, it knows which backend the request needs to go to, and the proxy strips the prefix before forwarding, so the URL the Trino coordinator receives no longer contains it.
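A minimal Python sketch of such a prefix rule, assuming one prefix per backend. The function name and the prefix-to-backend mapping are illustrative only.

```python
from typing import Dict, Optional


def route_by_prefix(path: str, backends: Dict[str, str]) -> Optional[str]:
    """Resolve a prefixed request path to a backend URL without the prefix.

    `backends` maps a path prefix (first path segment) to a backend base URL.
    Returns None when the first segment is not a known prefix, in which case
    the gateway would fall back to its normal routing for new queries.
    """
    segments = path.lstrip("/").split("/", 1)
    prefix = segments[0]
    if prefix not in backends:
        return None
    # Drop the prefix so the coordinator sees the path it originally issued.
    rest = "/" + segments[1] if len(segments) > 1 else "/"
    return backends[prefix] + rest


rules = {"some-prefix": "http://127.0.0.1:8080"}
print(route_by_prefix(
    "/some-prefix/v1/statement/queued/20241030_153616_00003_z5drh/"
    "yb1171cfcc1af0a43017540da914037c349f8753c/1",
    rules,
))
```

Stripping the prefix is what avoids the 404 shown earlier in the thread, where the coordinator itself received the prefixed path.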