Return the cluster that was used in the query execution as response metadata

trinodb / trino-gateway

https://trinodb.github.io/trino-gateway/

Apache License 2.0

147 stars 65 forks source link

Return the cluster that was used in the query execution as response metadata #465

Closed 0x726f closed 3 days ago

0x726f commented 2 weeks ago

We run several Trino clusters behind the gateway and would like the ability to store which cluster was used in query execution for each specific request on the client side.

Our client side stores a whole bunch of more useful metadata about each query executed within our application, so it makes sense for us to store all the metadata in one place.

We're planning on exposing this information back to the client using HTTP response headers. I see that the Gateway code already unpacks the response header from the downstream requests and sends them back to the client; I presume that we'll just have to modify the code to also add the cluster name to the header

yuokada commented 2 weeks ago

That sounds good. But, there are cases that system administrators would like to conceal cluster names, I think. So, it's better to make it configurable whether we can expose cluster names or not.

harishnelakurthi commented 2 weeks ago

@yuokada - I have made this change configurable. Can I get a PR review on this please? https://github.com/trinodb/trino-gateway/pull/466

yuokada commented 2 weeks ago

Thanks. It looks good to me. ✅

harishnelakurthi commented 2 weeks ago

@yuokada - can you approve the PR please?

yuokada commented 2 weeks ago

I'm just a user of Trino. Sorry, I have no power to approve your PR.

Anyway, you need to first sign CLA to merge your PR. https://github.com/trinodb/trino-gateway/pull/466#issuecomment-2344544475

mosabua commented 2 weeks ago

I'm not convinced that using the HTTP headers and just add the cluster name for now is the right approach. If we add that now we probably end up with more and more requests to add all sorts of other data as HTTPS headers. I'm wondering if it would be better to find an encapsulating entity, maybe a session cookie or something else to just passed back and then individual users can stuff whatever they want into that without us having to adjust code. In fact I think this already might work as desired with the cookie based routing or it would be possible to adjust at least. wdyt @willmostly

harishnelakurthi commented 2 weeks ago

@mosabua - Not sure if I followed completely. Can you please point me the code on where the destination cluster info stuffed in the cookie today? Thanks!