trinodb / trino-gateway

https://trinodb.github.io/trino-gateway/
Apache License 2.0

HTTP 500 Error: NoSuchElementException When Adhoc Cluster is Busy #447

Open alaturqua opened 3 weeks ago

alaturqua commented 3 weeks ago

Description:

We are encountering an issue with the Trino Gateway setup when querying multiple clusters. Below are the details of our current configuration and the problem:

Configuration:

Issue: When the adhoc cluster becomes busy, the JDBC connection queries for stats time out and the Trino Gateway becomes unreachable. The same thing happens if we deactivate the adhoc cluster during a redeployment or restart of the Trino cluster.

This results in the following error message:

HTTP ERROR 500 java.util.NoSuchElementException: No value present
URI: /v1/statement
STATUS: 500
MESSAGE: java.util.NoSuchElementException: No value present
SERVLET: trinoRouter
CAUSED BY: java.util.NoSuchElementException: No value present

Stack Trace:

java.util.NoSuchElementException: No value present
    at java.base/java.util.Optional.orElseThrow(Optional.java:377)
    at io.trino.gateway.ha.router.QueryCountBasedRouter.provideAdhocBackend(QueryCountBasedRouter.java:227)
    at io.trino.gateway.ha.handler.QueryIdCachingProxyHandler.getBackendFromRoutingGroup(QueryIdCachingProxyHandler.java:345)
    at io.trino.gateway.ha.handler.QueryIdCachingProxyHandler.rewriteTarget(QueryIdCachingProxyHandler.java:313)
    at io.trino.gateway.proxyserver.ProxyServletImpl.rewriteTarget(ProxyServletImpl.java:92)
    at org.eclipse.jetty.proxy.ProxyServlet.service(ProxyServlet.java:51)
    at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
    at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1665)
    at io.trino.gateway.proxyserver.RequestFilter.doFilter(RequestFilter.java:40)
    at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1580)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1553)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
    at org.eclipse.jetty.proxy.ConnectHandler.handle(ConnectHandler.java:203)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
    at org.eclipse.jetty.server.Server.handle(Server.java:563)
    at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
    at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
    at java.base/java.lang.Thread.run(Thread.java:1583)
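
The top two frames show `Optional.orElseThrow` being called from `QueryCountBasedRouter.provideAdhocBackend`, which suggests the router had no candidate backend left to choose from. The following is a minimal sketch of that failure mode, not the gateway's actual code: the `Backend` record, the `leastLoaded` helper, and the selection rule are all hypothetical stand-ins.

```java
import java.util.Comparator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Optional;

public class EmptyBackendDemo {
    // Hypothetical stand-in for per-cluster stats; not the gateway's real type.
    record Backend(String name, int queuedQueries, boolean healthy) {}

    // Assumed selection rule: the healthy backend with the fewest queued queries.
    // Returns an empty Optional when no backend is healthy.
    static Optional<Backend> leastLoaded(List<Backend> backends) {
        return backends.stream()
                .filter(Backend::healthy)
                .min(Comparator.comparingInt(Backend::queuedQueries));
    }

    public static void main(String[] args) {
        // The only adhoc backend is marked unhealthy, e.g. because its stats query timed out.
        List<Backend> backends = List.of(new Backend("adhoc", 120, false));
        try {
            // orElseThrow() on an empty Optional throws NoSuchElementException.
            leastLoaded(backends).orElseThrow();
        } catch (NoSuchElementException e) {
            System.out.println("caught: " + e.getMessage()); // caught: No value present
        }
    }
}
```

If nothing catches the exception in the request path, the servlet container reports it as a generic HTTP 500, which matches the error page shown above.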

Steps to Reproduce:

  1. Configure Trino Gateway with the above-mentioned clusters and routing groups.
  2. Use JDBC to query the adhoc cluster.
  3. Observe the error when the adhoc cluster is busy.

Expected Behavior: The Trino Gateway should handle busy clusters gracefully without causing a 500 error.

Actual Behavior: The gateway becomes unreachable with a 500 error when the adhoc cluster is busy.

Environment:

mosabua commented 3 weeks ago

What do you mean by "handle busy clusters gracefully"? There is no queue or anything similar in Trino Gateway; it just routes traffic to clusters. In this case, if adhoc is busy and no other cluster is available for routing, what should the Trino Gateway do?

Chaho12 commented 3 weeks ago

Hmm, I don't think we need a queue in Trino Gateway for now. I shared this in Slack once, but I think we should improve how the gateway reports a routing failure (whatever the reason).

As of now, it returns a 500 error page, which is not very friendly or intuitive for users trying to understand what went wrong.
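
One possible shape for a clearer routing failure is sketched below. It is only an illustration under stated assumptions: the `Response` record, the `route` method, the choice of HTTP 503, and the message wording are all hypothetical and do not reflect the gateway's existing API or any agreed design.

```java
import java.util.List;
import java.util.Optional;

public class RoutingFailureSketch {
    // Hypothetical response holder; a real handler would write to the servlet response.
    record Response(int status, String body) {}

    // Hypothetical selection: take the first healthy backend URL, if any.
    static Optional<String> pickBackend(List<String> healthyBackends) {
        return healthyBackends.stream().findFirst();
    }

    // Map "no backend available" to an explicit, descriptive error
    // instead of letting an exception surface as a generic 500 page.
    static Response route(String routingGroup, List<String> healthyBackends) {
        return pickBackend(healthyBackends)
                .map(url -> new Response(200, "forward to " + url))
                .orElseGet(() -> new Response(503,
                        "No healthy backend available for routing group '" + routingGroup + "'"));
    }

    public static void main(String[] args) {
        Response r = route("adhoc", List.of());
        System.out.println(r.status() + " " + r.body());
        // 503 No healthy backend available for routing group 'adhoc'
    }
}
```

A 503 is used here only because it conventionally signals a temporary inability to serve the request; whether that or another status fits best is a design decision for the project.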
