trinodb / trino-gateway

https://trinodb.github.io/trino-gateway/
Apache License 2.0

Routing error when requestAnalyzerConfig is True #487

Open ndrluis opened 1 month ago

ndrluis commented 1 month ago

Hello,

I activated the io.trino.gateway.ha.module.QueryCountBasedRouterProvider and have two routing groups, each with one cluster.

I also have a rule that directs queries from the "dbt-trino" source to the large routing group. However, when I execute multiple queries, some of them are routed to the adhoc routing group instead. Is this expected behavior, for example because the QueryCount router takes priority over the routing rule, or is it a bug?

Routing Rule

---
name: "dbt"
description: "if query from dbt"
condition: "request.getHeader(\"X-Trino-Source\").startsWith(\"dbt-trino\")"
actions:
  - "result.put(\"routingGroup\", \"large\")"
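
The rule above can be read as a predicate on the X-Trino-Source header followed by an assignment of the routing group. The following is a minimal Java sketch of that logic, not the gateway's implementation: the class and method names are hypothetical, and using "adhoc" as the group when the rule does not match is an assumption based on the routing described in this report. Unlike the rule's condition, the sketch guards against a missing header, because in plain Java calling startsWith on a null header value throws a NullPointerException.

```java
import java.util.Map;

class SourceRuleSketch {
    // Hypothetical stand-in for the rule condition: true only when the
    // X-Trino-Source header is present and starts with "dbt-trino".
    public static boolean matchesDbt(Map<String, String> headers) {
        String source = headers.get("X-Trino-Source");
        return source != null && source.startsWith("dbt-trino");
    }

    // Hypothetical stand-in for the rule action. The "adhoc" fallback is an
    // assumption taken from the behavior described in this report.
    public static String routingGroup(Map<String, String> headers) {
        return matchesDbt(headers) ? "large" : "adhoc";
    }

    public static void main(String[] args) {
        System.out.println(routingGroup(Map.of("X-Trino-Source", "dbt-trino-1.8.0"))); // large
        System.out.println(routingGroup(Map.of("X-Trino-Source", "trino-cli")));       // adhoc
        System.out.println(routingGroup(Map.of()));                                    // adhoc
    }
}
```

With a rule of this shape, every request that carries a source beginning with "dbt-trino" should resolve to large, so a dbt query landing on adhoc suggests the rule was not evaluated for that request.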

Clusters

image

Routing History

image

ndrluis commented 1 month ago

I removed the QueryCountBasedRouterProvider, but the last query from dbt is still being routed to the adhoc group. The difference is that this last query is a CREATE TABLE statement. I ran 5 tests, and in all of them the last query was routed to adhoc.

This might be a coincidence, though: when I run only a CREATE TABLE statement, the query is routed to the large routing group.

ndrluis commented 1 month ago

Sometimes this error appears in the log:

2024-09-26T22:50:44.831Z    ERROR   http-worker-62  io.trino.gateway.ha.router.RuleReloadingRoutingGroupSelector    Error opening rules configuration file, using routing group header as default.
com.google.common.base.VerifyException: Identifier cannot be empty or null
    at com.google.common.base.Verify.verify(Verify.java:126)
    at io.trino.sql.tree.Identifier.isValidIdentifier(Identifier.java:133)
    at io.trino.sql.tree.Identifier.<init>(Identifier.java:56)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:212)
    at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:722)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:556)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:546)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:702)
    at io.trino.sql.tree.QualifiedName.of(QualifiedName.java:42)
    at io.trino.gateway.ha.router.TrinoQueryProperties.qualifyName(TrinoQueryProperties.java:386)
    at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:335)
    at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
    at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
    at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
    at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
    at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
    at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
    at io.trino.gateway.ha.router.TrinoQueryProperties.processRequestBody(TrinoQueryProperties.java:197)
    at io.trino.gateway.ha.router.TrinoQueryProperties.<init>(TrinoQueryProperties.java:140)
    at io.trino.gateway.ha.router.RuleReloadingRoutingGroupSelector.findRoutingGroup(RuleReloadingRoutingGroupSelector.java:99)
    at io.trino.gateway.ha.handler.RoutingTargetHandler.getBackendFromRoutingGroup(RoutingTargetHandler.java:86)
    at io.trino.gateway.ha.handler.RoutingTargetHandler.lambda$getRoutingDestination$0(RoutingTargetHandler.java:66)
    at java.base/java.util.Optional.orElseGet(Optional.java:364)
    at io.trino.gateway.ha.handler.RoutingTargetHandler.getRoutingDestination(RoutingTargetHandler.java:66)
    at io.trino.gateway.proxyserver.RouteToBackendResource.postHandler(RouteToBackendResource.java:67)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189)
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:159)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
    at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:274)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:266)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:253)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:696)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:397)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:349)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:358)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:312)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
    at org.eclipse.jetty.ee10.servlet.ServletHolder.handle(ServletHolder.java:736)
    at org.eclipse.jetty.ee10.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1614)
    at org.eclipse.jetty.ee10.servlet.ServletHandler$MappedServlet.handle(ServletHandler.java:1547)
    at org.eclipse.jetty.ee10.servlet.ServletChannel.dispatch(ServletChannel.java:824)
    at org.eclipse.jetty.ee10.servlet.ServletChannel.handle(ServletChannel.java:436)
    at org.eclipse.jetty.ee10.servlet.ServletHandler.handle(ServletHandler.java:464)
    at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:597)
    at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1060)
    at org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)
    at org.eclipse.jetty.server.handler.EventsHandler.handle(EventsHandler.java:81)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:151)
    at org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)
    at org.eclipse.jetty.server.handler.EventsHandler.handle(EventsHandler.java:81)
    at org.eclipse.jetty.server.Server.handle(Server.java:181)
    at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:648)
    at org.eclipse.jetty.server.internal.HttpConnection.onFillable(HttpConnection.java:403)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)
    at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:478)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:441)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:293)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
    at org.eclipse.jetty.util.thread.MonitoredQueuedThreadPool$1.run(MonitoredQueuedThreadPool.java:73)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)
    at java.base/java.lang.Thread.run(Thread.java:1570)
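
The first line of the log entry says that, after the failure, the selector uses the routing group header as the default. The following is a minimal Java sketch of that fallback pattern, not the gateway's actual code: the class name, method name, and the functional parameter are hypothetical, and the header name X-Trino-Routing-Group is an assumption, since the log only says "routing group header".

```java
import java.util.Map;
import java.util.function.Function;

class FallbackSelectorSketch {
    // Assumed header name; the log message does not spell it out.
    static final String ROUTING_GROUP_HEADER = "X-Trino-Routing-Group";

    // Tries the rule-based selection first. If it fails with a runtime
    // exception, returns the routing group header value instead, which is
    // null when the client did not send that header.
    public static String findRoutingGroup(
            Map<String, String> headers,
            Function<Map<String, String>, String> ruleEngine) {
        try {
            return ruleEngine.apply(headers);
        } catch (RuntimeException e) {
            return headers.get(ROUTING_GROUP_HEADER);
        }
    }

    public static void main(String[] args) {
        Map<String, String> dbtHeaders = Map.of("X-Trino-Source", "dbt-trino");
        // Simulates the analyzer failure reported in the stack trace.
        Function<Map<String, String>, String> failing = h -> {
            throw new IllegalStateException("Identifier cannot be empty or null");
        };
        System.out.println(findRoutingGroup(dbtHeaders, h -> "large"));
        System.out.println(findRoutingGroup(dbtHeaders, failing));
    }
}
```

In this sketch a failing request without a routing group header yields no group at all. If the gateway then applies its default group, that would be consistent with the dbt queries being routed to adhoc whenever this error is logged.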

ndrluis commented 1 month ago

I discovered that this error only happens when requestAnalyzerConfig is set to True. Here is an example of a query that triggers it:

CREATE OR REPLACE TABLE "catalog"."database"."example_table"

WITH (
      "extra_properties" = MAP(
        ARRAY['optimizer.enabled', 'compaction.enabled'],
        ARRAY['false', 'false']
      ))
AS (

WITH cte_example_sheets AS (
  SELECT *
  FROM "catalog"."database"."example_source"
)

, cte_example AS (
  SELECT
    CAST(field1 AS VARCHAR) AS alias1
    , CAST(field2 AS VARCHAR) AS alias2
    , CAST(field3 AS VARCHAR) AS alias3
    , CAST(SPLIT_PART(field4, '-', 1) AS VARCHAR) AS alias4
    , CAST(SPLIT_PART(field4, '-', 2) AS VARCHAR) AS alias5
    , CAST(field5 AS VARCHAR) AS alias6
    , CAST(DATE_PARSE(field6, '%Y-%m-%d %H:%i') AS TIMESTAMP(6)) AS alias7
    , CAST(DATE_PARSE(field7, '%Y-%m-%d %H:%i') AS TIMESTAMP(6)) AS alias8
    , CAST(DATE_PARSE(field8, '%Y-%m-%d %H:%i') AS TIMESTAMP(6)) AS alias9
    , CAST(DATE_PARSE(field9, '%Y-%m-%d %H:%i') AS TIMESTAMP(6)) AS alias10
    , CAST(field10 AS VARCHAR) AS alias11
    , CAST(field11 AS VARCHAR) AS alias12
    , CAST(field12 AS VARCHAR) AS alias13
  FROM cte_example_sheets
)

SELECT * FROM cte_example
)

The problem is that the exact same query runs without error when I use the Trino CLI as the client, but it fails every time I use the dbt client.
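
The stack trace earlier in this thread fails in qualifyName while building a QualifiedName, with the message "Identifier cannot be empty or null". The following Java sketch shows one way such a failure can arise. It is a hypothetical reconstruction, not the gateway's code: it assumes that names with fewer than three parts are completed with defaults from the X-Trino-Catalog and X-Trino-Schema request headers, and that a missing header becomes an empty string. Whether that is what differs between the Trino CLI and the dbt client is not established in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class QualifyNameSketch {
    // Mirrors the check named in the trace: a name part must be non-empty.
    public static String identifier(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Identifier cannot be empty or null");
        }
        return value;
    }

    // Hypothetical: completes a one- or two-part name with header defaults,
    // then validates every part of the resulting three-part name.
    public static List<String> qualify(List<String> parts, Map<String, String> headers) {
        String catalog = headers.getOrDefault("X-Trino-Catalog", "");
        String schema = headers.getOrDefault("X-Trino-Schema", "");
        List<String> full = new ArrayList<>();
        if (parts.size() == 1) {
            full.add(catalog);
            full.add(schema);
        } else if (parts.size() == 2) {
            full.add(catalog);
        }
        full.addAll(parts);
        List<String> validated = new ArrayList<>();
        for (String part : full) {
            validated.add(identifier(part));
        }
        return validated;
    }

    public static void main(String[] args) {
        Map<String, String> noDefaults = Map.of();
        // A fully qualified name from the query needs no defaults.
        System.out.println(qualify(List.of("catalog", "database", "example_source"), noDefaults));
        // A one-part name, such as a CTE reference, needs both defaults.
        try {
            qualify(List.of("cte_example_sheets"), noDefaults);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Under these assumptions, the fully qualified table names in the query pass validation, while a one-part reference such as cte_example_sheets fails whenever the request carries no default catalog or schema.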