I'm getting random "Error opening rules configuration file" when using Routing Rule Engine. It only happens when I have 2 or more clients sending requests to presto-gateway.
ERROR [2022-04-04 21:03:57,026] com.lyft.data.gateway.ha.router.RoutingGroupSelector$Logger: Error opening rules configuration file, using routing group header as default.
! org.yaml.snakeyaml.parser.ParserException: while parsing a block node
! in 'reader', line 13, column 1:
!
! ^
! expected the node content, but found '<block end>'
! in 'reader', line 13, column 1:
!
! ^
!
! at org.yaml.snakeyaml.parser.ParserImpl.parseNode(ParserImpl.java:482)
! at org.yaml.snakeyaml.parser.ParserImpl.access$1300(ParserImpl.java:117)
! at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockNode.produce(ParserImpl.java:359)
! at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockSequenceEntry.produce(ParserImpl.java:506)
! at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockSequenceFirstEntry.produce(ParserImpl.java:496)
! at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:158)
! at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(ParserImpl.java:148)
! at org.yaml.snakeyaml.composer.Composer.checkNode(Composer.java:78)
! at org.yaml.snakeyaml.constructor.BaseConstructor.checkData(BaseConstructor.java:123)
! at org.yaml.snakeyaml.Yaml$1.hasNext(Yaml.java:489)
! at org.jeasy.rules.support.reader.YamlRuleDefinitionReader.loadRules(YamlRuleDefinitionReader.java:71)
! at org.jeasy.rules.support.reader.AbstractRuleDefinitionReader.read(AbstractRuleDefinitionReader.java:43)
! at org.jeasy.rules.mvel.MVELRuleFactory.createRules(MVELRuleFactory.java:100)
! at com.lyft.data.gateway.ha.router.RoutingGroupSelector.lambda$byRoutingRulesEngine$1(RoutingGroupSelector.java:42)
! at com.lyft.data.gateway.ha.handler.QueryIdCachingProxyHandler.rewriteTarget(QueryIdCachingProxyHandler.java:123)
! at com.lyft.data.proxyserver.ProxyServletImpl.rewriteTarget(ProxyServletImpl.java:65)
...
Using:
java version "1.8.0_202"
Python 3.7.13
trino-python-client 0.3.11.0
rules.yaml:
---
name: 'airflow'
description: 'if query from airflow, route to etl group'
condition: 'request.getHeader("X-Trino-Source") == "trino-python-client"'
actions:
- 'result.put("routingGroup", "adhoc")'
python client script:
#!/usr/bin/python
from trino.dbapi import connect
import time
conn = connect(
host='localhost',
port=8090,
user='user1',
catalog='system',
schema='runtime',
)
cur = conn.cursor()
while 1 < 2:
cur.execute('SELECT * from nodes limit 1')
rows = cur.fetchall()
print (rows)
Other Observations:
When the error is raised, presto-gateway routes request to default RoutingGroup "adhoc". If RoutingGroup "adhoc" does not exist, client request fails.
CPU usage keeps increasing over time on the JVM. I stopped my python clients after 30mins, but CPU usage was still at 25%. If I leave it running for days, it will eventually eat up all my CPU cores.
After further investigation, I narrowed it down to MVELRuleFactory class. I'm not a developer so I can't really explain what is happening but the following fixes the issue.
--- a/gateway-ha/src/main/java/com/lyft/data/gateway/ha/router/RoutingGroupSelector.java
+++ b/gateway-ha/src/main/java/com/lyft/data/gateway/ha/router/RoutingGroupSelector.java
@@ -34,11 +34,11 @@ public interface RoutingGroupSelector {
* to determine the right routing group.
*/
static RoutingGroupSelector byRoutingRulesEngine(String rulesConfigPath) {
- RulesEngine rulesEngine = new DefaultRulesEngine();
- MVELRuleFactory ruleFactory = new MVELRuleFactory(new YamlRuleDefinitionReader());
return request -> {
try {
+ RulesEngine rulesEngine = new DefaultRulesEngine();
+ MVELRuleFactory ruleFactory = new MVELRuleFactory(new YamlRuleDefinitionReader());
Rules rules = ruleFactory.createRules(
new FileReader(rulesConfigPath));
Facts facts = new Facts();
It ran for hours without any errors and CPU usage was stable.
If anyone can provide more input, It would be appreciated.
I'm getting random "Error opening rules configuration file" when using Routing Rule Engine. It only happens when I have 2 or more clients sending requests to presto-gateway.
Using: java version "1.8.0_202" Python 3.7.13 trino-python-client 0.3.11.0
rules.yaml:
python client script:
Other Observations:
After further investigation, I narrowed it down to MVELRuleFactory class. I'm not a developer so I can't really explain what is happening but the following fixes the issue.
It ran for hours without any errors and CPU usage was stable. If anyone can provide more input, It would be appreciated.