druid-io / pydruid

A Python connector for Druid

[Question] groupby query throwing HTTP Error 500 sometimes #233

Open ravikanth39 opened 3 years ago

ravikanth39 commented 3 years ago

Hi, I am running the groupby query below against my Druid datasource (which is running on Kubernetes) using pydruid:

```python
from pydruid.client import PyDruid
from pydruid.utils.aggregators import doublemax

query = PyDruid('http://druid_router_ip:druid_router_port', 'druid/v2/')
query.groupby(
    datasource='INTERFACE_PERFORMANCE',
    granularity='all',
    intervals=previous_day_time + '/' + current_time,
    dimensions=["nodeLabel", "instance"],
    limit_spec={
        "type": "default",
        "columns": [{
            "dimension": "UtilizationIn_Max",
            "direction": "descending",
            "dimensionOrder": {"type": "numeric"}
        }],
        "limit": 500
    },
    aggregations={"UtilizationIn_Max": doublemax("UtilizationIn")}
)
```

It runs fine most of the time, but roughly once every 20-30 queries it throws the error below:

```
HTTP Error 500: Server Error
Druid Error: {'error': 'Unknown exception', 'errorMessage': 'org.jboss.netty.channel.ChannelException: Channel disconnected', 'errorClass': 'java.util.concurrent.ExecutionException', 'host': '10.244.3.38:8100'}
Query is: {
    "aggregations": [
        {
            "fieldName": "UtilizationIn",
            "name": "UtilizationIn_Max",
            "type": "doubleMax"
        }
    ],
    "dataSource": "INTERFACE_PERFORMANCE",
    "dimensions": [
        "nodeLabel",
        "instance"
    ],
    "granularity": "all",
    "intervals": "2020-08-25T01:32:35.725-04:00/2020-08-26T01:32:35.725-04:00",
    "limitSpec": {
        "columns": [
            {
                "dimension": "UtilizationIn_Max",
                "dimensionOrder": {
                    "type": "numeric"
                },
                "direction": "descending"
            }
        ],
        "limit": 500,
        "type": "default"
    },
    "queryType": "groupBy"
}
```

The error says the channel was disconnected, but I could not find out why by going through Druid's logs. I don't know whether the error is on the pydruid side or whether I am hitting a resource limit on Druid and need to change the configuration. The same query runs fine from the Druid console. Any help is greatly appreciated, and if you need any more info please let me know.
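For reference, this is the minimal retry sketch I am falling back on while the root cause is unclear. It assumes pydruid surfaces the HTTP 500 as an `IOError` (which is what the message above looks like), and it reuses the placeholder router address and the literal interval from the failing query; `groupby_with_retry` is just a helper name I made up:

```python
import time

from pydruid.client import PyDruid
from pydruid.utils.aggregators import doublemax


def groupby_with_retry(client, query_kwargs, retries=3, backoff_seconds=5):
    """Retry the groupby query a few times when Druid returns a transient 500."""
    for attempt in range(1, retries + 1):
        try:
            return client.groupby(**query_kwargs)
        except IOError as exc:  # assumption: pydruid wraps the HTTP 500 in an IOError
            if attempt == retries:
                raise
            print(f"Attempt {attempt} failed ({exc}); retrying in {backoff_seconds}s")
            time.sleep(backoff_seconds)


# Placeholder router address; same query as above with a fixed interval.
client = PyDruid('http://druid_router_ip:druid_router_port', 'druid/v2/')
result = groupby_with_retry(client, {
    "datasource": "INTERFACE_PERFORMANCE",
    "granularity": "all",
    "intervals": "2020-08-25T01:32:35.725-04:00/2020-08-26T01:32:35.725-04:00",
    "dimensions": ["nodeLabel", "instance"],
    "limit_spec": {
        "type": "default",
        "columns": [{
            "dimension": "UtilizationIn_Max",
            "direction": "descending",
            "dimensionOrder": {"type": "numeric"},
        }],
        "limit": 500,
    },
    "aggregations": {"UtilizationIn_Max": doublemax("UtilizationIn")},
})
```

This only papers over the intermittent failures, so I would still like to understand why the channel gets disconnected in the first place.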