druid-io / pydruid

A Python connector for Druid

Subquery not getting converted properly #293

Open veerappans opened 1 year ago

veerappans commented 1 year ago

I tried a simple subquery query:

```python
from pydruid.utils.aggregators import doublesum

# druid_conn is an existing pydruid client instance
sub_query = druid_conn.sub_query(
    datasource='twitterstream_res',
    granularity='day',
    intervals='2022-01-01/2022-01-03',
    dimensions=[],
    aggregations={"first_value": doublesum("REVENUE")},
)

group_query = druid_conn.timeseries(
    datasource=sub_query,
    granularity='day',
    intervals='2022-01-01/2022-01-03',
    aggregations={"outer_final_value": doublesum("first_value")},
)

df = group_query.export_pandas()
df
```

When I run this, I get a Druid query syntax error. The query is being converted to JSON incorrectly. Can anyone please help? This is the query that gets generated, followed by the call that produces it:

```json
{
  "aggregations": [
    {
      "fieldName": "first_value",
      "name": "outer_final_value",
      "type": "doubleSum"
    }
  ],
  "dataSource": {
    "dataSources": {
      "query": {
        "aggregations": [
          {
            "fieldName": "REVENUE",
            "name": "first_value",
            "type": "doubleSum"
          }
        ],
        "dataSource": "twitterstream",
        "granularity": "day",
        "intervals": "2022-01-01/2022-01-03",
        "queryType": "groupBy"
      },
      "type": "query"
    },
    "type": "union"
  },
  "granularity": "day",
  "intervals": "2022-01-01/2022-01-03",
  "queryType": "timeseries"
}
```

```python
druid_conn.timeseries(
    datasource=sub_query,
    granularity='day',
    intervals='2022-01-01/2022-01-03',
    aggregations={"outer_final_value": doublesum("first_value")},
)
```

The subquery is being wrapped in an extra `union` datasource.
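For comparison, Druid's native query datasource has the form `{"type": "query", "query": {...}}`, with no `union` wrapper around it. Below is a minimal sketch of the payload shape the outer query would presumably need, built by hand from the generated JSON above. The variable names `inner_query` and `expected` are illustrative only and are not part of the pydruid API.

```python
import json

# Inner groupBy query, copied from the generated JSON above.
inner_query = {
    "queryType": "groupBy",
    "dataSource": "twitterstream",
    "granularity": "day",
    "intervals": "2022-01-01/2022-01-03",
    "aggregations": [
        {"type": "doubleSum", "name": "first_value", "fieldName": "REVENUE"}
    ],
}

# Outer timeseries query: the subquery is used directly as a "query"
# datasource instead of being nested inside a "union" datasource.
expected = {
    "queryType": "timeseries",
    "dataSource": {"type": "query", "query": inner_query},
    "granularity": "day",
    "intervals": "2022-01-01/2022-01-03",
    "aggregations": [
        {
            "type": "doubleSum",
            "name": "outer_final_value",
            "fieldName": "first_value",
        }
    ],
}

print(json.dumps(expected, indent=2, sort_keys=True))
```

The only difference from the generated query is the `dataSource` value: the `{"type": "union", "dataSources": ...}` layer is gone.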

nbehnam commented 1 year ago

I ran into the same thing. The subquery returns a dict, but `parse_datasource` doesn't handle that properly: it assumes the value is a list rather than a dictionary, and so treats it as a list of data sources.
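To illustrate the behaviour described above, here is a simplified, hypothetical sketch. The functions `parse_datasource_naive` and `parse_datasource_fixed` are stand-ins written for this example; they are not the actual pydruid source, and the second one is just one possible approach to a fix, assuming a subquery dict should be passed through unchanged.

```python
def parse_datasource_naive(datasource):
    """Hypothetical simplification of the behaviour described above."""
    if isinstance(datasource, str):
        return datasource
    # Anything that is not a string is assumed to be a list of names,
    # so a subquery dict ends up wrapped in a "union" datasource.
    return {"type": "union", "dataSources": datasource}


def parse_datasource_fixed(datasource):
    """One possible fix: pass subquery dicts through unchanged."""
    if isinstance(datasource, (str, dict)):
        return datasource
    if isinstance(datasource, list) and all(
        isinstance(d, str) for d in datasource
    ):
        return {"type": "union", "dataSources": datasource}
    raise ValueError(
        "datasource must be a str, a list of str, or a subquery dict"
    )


# Shape of a subquery datasource, as seen in the generated JSON.
subquery = {"type": "query", "query": {"queryType": "groupBy"}}

print(parse_datasource_naive(subquery)["type"])  # union
print(parse_datasource_fixed(subquery)["type"])  # query
```

With the dict check in place, plain strings and lists of strings are handled as before, and only the subquery case changes.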

nbehnam commented 1 year ago

I created https://github.com/druid-io/pydruid/pull/299 which "should" fix it.