druid-io / pydruid

A Python connector for Druid

misleading error message #4

Closed jstrunk closed 10 years ago

jstrunk commented 10 years ago

Dear Deep,

I was trying out a groupBy query on metrics. pyDruid told me I had a malformed query, but when I ran the generated query through curl, it worked.

import pydruid.client
import datetime

bard_url = 'http://x.x.x.x:8080/'
endpoint = 'druid/v2/?pretty'
query = pydruid.client.pyDruid(bard_url,endpoint)

dataSource = 'mmx_metrics'
filters = (pydruid.client.Dimension("metric") == "query/time") & (pydruid.client.Dimension("service") == "druid/prod/bard")
intervals = [datetime.datetime.utcnow().isoformat() + '/PT5M']

foo = query.groupBy(dataSource=dataSource, intervals=intervals, granularity="minute", dimensions=['host','service'], aggregations = {"count": pydruid.client.doubleSum("count")}, filter=filters)

Gives me:

---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-11-9d231bdb8e44> in <module>()
----> 1 foo = query.groupBy(dataSource=dataSource, intervals=intervals, granularity="minute", dimensions=['host','service'], aggregations = {"count": pydruid.client.doubleSum("count")}, filter=filters)

/usr/lib/python2.7/site-packages/pyDruid-0.1.7-py2.7.egg/pydruid/client.pyc in groupBy(self, **args)
    157                 self.query_dict = query_dict
    158                 self.query_type = 'groupby'
--> 159                 return self.post(query_dict)
    160 
    161         def segmentMetadata(self, **args):

/usr/lib/python2.7/site-packages/pyDruid-0.1.7-py2.7.egg/pydruid/client.pyc in post(self, query)
     47                         res.close()
     48                 except urllib2.HTTPError, e:
---> 49                         raise IOError('Malformed query: \n {0}'.format(json.dumps(self.query_dict, indent = 4)))
     50                 else:
     51                         self.result = self.parse()

IOError: Malformed query: 
 {
    "dimensions": [
        "host",
        "service"
    ],
    "aggregations": [
        {
            "type": "doubleSum",
            "fieldName": "count",
            "name": "count"
        }
    ],
    "filter": {
        "fields": [
            {
                "type": "selector",
                "dimension": "metric",
                "value": "query/time"
            },
            {
                "type": "selector",
                "dimension": "service",
                "value": "druid/prod/bard"
            }
        ],
        "type": "and"
    },
    "intervals": [
        "2013-12-06T00:38:38.760172/PT5M"
    ],
    "dataSource": "mmx_metrics",
    "granularity": "minute",
    "queryType": "groupBy"
}

I put the generated query into /tmp/query.druid and ran the following:

curl -X POST "http://x.x.x.x:8080/druid/v2/?pretty" -H 'content-type: application/json' -d @/tmp/query.druid

It returned the results I expected.

I saw this with both the pip-installed version and the git version.

-Jeff

dganguli commented 10 years ago

The bard URL must not end in a slash. I will add a check in client.post() that strips trailing slashes. Thanks for finding this bug!
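
For anyone hitting this before the fix lands, here is a minimal sketch of the kind of normalization described above. It assumes the client builds the request URL by joining the base URL and the endpoint with a slash; the thread does not show that code, so treat this as an illustration. The helper name build_query_url is hypothetical and is not part of pydruid.

```python
def build_query_url(base_url, endpoint):
    """Join a base URL and an endpoint with exactly one slash between them.

    Hypothetical helper (not part of pydruid): strips trailing slashes from
    the base URL and leading slashes from the endpoint before joining, so
    'http://host:8080/' and 'http://host:8080' yield the same result.
    """
    return base_url.rstrip('/') + '/' + endpoint.lstrip('/')


# The URL from the report above, with and without the trailing slash.
with_slash = build_query_url('http://x.x.x.x:8080/', 'druid/v2/?pretty')
without_slash = build_query_url('http://x.x.x.x:8080', 'druid/v2/?pretty')

print(with_slash)
print(without_slash)
```

Until such a check exists in the client, passing bard_url = 'http://x.x.x.x:8080' (no trailing slash) in the original snippet should sidestep the problem.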