apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

Mapbox visualization error caused by "is not JSON serializable" #3604

Closed amoussoubaruch closed 6 years ago

amoussoubaruch commented 6 years ago

Make sure these boxes are checked before submitting your issue - thank you!

Superset version

0.19.1

Expected results

I am trying to draw a Mapbox visualization in Superset. My dataset has Latitude and Longitude columns, and I use them in the respective fields.

Actual results

TypeError: <superset.connectors.druid.models.DruidMetric object at 0xefbea90> is not JSON serializable

Steps to reproduce


Has anyone else already run into the same problem? Thanks.

xrmx commented 6 years ago

Please reproduce with the latest version, and also post the full backtrace properly quoted as text, not as a picture.

amoussoubaruch commented 6 years ago

This is the full backtrace I received:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/superset/viz.py", line 249, in get_payload
    df = self.get_df()
  File "/usr/lib/python2.7/site-packages/superset/viz.py", line 79, in get_df
    self.results = self.datasource.query(query_obj)
  File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 971, in query
    client=client, query_obj=query_obj, phase=2)
  File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 799, in get_query_str
    return self.run_query(client=client, phase=phase, **query_obj)
  File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 962, in run_query
    client.groupby(**qry)
  File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 192, in groupby
    return self._post(query)
  File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 391, in _post
    headers, querystr, url = self._prepare_url_headers_and_body(query)
  File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 34, in _prepare_url_headers_and_body
    querystr = json.dumps(query.query_dict).encode('utf-8')
  File "/usr/lib64/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib64/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <superset.connectors.druid.models.DruidMetric object at 0xb319650> is not JSON serializable
xrmx commented 6 years ago

Quote with 3 backticks please. And newlines.

amoussoubaruch commented 6 years ago
Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/superset/viz.py", line 249, in get_payload df = self.get_df() File "/usr/lib/python2.7/site-packages/superset/viz.py", line 79, in get_df self.results = self.datasource.query(query_obj) File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 971, in query client=client, query_obj=query_obj, phase=2) File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 799, in get_query_str return self.run_query(client=client, phase=phase, **query_obj) File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 962, in run_query client.groupby(**qry) File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 192, in groupby return self._post(query) File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 391, in _post headers, querystr, url = self._prepare_url_headers_and_body(query) File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 34, in _prepare_url_headers_and_body querystr = json.dumps(query.query_dict).encode('utf-8') File "/usr/lib64/python2.7/json/__init__.py", line 243, in dumps return _default_encoder.encode(obj) File "/usr/lib64/python2.7/json/encoder.py", line 207, in encode chunks = self.iterencode(o, _one_shot=True) File "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode return _iterencode(o, 0) File "/usr/lib64/python2.7/json/encoder.py", line 184, in default raise TypeError(repr(o) + " is not JSON serializable") TypeError: <superset.connectors.druid.models.DruidMetric object at 0xb319650> is not JSON serializable
amoussoubaruch commented 6 years ago

Is it OK now?

xrmx commented 6 years ago

No. Copy it from the server logs, it'll have the proper newlines.

amoussoubaruch commented 6 years ago
TypeError: <superset.connectors.druid.models.DruidMetric object at 0xb319650> is not JSON serializable
2017-10-05 15:09:24,759:INFO:root:Caching for the next 86400 seconds
2017-10-05 15:17:22,439:INFO:root:[stats_logger] (incr) explore
2017-10-05 15:17:23,749:INFO:root:[stats_logger] (incr) explore_json
2017-10-05 15:17:23,820:INFO:root:[stats_logger] (incr) loaded_from_source
2017-10-05 15:17:23,820:INFO:root:Serving from cache
2017-10-05 15:17:28,259:INFO:root:[stats_logger] (incr) explore_json
2017-10-05 15:17:28,332:INFO:root:[stats_logger] (incr) loaded_from_cache
2017-10-05 15:17:28,348:INFO:root:Caching for the next 86400 seconds
2017-10-05 15:17:40,590:INFO:root:[stats_logger] (incr) explore_json
2017-10-05 15:17:40,659:INFO:root:[stats_logger] (incr) loaded_from_cache
2017-10-05 15:17:40,664:ERROR:root:<superset.connectors.druid.models.DruidMetric object at 0x95a5410> is not JSON serializable
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/superset/viz.py", line 249, in get_payload
    df = self.get_df()
  File "/usr/lib/python2.7/site-packages/superset/viz.py", line 79, in get_df
    self.results = self.datasource.query(query_obj)
  File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 971, in query
    client=client, query_obj=query_obj, phase=2)
  File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 799, in get_query_str
    return self.run_query(client=client, phase=phase, **query_obj)
  File "/usr/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 962, in run_query
    client.groupby(**qry)
  File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 192, in groupby
    return self._post(query)
  File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 391, in _post
    headers, querystr, url = self._prepare_url_headers_and_body(query)
  File "/usr/lib/python2.7/site-packages/pydruid/client.py", line 34, in _prepare_url_headers_and_body
    querystr = json.dumps(query.query_dict).encode('utf-8')
  File "/usr/lib64/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib64/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <superset.connectors.druid.models.DruidMetric object at 0x95a5410> is not JSON serializable
2017-10-05 15:17:40,665:INFO:root:Caching for the next 86400 seconds
amoussoubaruch commented 6 years ago

This is the log copied directly from the server. I use Druid as the backend database.

amoussoubaruch commented 6 years ago

@xrmx

amoussoubaruch commented 6 years ago

@georgeke

amoussoubaruch commented 6 years ago

@xrmx

Fokko commented 6 years ago

I'm experiencing the same issue

ahsanshah commented 6 years ago

Same here. It would be great if this worked, as there are many use cases for interactive exploration of geocoded (lat/long) datasets. As an aside, would something like Leaflet.js be possible with Superset?

bolkedebruin commented 6 years ago

import json

from superset.connectors.druid.models import DruidMetric
from superset.utils import base_json_conv

# Serialize a bare DruidMetric, with Superset's base_json_conv as the fallback converter
m = DruidMetric()
json.dumps(m, default=base_json_conv)

'null'

DruidMetric probably needs a default method from JSONEncoder.

ghost commented 6 years ago

I also have the same issue. How can we set this default method?

bolkedebruin commented 6 years ago

Digging a bit deeper: unfortunately, adding a "default()" method will not work. The json module only consults the default hook passed to json.dumps (or defined on a JSONEncoder subclass); it never looks for serialization methods on the classes being encoded.

However, it seems that somewhere a DruidMetric is not being converted via .json_obj(), as is done in run_query for metrics. Somehow one (or more) slips through.
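
To illustrate the point about the json module, here is a minimal sketch. The `Metric` class and the `to_jsonable` helper are hypothetical stand-ins invented for this example; they are not Superset code. It shows that a `default()` method on the object is ignored, while a callable passed as `default=` is used.

```python
import json


class Metric(object):
    """Hypothetical stand-in for DruidMetric; not Superset's actual class."""

    def __init__(self, name):
        self.name = name

    def default(self):
        # json.dumps never looks for this method on the object being encoded.
        return {"name": self.name}


def to_jsonable(obj):
    """Fallback passed via default=; called only for types json cannot encode."""
    if isinstance(obj, Metric):
        return {"name": obj.name}
    raise TypeError(repr(obj) + " is not JSON serializable")


metric = Metric("count")

# Without a default= hook, the default() method above does not help.
try:
    json.dumps({"metric": metric})
    method_was_used = True
except TypeError:
    method_was_used = False

# With the hook, the encoder delegates unknown objects to to_jsonable.
encoded = json.dumps({"metric": metric}, default=to_jsonable)
```

Since pydruid calls json.dumps itself (see the traceback above), a hook like this would have to be applied there, which is why converting the object before building the query is the more practical route.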

bolkedebruin commented 6 years ago

OK, I found the issue. The dimensions of the LimitSpec part of the query are populated with metrics or self.metrics. That yields a DruidMetric instead of a column name.

I think the logic (run_query) is wrong here and it should use dimensions instead of metrics. I'm not entirely sure and need some help from @Fokko or @mistercrunch. Then the PR is quite simple.

Offending lines:

  1. https://github.com/apache/incubator-superset/blob/master/superset/connectors/druid/models.py#L1139
  2. https://github.com/apache/incubator-superset/blob/master/superset/connectors/druid/models.py#L1188
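
A rough sketch of the failure mode described above, under stated assumptions: `FakeMetric` and `build_limit_spec` are hypothetical names made up for illustration, and the dictionary layout only approximates a Druid limit spec. It is not the actual run_query code or the actual fix.

```python
import json


class FakeMetric(object):
    """Hypothetical stand-in for a DruidMetric ORM object."""

    def __init__(self, metric_name):
        self.metric_name = metric_name


def build_limit_spec(order_by, limit):
    """Build a Druid-style limit spec; order_by is meant to be a plain name."""
    return {
        "type": "default",
        "limit": limit,
        "columns": [{"dimension": order_by, "direction": "descending"}],
    }


metric = FakeMetric("count")

# Passing the object itself reproduces the kind of error seen in the traceback.
try:
    json.dumps(build_limit_spec(metric, 10))
    failed = False
except TypeError:
    failed = True

# Passing a name (a plain string) serializes without trouble.
body = json.dumps(build_limit_spec(metric.metric_name, 10))
```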
mistercrunch commented 6 years ago

A few tangents here:

It's been tricky to manage how aggregations work for geospatial data. In many cases users do not want to aggregate and just use atomic data; in other cases people do want to aggregate. There's also the fact that with WebGL / Deck.GL we can actually afford to visualize hundreds of thousands of data points, and aggregate only as technically necessary. The way the new Deck.GL visualizations work is that they don't aggregate unless you specify a metric.

I'm unclear on how to handle this original Mapbox visualization in light of the recent work we've been doing with Deck.GL. The new scatterplot viz is much better than this one, but it doesn't support the dynamic clustering use case (where clusters are created on the fly as users zoom in/out). We may want to refactor that visualization to be more like the new ones. The first thing to do would be to use the spatial control for lat/long, which works with proper Druid spatial dims.

bolkedebruin commented 6 years ago

@mistercrunch the fix I made is a bug fix in the spec of run_query. It should not be possible to pass a DruidMetric directly into the JSON sent to Druid. Mapbox just exposes the bug, but it could occur in other circumstances as well.

In other words, the PR fixes this issue by fixing a fundamental bug. PTAL.
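
For anyone debugging a similar case, one way to locate a stray object in a query dict before it reaches json.dumps is sketched below. `find_unserializable` is a hypothetical helper written for this example, not part of Superset or pydruid, and the sample query is made up.

```python
import json


def find_unserializable(value, path="$"):
    """Return the paths of leaf values that json.dumps would reject."""
    if isinstance(value, dict):
        found = []
        for key, item in value.items():
            found.extend(find_unserializable(item, "%s.%s" % (path, key)))
        return found
    if isinstance(value, (list, tuple)):
        found = []
        for index, item in enumerate(value):
            found.extend(find_unserializable(item, "%s[%d]" % (path, index)))
        return found
    try:
        json.dumps(value)
    except TypeError:
        return [path]
    return []


# object() stands in for a model instance that slipped into the query.
query = {
    "dimensions": ["city"],
    "limit_spec": {"columns": [{"dimension": object(), "direction": "descending"}]},
}
bad_paths = find_unserializable(query)
```

Here `bad_paths` points at the offending entry inside the limit spec, which is quicker to act on than the bare TypeError raised by the encoder.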

mistercrunch commented 6 years ago

@bolkedebruin gotcha, let me review the PR. I just needed to gather some thoughts around what we're going to do with the Mapbox viz and used this PR as a vehicle.

bolkedebruin commented 6 years ago

Cheers. No prob

Fokko commented 6 years ago

@amoussoubaruch This issue can be closed; it has been fixed.

chirpy2291 commented 6 years ago

Hi, I am facing the same issue. When running a single custom dimension I get the following error: KeyError: u"None of [[u'outputName', u'extractionFn', u'type', u'outputType', u'dimension']] are in the [index]"

Can you specify which version of Druid to use, and what the fix is?

Fokko commented 6 years ago

I don't think it really has to do with Druid. Which version of Superset are you running?

chirpy2291 commented 6 years ago

Hi,

Using 0.20.

Two issues I am facing:

  1. Extensions are not filterable.
  2. A single extension dimension is not groupable. It throws the error specified in my post above.

Is there any reason why a topN query doesn't work in Superset for a single extension dimension? The same query works if I send it to Druid directly.

On Thu, Feb 1, 2018 at 5:38 PM, Fokko Driesprong notifications@github.com wrote:

I don't think it really has to do with Druid. Which version Superset are you running?


bolkedebruin commented 6 years ago

@chirpy2291 the fix is new (not even in a released version afaik) and you are running an old version of superset. So why not use master first and then open a new issue if needed instead of hijacking this closed issue?

chirpy2291 commented 6 years ago

Since the code is in production, it's not possible to include all the new features and release within the given timeframe. So sometimes, when we have to move forward in a short span of time, we have to look back a little. I thought that you, being a contributor, would be able to help. Sorry for the inconvenience.

On Sun, 4 Feb 2018 at 4:58 PM, bolkedebruin notifications@github.com wrote:

@chirpy2291 https://github.com/chirpy2291 the fix is new (not even in a released version afaik) and you are running an old version of superset. So why not use master first and then open a new issue if needed instead of hijacking this closed issue?
