apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
62.3k stars 13.68k forks source link

Apache Superset - Filtering in Subqueries #4181

Closed kartikshanbhag3333 closed 6 years ago

kartikshanbhag3333 commented 6 years ago

Make sure these boxes are checked before submitting your issue - thank you!

Superset version

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Superset 0.22.1 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Expected results

So, I have a table with following userWeight and have to define a metric called Reach(%). Now Reach for a service is defined as

SUM(userWeights for the service)/SUM(All userWeights for Distinct Users)

The workaround I found out was SUM(DISTINCT userWeights)/(SELECT SUM(DISTINCT userWeights) FROM Table_Name)

Now this works great at the Overall Level, But I have a Filter Box for Age Group, Gender, and City. So, Ideally, I would want

SUM(DISTINCT userWeights) / (SELECT SUM(DISTINCT userWeights) FROM Table_Name WHERE [Conditions From Filter Box])

Actual results

What is happening is SELECT service,SUM(DISTINCT userWeights) / (SELECT SUM(DISTINCT userWeights) FROM Table_name) WHERE gender = "Male" and age_group = "24 - 30 yrs"

Which Produces Wrong results for the required Metric. I know I can do it Via SQL Lab from Jinja Templating. But I don't know how to do it

  1. I am less aware of Jinja templating and particularly how do we do it for Superset.

  2. I am getting following stacktrace while running a query

Sorry, something went wrong

500 - Internal Server Error


Stacktrace


 Traceback (most recent call last): File "/usr/local/anaconda2/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app response = self.full_dispatch_request() File "/usr/local/anaconda2/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request rv = self.handle_user_exception(e) File "/usr/local/anaconda2/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception reraise(exc_type, exc_value, tb) File "/usr/local/anaconda2/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request rv = self.dispatch_request() File "/usr/local/anaconda2/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request return self.view_functions[rule.endpoint](**req.view_args) File "/usr/local/anaconda2/lib/python2.7/site-packages/flask_appbuilder/security/decorators.py", line 52, in wraps return f(self, *args, **kwargs) File "/usr/local/anaconda2/lib/python2.7/site-packages/superset/models/core.py", line 870, in wrapper value = f(*args, **kwargs) File "/usr/local/anaconda2/lib/python2.7/site-packages/superset/views/core.py", line 2145, in sql_json session.flush() File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2139, in flush self._flush(objects) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2259, in _flush transaction.rollback(_capture_exception=True) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__ compat.reraise(exc_type, exc_value, exc_tb) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2223, in _flush flush_context.execute() File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 389, in execute rec.execute(self) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 548, in execute uow File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 181, in save_obj mapper, table, insert) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 835, in _emit_insert_statements execute(statement, params) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 945, in execute return meth(self, multiparams, params) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection return connection._execute_clauseelement(self, multiparams, params) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement compiled_sql, distilled_params File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context context) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1402, in _handle_dbapi_exception exc_info File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause reraise(type(exception), exception, tb=exc_tb, cause=cause) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context context) File "/usr/local/anaconda2/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute cursor.execute(statement, parameters) IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: query.client_id [SQL: u'INSERT INTO "query" (client_id, database_id, tmp_table_name, user_id, status, tab_name, sql_editor_id, schema, sql, select_sql, executed_sql, "limit", limit_used, select_as_cta, select_as_cta_used, progress, rows, error_message, results_key, start_time, start_running_time, end_time, end_result_backend_time, tracking_url, changed_on) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)'] [parameters: (u'SkQME9zNf', 12, u'', 5, u'pending', u'Untitled Query', u'By60eA-EG', u'smartmeter_pbi_dot11', u'SELECT `userId`,\r\n `DPD`,\r\n `SPD`,\r\n `Weights`,\r\n `primaryCategory`,\r\n `secondaryCategory`,\r\n `terti ... (9 characters truncated) ... ry`,\r\n `Gender`,\r\n `AgeGroup`,\r\n `City`,\r\n `AmazonPref`\r\nFROM smartmeter_pbi_dot11.AmazonTestDashboard\r\nLIMIT 100', None, None, 1000000, 0, 0, 0, 0, None, None, None, 1515524970267.271, None, None, None, None, '2018-01-09 19:09:30.270338')] 

Please help me because its kinda a SOS situation.

Steps to reproduce

xrmx commented 6 years ago

You have to post the full stacktrace

kartikshanbhag3333 commented 6 years ago

@xrmx Sorry for the trouble, but I have fixed it by using Jinja Templating, Following is the code for your or anyone's reference. Let me know the thoughts on this.

SELECT COUNT(DISTINCT userId)/
  (SELECT COUNT(DISTINCT userId)
   FROM AppUsageFactSuperset
   WHERE 0=0
   {% set filters = form_data.get('filters') %}

   {% if filters %}
        {% for eachFilter in filters %}
            {% if eachFilter %}
                {% set column_name = eachFilter.get('col') %}
                {% set values = eachFilter.get('val') %}
                {% set operator = eachFilter.get('op') %}
                {% set joined_values = "'" + "','".join(values) + "'" %}

                AND {{ column_name }} {{ operator }} ( {{ joined_values }} )

           {% endif %}
           {% endfor %}
           {% endif %}
   ) AS `Reach_Temp`
FROM smartmeter_pbi_dot11.`AppUsageFactSuperset`
WHERE 0 = 0
{% set filters = form_data.get('filters') %}

   {% if filters %}
        {% for eachFilter in filters %}
            {% if eachFilter %}
                {% set column_name = eachFilter.get('col') %}
                {% set values = eachFilter.get('val') %}
                {% set operator = eachFilter.get('op') %}
                {% set joined_values = "'" + "','".join(values) + "'" %}

                AND {{ column_name }} {{ operator }} ( {{ joined_values }} ) 

           {% endif %}
           {% endfor %}
           {% endif %}
ORDER BY `Reach_Temp` DESC
LIMIT 50000

Also, The tool is amazing, Thank you for creating an amazing tool for Analytics and Viz.

saurabh1-singh commented 4 years ago

Hey @kartikshanbhag3333 can you help me it in this, I am too trying to use jinja template but so far have not succeeded. The problem is i create a view just like yours but when i apply filters the values are filled correctly but it also tries to apply the filter after wrapping up the query in an query.

somewhat like this select from (select from abc where {filter_values}) as expr where {filter_values}

How can I avoid the outer filters

abhishekkumaresan commented 4 years ago

@xrmx Hi, My SQL query for a chart involves subqueries, I need the filter values chosen in the filter box (in the dashboard) in the subquery also, can I pass it by any means like Jinja templating, pls explain.

I understood it with https://github.com/apache/incubator-superset/issues/8183#issuecomment-536097091 thank you, sorry to disturb .