apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
62.71k stars 13.85k forks source link

[box plot] INT/NUMERIC types → DB engine Error, No numeric types to aggregate #16975

Closed rumbin closed 1 year ago

rumbin commented 3 years ago

While the ECharts Box Plot works with numeric columns of type FLOAT, a "no numeric types to aggregate" error is thrown for INT and NUMERIC/NUMBER(n,m) types.

How to reproduce the bug

  1. Create a minimal example query in SQLLab and run it.
    
    -- on Snowlake:
    SELECT 
    'a' as series
    ,1 as int_value
    ,1.1::number(10,5) as numeric_value
    ,1.1::float as float_value
    ,current_timestamp as ts

-- on Postgres: SELECT 'a' as series ,1 as int_value ,1.1::numeric as numeric_value ,1.1::float as float_value ,current_timestamp as ts

2. Click **EXPLORE** for visualizing it.
3. Choose **Box Plot** as visualization type.
4. Use **ts** as time column (as suggested by default).
5. Create simple metrics of type AVG on each of the value columns, int_value, numeric_value, float_value, respectively and **RUN** the query in between.

### Expected results

All of the three columns of different numeric dtypes should be visualizable as Box Plot.

### Actual results
1. The FLOAT type column (float_value) works; it results in a Box Plot chart being displayed.
2. The INT and NUMERIC type columns throw an error: **DB engine Error. No numeric types to aggregate**

Further tracing:  
I tested different simple metrics for the different dtypes:
* float: all simple metrics are fine
* int: 
   * AVG is bad
   * SUM, MIN, MAX is fine
   * COUNT, CUNT DISTINCT is fine
* numeric/number(n,m):
   * AVG, MIN, MAX, SUM is bad
   * COUNT, CUNT DISTINCT is fine

#### Screenshots/Screencasts
![image](https://user-images.githubusercontent.com/1220356/135979706-e9a6d7a4-2cb6-44e9-a8e9-2849102496fe.png)

### Environment

- browser type and version: Chrome 93.0.4577.63
- superset version: 1.3.1, installed via pip
- python version: 3.8.11
- node.js version: v4.6.1
- feature flags active:
"THUMBNAILS": True,
"ALERT_REPORTS": True,
"ALERTS_ATTACH_REPORTS": True,
"SQLLAB_BACKEND_PERSISTENCE": True,
"ENABLE_TEMPLATE_PROCESSING": True,                                                                 
"DASHBOARD_NATIVE_FILTERS": True, 
"DASHBOARD_CROSS_FILTERS": True,
"DASHBOARD_NATIVE_FILTERS_SET": True,
"ENABLE_EXPLORE_DRAG_AND_DROP": True,
"DASHBOARD_CACHE": True     

### Checklist

Make sure to follow these steps before submitting your issue - thank you!

- [ x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [ x] I have reproduced the issue with at least the latest released version of superset.
- [ x] I have checked the issue tracker for the same issue and I haven't found one similar.

### Additional context

´pip freeze´

aiohttp==3.7.4.post0 alembic==1.7.3 amqp==2.6.1 apache-superset==1.3.1 apispec==3.3.2 asn1crypto==1.4.0 async-timeout==3.0.1 attrs==21.2.0 azure-common==1.1.27 azure-core==1.18.0 azure-storage-blob==12.9.0 Babel==2.9.1 backoff==1.11.1 billiard==3.6.4.0 bleach==3.3.1 boto3==1.18.51 botocore==1.21.51 Brotli==1.0.9 cachelib==0.1.1 cachetools==4.2.4 celery==4.4.7 certifi==2021.5.30 cffi==1.14.6 chardet==4.0.0 charset-normalizer==2.0.6 click==7.1.2 cmdstanpy==0.9.68 colorama==0.4.4 convertdate==2.3.2 cron-descriptor==1.2.24 croniter==1.0.15 cryptography==3.4.8 cx-Oracle==8.2.1 cycler==0.10.0 Cython==0.29.24 defusedxml==0.7.1 deprecation==2.1.0 dnspython==2.1.0 elasticsearch==7.13.4 elasticsearch-dbapi==0.2.6 email-validator==1.1.3 ephem==4.1 et-xmlfile==1.1.0 Flask==1.1.4 Flask-AppBuilder==3.3.3 Flask-Babel==1.0.0 Flask-Caching==1.10.1 Flask-Compress==1.10.1 Flask-JWT-Extended==3.25.1 Flask-Login==0.4.1 Flask-Migrate==3.1.0 Flask-OpenID==1.3.0 Flask-SQLAlchemy==2.5.1 flask-talisman==0.8.1 Flask-WTF==0.14.3 future==0.18.2 geographiclib==1.52 geopy==2.2.0 gevent==21.8.0 google-api-core==2.0.1 google-auth==2.2.1 google-cloud-bigquery==2.27.1 google-cloud-core==2.0.0 google-crc32c==1.2.0 google-resumable-media==2.0.3 googleapis-common-protos==1.53.0 graphlib-backport==1.0.3 greenlet==1.1.2 grpcio==1.41.0 gunicorn==20.0.4 hdbcli==2.10.13 holidays==0.10.3 humanize==3.11.0 idna==3.2 importlib-resources==5.2.2 isodate==0.6.0 itsdangerous==1.1.0 Jinja2==2.11.3 jmespath==0.10.0 jsonschema==3.2.0 kiwisolver==1.3.2 kombu==4.6.11 korean-lunar-calendar==0.2.1 LunarCalendar==0.0.9 Mako==1.1.5 Markdown==3.3.4 MarkupSafe==2.0.1 marshmallow==3.13.0 marshmallow-enum==1.5.1 marshmallow-sqlalchemy==0.23.1 matplotlib==3.4.3 msgpack==1.0.2 msrest==0.6.21 multidict==5.1.0 numpy==1.21.2 oauthlib==3.1.1 openpyxl==3.0.9 oscrypto==1.2.1 packaging==21.0 pandas==1.2.5 parsedatetime==2.6 pgsanity==0.2.9 Pillow==8.3.2 polyline==1.4.0 prison==0.2.1 prophet==1.0 proto-plus==1.19.2 protobuf==3.18.0 psycopg2-binary==2.8.6 pyarrow==4.0.1 pyasn1==0.4.8 pyasn1-modules==0.2.8 pybigquery==0.10.2 pycparser==2.20 pycryptodomex==3.10.4 pyhdb==0.3.4 PyJWT==1.7.1 PyMeeus==0.5.11 pyOpenSSL==20.0.1 pyparsing==2.4.7 pyrsistent==0.18.0 pystan==2.18.0.0 python-dateutil==2.8.2 python-dotenv==0.19.0 python-geohash==0.8.5 python-ldap==3.3.1 python3-openid==3.2.0 pytz==2021.1 PyYAML==5.4.1 redis==3.5.3 requests==2.26.0 requests-oauthlib==1.3.0 rsa==4.7.2 s3transfer==0.5.0 selenium==3.141.0 setuptools-git==1.2 simplejson==3.17.5 six==1.16.0 slackclient==2.5.0 snowflake-connector-python==2.6.2 snowflake-sqlalchemy==1.2.4 SQLAlchemy==1.3.24 sqlalchemy-hana==0.5.0 SQLAlchemy-Utils==0.36.8 sqlparse==0.3.0 tabulate==0.8.9 tqdm==4.62.3 typing-extensions==3.10.0.2 ujson==4.2.0 urllib3==1.26.7 vine==1.3.0 webencodings==0.5.1 Werkzeug==1.0.1 WTForms==2.3.3 WTForms-JSON==0.3.3 xlrd==2.0.1 yarl==1.6.3 zipp==3.6.0 zope.event==4.5.0 zope.interface==5.4.0



There is no stack trace in the superset.log available.
junlincc commented 3 years ago

Thank you @rumbin for detailed issue filing, which is really helpful.

@villebro , please help @stephenLYZ to work on a fix. I don't think it's a regression, did we not support numeric type before?

rumbin commented 2 years ago

Update: This issue still exists in Superset 1.4

villebro commented 2 years ago

Repro'd a variation of this on master

villebro commented 2 years ago

This appears to be the culprit: https://github.com/numpy/numpy/issues/6339

rumbin commented 2 years ago

Not sure if the numpy issue is the culprit. On our Superset 1.0.1 deployment I cannot reproduce this bug. On 1.3.2 it is there and on 1.4, too.

villebro commented 2 years ago

That's interesting - I saw that numpy has been bumped from 1.19.4 to 1.21.1 since 1.0.1, but I tried downgrading to that and the error is still the same. When I downgraded Pandas to 1.2.5 the error message changed, but it's still an error:

(Pandas 1.3.4) image

(Pandas 1.2.5) image

Anyway, this should be a simple fix by converting Double to float, as we probably don't need the "correctly-rounded" types here - it's mostly a question of doing the conversion in the best possible place. Also, I noticed that the data type of numeric_value on Postgres isn't identified correctly, so that should also be fixed.

image

I suggest flagging this issue for 1.4.1

rumbin commented 2 years ago

@villebro For comparison: The 1.0.1 deployment where the issue is not present has the folowing pip freeze:

aiohttp==3.7.4.post0
alembic==1.5.8
amqp==2.6.1
apache-superset==1.0.1
apispec==3.3.2
asn1crypto==1.4.0
async-timeout==3.0.1
attrs==20.3.0
azure-common==1.1.26
azure-core==1.12.0
azure-storage-blob==12.8.0
Babel==2.9.0
backoff==1.10.0
billiard==3.6.3.0
bleach==3.3.0
boto3==1.17.38
botocore==1.20.38
Brotli==1.0.9
cachelib==0.1.1
cachetools==4.2.1
celery==4.4.7
certifi==2020.12.5
cffi==1.14.5
chardet==3.0.4
click==7.1.2
cmdstanpy==0.9.5
colorama==0.4.4
contextlib2==0.6.0.post1
convertdate==2.3.2
cron-descriptor==1.2.24
croniter==1.0.10
cryptography==3.4.7
cx-Oracle==8.1.0
cycler==0.10.0
Cython==0.29.22
decorator==4.4.2
defusedxml==0.7.1
dnspython==2.1.0
elasticsearch==7.12.0
elasticsearch-dbapi==0.2.1
email-validator==1.1.2
ephem==3.7.7.1
fbprophet==0.7.1
Flask==1.1.2
Flask-AppBuilder==3.2.1
Flask-Babel==1.0.0
Flask-Caching==1.10.1
Flask-Compress==1.9.0
Flask-JWT-Extended==3.25.1
Flask-Login==0.4.1
Flask-Migrate==2.7.0
Flask-OpenID==1.2.5
Flask-SQLAlchemy==2.5.1
flask-talisman==0.7.0
Flask-WTF==0.14.3
future==0.18.2
geographiclib==1.50
geopy==2.1.0
gevent==21.1.2
google-api-core==1.26.2
google-auth==1.28.0
google-cloud-bigquery==2.13.1
google-cloud-core==1.6.0
google-crc32c==1.1.2
google-resumable-media==1.2.0
googleapis-common-protos==1.53.0
greenlet==1.0.0
grpcio==1.36.1
gunicorn==20.0.4
hdbcli==2.7.26
holidays==0.10.3
humanize==3.3.0
idna==2.10
importlib-metadata==3.7.3
isodate==0.6.0
itsdangerous==1.1.0
Jinja2==2.11.3
jmespath==0.10.0
jsonschema==3.2.0
kiwisolver==1.3.1
kombu==4.6.11
korean-lunar-calendar==0.2.1
LunarCalendar==0.0.9
Mako==1.1.4
Markdown==3.3.4
MarkupSafe==1.1.1
marshmallow==3.10.0
marshmallow-enum==1.5.1
marshmallow-sqlalchemy==0.23.1
matplotlib==3.4.0
msgpack==1.0.2
msrest==0.6.21
multidict==5.1.0
numpy==1.20.1
oauthlib==3.1.0
oscrypto==1.2.1
packaging==20.9
pandas==1.1.5
parsedatetime==2.6
pathlib2==2.3.5
pgsanity==0.2.9
Pillow==8.1.2
polyline==1.4.0
prison==0.1.3
proto-plus==1.18.1
protobuf==3.15.6
psycopg2-binary==2.8.6
py==1.10.0
pyarrow==1.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pybigquery==0.5.0
pycparser==2.20
pycryptodomex==3.10.1
pyhdb==0.3.4
PyJWT==1.7.1
PyMeeus==0.5.11
pyOpenSSL==19.1.0
pyparsing==2.4.7
pyrsistent==0.17.3
pystan==2.18.0.0
python-dateutil==2.8.1
python-dotenv==0.15.0
python-editor==1.0.4
python-geohash==0.8.5
python-ldap==3.3.1
python3-openid==3.2.0
pytz==2021.1
PyYAML==5.4.1
redis==3.5.3
requests==2.25.1
requests-oauthlib==1.3.0
retry==0.9.2
rsa==4.7.2
s3transfer==0.3.6
selenium==3.141.0
setuptools-git==1.2
simplejson==3.17.2
six==1.15.0
slackclient==2.5.0
snowflake-connector-python==2.4.1
snowflake-sqlalchemy==1.2.4
SQLAlchemy==1.3.23
sqlalchemy-hana==0.5.0
SQLAlchemy-Utils==0.36.8
sqlparse==0.3.0
tqdm==4.59.0
typing-extensions==3.7.4.3
urllib3==1.26.4
vine==1.3.0
webencodings==0.5.1
Werkzeug==1.0.1
WTForms==2.3.3
WTForms-JSON==0.3.3
yarl==1.6.3
zipp==3.4.1
zope.event==4.5.0
zope.interface==5.3.0
stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

rusackas commented 1 year ago

@rumbin is this still an issue you're experiencing, and is it happening in the 2.0.1 release?

rusackas commented 1 year ago

@andrey-zayats took a look at this and was unable to reproduce the issue. I'm going to go ahead and remove the validation:required label and close the issue. If anyone sees this happening in Superset 2.0.x or later, or on master, we're open to revisiting this conversation, re-opening the ticket, or triaging a new Issue with updated context. Thank you!