apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
62.8k stars 13.87k forks source link

fix: Pre-query normalization with custom SQL #30389

Closed michael-s-molina closed 1 month ago

michael-s-molina commented 1 month ago

SUMMARY

This PR fixes an issue with pre-query normalization when dealing with custom SQL. Previously, a custom SQL that had a label that does not match any column name would throw the following error:

superset/superset/connectors/sqla/models.py", line 1691, in _normalize_prequery_result_type
    column_ = columns_by_name[dimension]
KeyError: 'custom_sql'

Given function docs:

Convert a prequery result type to its equivalent Python type.

Some databases like Druid will return timestamps as strings, but do not perform
automatic casting when comparing these strings to a timestamp. For cases like
this we convert the value via the appropriate SQL transform.

We simply skip the conversion of custom SQL types as we cannot determine if they are temporal or not currently.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

https://github.com/user-attachments/assets/e6191c06-2d78-43a9-8d08-334453ca78d2

TESTING INSTRUCTIONS

Repro the steps shown in the video using a db engine spec with allows_subqueries = False

ADDITIONAL INFORMATION

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 83.89%. Comparing base (76d897e) to head (785069d). Report is 966 commits behind head on master.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #30389 +/- ## =========================================== + Coverage 60.48% 83.89% +23.41% =========================================== Files 1931 533 -1398 Lines 76236 38519 -37717 Branches 8568 0 -8568 =========================================== - Hits 46114 32317 -13797 + Misses 28017 6202 -21815 + Partials 2105 0 -2105 ``` | [Flag](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | | |---|---|---| | [hive](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `48.99% <0.00%> (-0.17%)` | :arrow_down: | | [javascript](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | | | [mysql](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `76.87% <100.00%> (?)` | | | [postgres](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `76.94% <100.00%> (?)` | | | [presto](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `53.50% <0.00%> (-0.31%)` | :arrow_down: | | [python](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `83.89% <100.00%> (+20.41%)` | :arrow_up: | | [sqlite](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `76.39% <100.00%> (?)` | | | [unit](https://app.codecov.io/gh/apache/superset/pull/30389/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `60.66% <100.00%> (+3.04%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

villebro commented 1 month ago

@michael-s-molina I'm trying to understand when this would happen. Could this have something to do with BaseEngineSpec.make_label_compatible changing the column alias?

michael-s-molina commented 1 month ago

@michael-s-molina I'm trying to understand when this would happen. Could this have something to do with BaseEngineSpec.make_label_compatible changing the column alias?

@villebro This happens when your engine spec has allows_subqueries = False and you apply a series limit. I updated the PR description with a video.

michael-s-molina commented 1 month ago

However, maybe we could start using some convention for annotating code that should be removed/cleaned up during breaking windows? If we don't start cleaning up stuff like this we'll never get out of the tech debt hole we're in..

@villebro We generally use the @deprecated annotation but I don't have enough context about this function to actually mark it as deprecated. Maybe @betodealmeida will know more and we could add the annotation in a follow-up if it's a valid case.

villebro commented 1 month ago

However, maybe we could start using some convention for annotating code that should be removed/cleaned up during breaking windows? If we don't start cleaning up stuff like this we'll never get out of the tech debt hole we're in..

We generally use the @deprecated annotation but I don't have enough context about this function to actually mark it as deprecated. Maybe @betodealmeida will know more and we could add the annotation in a follow-up if it's a valid case.

Good point, we can reuse that even for non-public methods. I'll keep that in mind in subsequent PRs 👍