Closed IvoMoor closed 1 month ago
Still wrong for this case:
JTdCJTIydmlld0NsYXNzJTIyJTNBJTIyUGl2b3RUYWJsZVVpJTIyJTJDJTIycXVlcnlNb2RlbCUyMiUzQSU3QiUyMmRhdGFzb3VyY2VJZCUyMiUzQSUyMmZpbGUlM0ElNUMlMjJmaHZodl90cmlwZGF0YV8yMDIzLTAxLnBhcnF1ZXQlNUMlMjIlMjIlMkMlMjJjZWxsc0hlYWRlcnMlMjIlM0ElMjJjb2x1bW5zJTIyJTJDJTIyYXhlcyUyMiUzQSU3QiUyMmNlbGxzJTIyJTNBJTVCJTdCJTIyY29sdW1uJTIyJTNBJTIyKiUyMiUyQyUyMmNvbHVtblR5cGUlMjIlM0ElMjJJTlRFR0VSJTIyJTJDJTIyZGVyaXZhdGlvbiUyMiUzQW51bGwlMkMlMjJhZ2dyZWdhdG9yJTIyJTNBJTIyY291bnQlMjIlN0QlNUQlMkMlMjJjb2x1bW5zJTIyJTNBJTVCJTdCJTIyY29sdW1uJTIyJTNBJTIyZGlzcGF0Y2hpbmdfYmFzZV9udW0lMjIlMkMlMjJjb2x1bW5UeXBlJTIyJTNBJTIyVkFSQ0hBUiUyMiUyQyUyMmRlcml2YXRpb24lMjIlM0FudWxsJTJDJTIyYWdncmVnYXRvciUyMiUzQW51bGwlN0QlMkMlN0IlMjJjb2x1bW4lMjIlM0ElMjJodmZoc19saWNlbnNlX251bSUyMiUyQyUyMmNvbHVtblR5cGUlMjIlM0ElMjJWQVJDSEFSJTIyJTJDJTIyZGVyaXZhdGlvbiUyMiUzQW51bGwlMkMlMjJhZ2dyZWdhdG9yJTIyJTNBbnVsbCUyQyUyMmluY2x1ZGVUb3RhbHMlMjIlM0F0cnVlJTdEJTVEJTJDJTIyZmlsdGVycyUyMiUzQSU1QiU3QiUyMmNvbHVtbiUyMiUzQSUyMmRpc3BhdGNoaW5nX2Jhc2VfbnVtJTIyJTJDJTIyY29sdW1uVHlwZSUyMiUzQSUyMlZBUkNIQVIlMjIlMkMlMjJkZXJpdmF0aW9uJTIyJTNBbnVsbCUyQyUyMmFnZ3JlZ2F0b3IlMjIlM0FudWxsJTJDJTIyZmlsdGVyJTIyJTNBJTdCJTIyZmlsdGVyVHlwZSUyMiUzQSUyMmluJTIyJTJDJTIydmFsdWVzJTIyJTNBJTdCJTIyQjAyNzY0JTIyJTNBJTdCJTIydmFsdWUlMjIlM0ElMjJCMDI3NjQlMjIlMkMlMjJsYWJlbCUyMiUzQSUyMkIwMjc2NCUyMiUyQyUyMmxpdGVyYWwlMjIlM0ElMjInQjAyNzY0JyUyMiU3RCUyQyUyMkIwMjc2NSUyMiUzQSU3QiUyMnZhbHVlJTIyJTNBJTIyQjAyNzY1JTIyJTJDJTIybGFiZWwlMjIlM0ElMjJCMDI3NjUlMjIlMkMlMjJsaXRlcmFsJTIyJTNBJTIyJ0IwMjc2NSclMjIlN0QlMkMlMjJCMDI4MzUlMjIlM0ElN0IlMjJ2YWx1ZSUyMiUzQSUyMkIwMjgzNSUyMiUyQyUyMmxhYmVsJTIyJTNBJTIyQjAyODM1JTIyJTJDJTIybGl0ZXJhbCUyMiUzQSUyMidCMDI4MzUnJTIyJTdEJTdEJTJDJTIydG9WYWx1ZXMlMjIlM0ElN0IlN0QlMkMlMjJ0b2dnbGVTdGF0ZSUyMiUzQSUyMmNsb3NlZCUyMiU3RCU3RCU1RCUyQyUyMnJvd3MlMjIlM0ElNUIlN0IlMjJjb2x1bW4lMjIlM0ElMjJvcmlnaW5hdGluZ19iYXNlX251bSUyMiUyQyUyMmNvbHVtblR5cGUlMjIlM0ElMjJWQVJDSEFSJTIyJTJDJTIyZGVyaXZhdGlvbiUyMiUzQW51bGwlMkMlMjJhZ2dyZWdhdG9yJTIyJTNBbnVsbCUyQyUyMmluY2x1ZGVUb3RhbHMlMjIlM0F0cnVlJTdEJTVEJTdEJTdEJTdE
Note that the row axis does not have that issue:
The rows axis has another issue though:
In this case, we should have had 3 total rows on rows axis, one for each dispatching base num (which would have had the same value as the cell value since there is only one license number. See the example above, just imagine the hvfhs_license_num and dispatching_base_num would be switched.
JTdCJTIydmlld0NsYXNzJTIyJTNBJTIyUGl2b3RUYWJsZVVpJTIyJTJDJTIycXVlcnlNb2RlbCUyMiUzQSU3QiUyMmRhdGFzb3VyY2VJZCUyMiUzQSUyMmZpbGUlM0ElNUMlMjJmaHZodl90cmlwZGF0YV8yMDIzLTAxLnBhcnF1ZXQlNUMlMjIlMjIlMkMlMjJjZWxsc0hlYWRlcnMlMjIlM0ElMjJjb2x1bW5zJTIyJTJDJTIyYXhlcyUyMiUzQSU3QiUyMmNlbGxzJTIyJTNBJTVCJTdCJTIyY29sdW1uJTIyJTNBJTIyKiUyMiUyQyUyMmNvbHVtblR5cGUlMjIlM0ElMjJJTlRFR0VSJTIyJTJDJTIyZGVyaXZhdGlvbiUyMiUzQW51bGwlMkMlMjJhZ2dyZWdhdG9yJTIyJTNBJTIyY291bnQlMjIlN0QlNUQlMkMlMjJjb2x1bW5zJTIyJTNBJTVCJTdCJTIyY29sdW1uJTIyJTNBJTIyb3JpZ2luYXRpbmdfYmFzZV9udW0lMjIlMkMlMjJjb2x1bW5UeXBlJTIyJTNBJTIyVkFSQ0hBUiUyMiUyQyUyMmRlcml2YXRpb24lMjIlM0FudWxsJTJDJTIyYWdncmVnYXRvciUyMiUzQW51bGwlMkMlMjJpbmNsdWRlVG90YWxzJTIyJTNBdHJ1ZSU3RCU1RCUyQyUyMmZpbHRlcnMlMjIlM0ElNUIlN0IlMjJjb2x1bW4lMjIlM0ElMjJkaXNwYXRjaGluZ19iYXNlX251bSUyMiUyQyUyMmNvbHVtblR5cGUlMjIlM0ElMjJWQVJDSEFSJTIyJTJDJTIyZGVyaXZhdGlvbiUyMiUzQW51bGwlMkMlMjJhZ2dyZWdhdG9yJTIyJTNBbnVsbCUyQyUyMmZpbHRlciUyMiUzQSU3QiUyMmZpbHRlclR5cGUlMjIlM0ElMjJpbiUyMiUyQyUyMnZhbHVlcyUyMiUzQSU3QiUyMkIwMjc2NCUyMiUzQSU3QiUyMnZhbHVlJTIyJTNBJTIyQjAyNzY0JTIyJTJDJTIybGFiZWwlMjIlM0ElMjJCMDI3NjQlMjIlMkMlMjJsaXRlcmFsJTIyJTNBJTIyJ0IwMjc2NCclMjIlN0QlMkMlMjJCMDI3NjUlMjIlM0ElN0IlMjJ2YWx1ZSUyMiUzQSUyMkIwMjc2NSUyMiUyQyUyMmxhYmVsJTIyJTNBJTIyQjAyNzY1JTIyJTJDJTIybGl0ZXJhbCUyMiUzQSUyMidCMDI3NjUnJTIyJTdEJTJDJTIyQjAyODM1JTIyJTNBJTdCJTIydmFsdWUlMjIlM0ElMjJCMDI4MzUlMjIlMkMlMjJsYWJlbCUyMiUzQSUyMkIwMjgzNSUyMiUyQyUyMmxpdGVyYWwlMjIlM0ElMjInQjAyODM1JyUyMiU3RCU3RCUyQyUyMnRvVmFsdWVzJTIyJTNBJTdCJTdEJTJDJTIydG9nZ2xlU3RhdGUlMjIlM0ElMjJvcGVuJTIyJTdEJTdEJTVEJTJDJTIycm93cyUyMiUzQSU1QiU3QiUyMmNvbHVtbiUyMiUzQSUyMmh2ZmhzX2xpY2Vuc2VfbnVtJTIyJTJDJTIyY29sdW1uVHlwZSUyMiUzQSUyMlZBUkNIQVIlMjIlMkMlMjJkZXJpdmF0aW9uJTIyJTNBbnVsbCUyQyUyMmFnZ3JlZ2F0b3IlMjIlM0FudWxsJTJDJTIyaW5jbHVkZVRvdGFscyUyMiUzQXRydWUlN0QlMkMlN0IlMjJjb2x1bW4lMjIlM0ElMjJkaXNwYXRjaGluZ19iYXNlX251bSUyMiUyQyUyMmNvbHVtblR5cGUlMjIlM0ElMjJWQVJDSEFSJTIyJTJDJTIyZGVyaXZhdGlvbiUyMiUzQW51bGwlMkMlMjJhZ2dyZWdhdG9yJTIyJTNBbnVsbCU3RCU1RCU3RCU3RCU3RA==
Looks like we actually generate the wrong query in this case:
SELECT GROUPING_ID( "dispatching_base_num","hvfhs_license_num" ) AS "__huey_grouping_id"
,"dispatching_base_num" AS "dispatching_base_num"
,"hvfhs_license_num" AS "hvfhs_license_num"
,COUNT(*) OVER ()
FROM read_parquet( "C:\roland\projects\QuaQuery\files\fhvhv_tripdata_2023-01.parquet" )
WHERE ( "dispatching_base_num" IN ( 'B02764'
,'B02765'
,'B02835' ) )
GROUP BY GROUPING SETS(
(
"dispatching_base_num"
), (
"dispatching_base_num"
, "hvfhs_license_num"
))
ORDER BY "dispatching_base_num" ASC
,"hvfhs_license_num" ASC
,"__huey_grouping_id" ASC
This is what we should have been generating:
SELECT GROUPING_ID( "hvfhs_license_num","dispatching_base_num" ) AS "__huey_grouping_id"
,"hvfhs_license_num" AS "hvfhs_license_num"
,"dispatching_base_num" AS "dispatching_base_num"
,COUNT(*) OVER ()
FROM read_parquet( "C:\roland\projects\QuaQuery\files\fhvhv_tripdata_2023-01.parquet" )
WHERE ( "dispatching_base_num" IN ( 'B02764'
,'B02765'
,'B02835' ) )
GROUP BY GROUPING SETS(
(
"dispatching_base_num"
), (
"hvfhs_license_num"
, "dispatching_base_num"
)
)
ORDER BY
"dispatching_base_num" ASC
,"hvfhs_license_num" ASC
,"__huey_grouping_id" ASC
Looks like our logic for creating the group sets is just wrong. What I think we should do is:
I think in a going forward fashion we could do: if we encounter an item that includes totals, make a new grouping set that has all items appearing before the includeTotals item, and only all non-includeTotal items appearing after it.