AlaSQL / alasql

AlaSQL.js - JavaScript SQL database for browser and Node.js. Handles both traditional relational tables and nested JSON data (NoSQL). Export, store, and import data from localStorage, IndexedDB, or Excel.
http://alasql.org
MIT License
7.04k stars 659 forks source link

Data extraction issue: Query does not return the expected result. #1968

Open hotsun1508 opened 2 months ago

hotsun1508 commented 2 months ago

Hi, I'm expecting different results when executing the query below. There seems to be an issue with data extraction. (I've checked that there's no problem with CASE WHEN usage: https://github.com/AlaSQL/alasql/wiki/CASE)

SELECT 
    CASE 
        WHEN [age] BETWEEN 0 AND 9 THEN '0-9'
        WHEN [age] BETWEEN 10 AND 19 THEN '10-19'
        WHEN [age] BETWEEN 20 AND 29 THEN '20-29'
        WHEN [age] BETWEEN 30 AND 39 THEN '30-39'
        WHEN [age] BETWEEN 40 AND 49 THEN '40-49'
        WHEN [age] BETWEEN 50 AND 59 THEN '50-59'
        WHEN [age] BETWEEN 60 AND 69 THEN '60-69'
        WHEN [age] BETWEEN 70 AND 79 THEN '70-79'
        WHEN [age] BETWEEN 80 AND 89 THEN '80-89'
        WHEN [age] BETWEEN 90 AND 99 THEN '90-99'
        ELSE '100+'
    END AS [age_group],
    COUNT(*) AS [customer_count]
FROM 
    [temptable]
WHERE 
    [age] IS NOT NULL
GROUP BY 
    [age_group]
ORDER BY 
    [age_group];

The data type of the age column is numeric, and the expected result when extracting this data is as follows.

agegroup    customer_count
20-29   6928
30-39   19188
40-49   16326
50-59   9272
60-69   3755
70-79   495
80-89   61
90-99   21

FYI, I've got this result by executing the same query in PostgreSQL.

SELECT 
    CASE 
        WHEN age BETWEEN 0 AND 9 THEN '0-9'
        WHEN age BETWEEN 10 AND 19 THEN '10-19'
        WHEN age BETWEEN 20 AND 29 THEN '20-29'
        WHEN age BETWEEN 30 AND 39 THEN '30-39'
        WHEN age BETWEEN 40 AND 49 THEN '40-49'
        WHEN age BETWEEN 50 AND 59 THEN '50-59'
        WHEN age BETWEEN 60 AND 69 THEN '60-69'
        WHEN age BETWEEN 70 AND 79 THEN '70-79'
        WHEN age BETWEEN 80 AND 89 THEN '80-89'
        WHEN age BETWEEN 90 AND 99 THEN '90-99'
        ELSE '100+'
    END AS agegroup,
    COUNT(*) AS customer_count
FROM 
    adventureworks."adventureworks-total"
WHERE 
   age IS NOT NULL
GROUP BY 
   agegroup
ORDER BY 
   agegroup;

The query isn't returning the expected results when I run it. Could you help me out? Thank you so much! 😊

mathiasrw commented 2 months ago

Very interesting.

 SELECT 
        CASE 
            WHEN age BETWEEN 0 AND 9 THEN '0-9'
            WHEN age BETWEEN 10 AND 19 THEN '10-19'
            WHEN age BETWEEN 20 AND 29 THEN '20-29'
            ELSE '30+'
        END AS age_group 

    FROM ? 
    --GROUP BY age_group 

Gives the expected result (see https://jsfiddle.net/bpkesrh2/ ) and if I want to group across the case column it returns "30+" (see https://jsfiddle.net/1jLnxdmz/1/ ) like you have uncovered.

BUT!!!

If I remove any values over 30 (see https://jsfiddle.net/1jLnxdmz/2/ ) it STILL returns 30+ when adding the group by back in (see https://jsfiddle.net/1jLnxdmz/3/ )

This indicate that the group by logic somehow fetches the value in theELSE clause - no matter the content actually selected.