dask-contrib / dask-sql

Distributed SQL Engine in Python using Dask
https://dask-sql.readthedocs.io/
MIT License

[BUG] [GPU Logic Bug] "SELECT ((1) NOT BETWEEN (CASE ((<column>)) WHEN (1) THEN 0 END ) AND (<column>)) FROM <table>" brings Error #1233

Open qwebug opened 1 year ago

qwebug commented 1 year ago

What happened:

"SELECT ((1) NOT BETWEEN (CASE ((<column>)) WHEN (1) THEN 0 END ) AND (<column>)) FROM <table>" produces different results on CPU and GPU.

What you expected to happen:

The query should return the same result on CPU and GPU.

Minimal Complete Verifiable Example:

import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

c = Context()

df0 = pd.DataFrame({
    'c0': [True],
    'c1': [3666.0000],
    'c2': [0.3820597044277436],
})
t0 = dd.from_pandas(df0, npartitions=1)

c.create_table('t0', t0, gpu=False)
c.create_table('t0_gpu', t0, gpu=True)

print('CPU Result:')
result1 = c.sql("SELECT ((1) NOT BETWEEN (CASE ((t0.c1)) WHEN (1) THEN 0 END ) AND (t0.c0)) FROM t0").compute()
print(result1)

print('GPU Result:')
result2 = c.sql("SELECT ((1) NOT BETWEEN (CASE ((t0_gpu.c1)) WHEN (1) THEN 0 END ) AND (t0_gpu.c0)) FROM t0_gpu").compute()
print(result2)

Result:

INFO:numba.cuda.cudadrv.driver:init
CPU Result:
   Int64(1) NOT BETWEEN CASE t0.c1 WHEN Int64(1) THEN Int64(0) END AND t0.c0
0                                              False                        
GPU Result:
  Int64(1) NOT BETWEEN CASE t0_gpu.c1 WHEN Int64(1) THEN Int64(0) END AND t0_gpu.c0
0                                               <NA>                               
INFO:numba.cuda.cudadrv.driver:add pending dealloc: module_unload ? bytes
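
For reference, here is a rough sketch of how the expression could be evaluated by hand under standard SQL three-valued logic. It is not dask-sql code: the helper names (`tri_not`, `tri_and`, `tri_ge`, `tri_le`, `simple_case`, `not_between`) are made up for illustration, `None` stands in for SQL NULL, and it assumes that the boolean `c0` compares as the integer 1, which may not match how either backend coerces types.

```python
from typing import Optional

# Hypothetical helpers for illustration only; None models SQL NULL (unknown).
Tri = Optional[bool]


def tri_not(a: Tri) -> Tri:
    # NOT NULL stays NULL.
    return None if a is None else (not a)


def tri_and(a: Tri, b: Tri) -> Tri:
    # FALSE dominates; otherwise any NULL makes the result NULL.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def tri_ge(x, y) -> Tri:
    # A comparison with NULL is NULL.
    return None if x is None or y is None else x >= y


def tri_le(x, y) -> Tri:
    return None if x is None or y is None else x <= y


def simple_case(operand, when_value, then_value):
    # CASE operand WHEN when_value THEN then_value END
    # With no ELSE branch, a non-matching operand yields NULL.
    if operand is None or when_value is None:
        return None
    return then_value if operand == when_value else None


def not_between(x, low, high) -> Tri:
    # x NOT BETWEEN low AND high  ==  NOT (x >= low AND x <= high)
    return tri_not(tri_and(tri_ge(x, low), tri_le(x, high)))


# Values from the single row in the example above.
c0, c1 = True, 3666.0

low = simple_case(c1, 1, 0)        # c1 != 1, so the CASE yields NULL
result = not_between(1, low, c0)   # assumes TRUE compares as 1
print(result)
```

Under these assumptions the lower bound is NULL, so the whole predicate evaluates to NULL, which corresponds to the `<NA>` in the GPU output rather than the `False` in the CPU output. This is only one reading of the SQL semantics and says nothing about how dask-sql evaluates the query internally.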

Anything else we need to know?:

Environment:

qwebug commented 5 months ago

This problem occurred with dask-sql version 2023.6.0. I have verified that it is fixed in dask-sql version 2024.3.0. Thanks to the developers for their contributions.