dask-contrib / dask-sql

Distributed SQL Engine in Python using Dask
https://dask-sql.readthedocs.io/
MIT License

[BUG] [GPU Logic Bug] SELECT (CASE <column> WHEN (CASE <column> WHEN <column> THEN <column> END) THEN <string> ELSE <column> END) FROM <table> brings Error #1340

Open qwebug opened 3 weeks ago

qwebug commented 3 weeks ago

What happened:

SELECT (CASE <column> WHEN (CASE <column> WHEN <column> THEN <column> END) THEN <string> ELSE <column> END) FROM <table>

returns different results under CPU and GPU execution.

What you expected to happen:

The query should return the same result under CPU and GPU execution.

Minimal Complete Verifiable Example:

import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

c = Context()

# Load the single-row test table (contents listed under t0.csv).
t0 = dd.read_csv('t0.csv')

# Register the same data twice: once for CPU execution, once for GPU execution.
c.create_table('t0', t0, gpu=False)
c.create_table('t0_gpu', t0, gpu=True)

# Run the identical query against each table and print both results.
print('CPU Result:')
result1 = c.sql("SELECT ( CASE ((t0.c0) ||(t0.c0)) WHEN ( CASE ((t0.c0) ||(t0.c0)) WHEN '' THEN t0.c0 END ) THEN '' ELSE (('.8Kb') ||(t0.c0)) END ) FROM t0").compute()
print(result1)

print('GPU Result:')
result2 = c.sql("SELECT ( CASE ((t0_gpu.c0) ||(t0_gpu.c0)) WHEN ( CASE ((t0_gpu.c0) ||(t0_gpu.c0)) WHEN '' THEN t0_gpu.c0 END ) THEN '' ELSE (('.8Kb') ||(t0_gpu.c0)) END ) FROM t0_gpu").compute()
print(result2)

t0.csv:

c0,
'',
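
A note on the data: with default CSV quoting (the quote character is the double quote), the single quotes in the row above are ordinary characters, so c0 holds the two-character string '' rather than an empty string. The sketch below illustrates this with Python's standard csv module as a stand-in; it assumes dd.read_csv is called with default quoting, as in the example, and the helper name parse_rows is only illustrative.

```python
import csv
import io

# Contents of t0.csv exactly as listed in the report.
T0_CSV = "c0,\n'',\n"


def parse_rows(text):
    """Parse CSV text with the default dialect, whose quote character is '"'."""
    return list(csv.reader(io.StringIO(text)))


rows = parse_rows(T0_CSV)
header, first = rows[0], rows[1]

# The trailing comma on each line produces a second, empty field.
print(header)
# The c0 value is two literal single-quote characters, not an empty string.
print(repr(first[0]), len(first[0]))
```

The trailing comma on each line also yields a second, empty column, which the query does not reference.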

Result:

INFO:numba.cuda.cudadrv.driver:init
CPU Result:
  CASE t0.c0 || t0.c0 WHEN CASE t0.c0 || t0.c0 WHEN Utf8("") THEN t0.c0 END THEN Utf8("") ELSE Utf8(".8Kb") || t0.c0 END
0                                             .8Kb''                                                                    
GPU Result:
  CASE t0_gpu.c0 || t0_gpu.c0 WHEN CASE t0_gpu.c0 || t0_gpu.c0 WHEN Utf8("") THEN t0_gpu.c0 END THEN Utf8("") ELSE Utf8(".8Kb") || t0_gpu.c0 END
0           

Anything else we need to know?:
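
As an independent cross-check, the same query can be run against another SQL engine. The sketch below uses Python's built-in sqlite3 module, which is not part of dask-sql; it assumes c0 holds the two-character string '' (how default CSV quoting reads the row) and that SQLite's CASE and || semantics are an acceptable reference here. The function name reference_result is hypothetical.

```python
import sqlite3

# Same query text as in the CPU example, split only for readability.
QUERY = (
    "SELECT ( CASE ((t0.c0) ||(t0.c0)) "
    "WHEN ( CASE ((t0.c0) ||(t0.c0)) WHEN '' THEN t0.c0 END ) "
    "THEN '' ELSE (('.8Kb') ||(t0.c0)) END ) FROM t0"
)


def reference_result(value):
    """Run the query on a one-row in-memory SQLite table and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t0 (c0 TEXT)")
        conn.execute("INSERT INTO t0 (c0) VALUES (?)", (value,))
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


# c0 is the literal two-character string '' as read from t0.csv.
print(reference_result("''"))
```

Here the inner CASE finds no match and evaluates to NULL, so the outer comparison against NULL is never true and the ELSE branch is taken. SQLite therefore returns .8Kb'', which agrees with the CPU output above.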

Environment: