BlazingDB / blazingsql

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
https://blazingsql.com
Apache License 2.0

[BUG] Add temporary DECIMAL support by using float64 instead #1588

Closed gcca closed 2 years ago

gcca commented 2 years ago

If we want to add temporary DECIMAL support, we could do so by reading and interpreting all DECIMAL64 data as float64.
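To make the trade-off of this workaround concrete, here is a minimal sketch using only the Python standard library (not BlazingSQL or cuDF code). It shows that a float64 cannot hold every value a decimal column may contain: values with many significant digits change when they pass through float64. The helper names `decimal_to_float64` and `survives_round_trip` are hypothetical and exist only for this illustration.

```python
from decimal import Decimal


def decimal_to_float64(value: Decimal) -> float:
    """Convert a decimal value to a Python float (an IEEE 754 float64)."""
    return float(value)


def survives_round_trip(value: Decimal) -> bool:
    """Return True if the value is unchanged after passing through float64."""
    # Decimal(float) yields the exact binary value stored in the float,
    # so any rounding introduced by the conversion shows up in the comparison.
    return Decimal(decimal_to_float64(value)) == value


small = Decimal("12.5")                 # exactly representable in binary
large = Decimal("1234567890123456.78")  # 18 significant digits

print(survives_round_trip(small))  # True
print(survives_round_trip(large))  # False: stored as 1234567890123456.75
```

So reading DECIMAL data as float64 is fine for many workloads but is not lossless, which fits the "temporary" framing of this proposal.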

The cuDF ORC and Parquet file readers have an option that interprets decimals as floats. We should use it, so that parse_batch and parse_schema always read decimal columns as float.

If you create a table from a cuDF table, we should throw an error if any column is of decimal type. This should happen in the create_table phase, which means we may want to do it in pyblazing.

Similarly, if you create a table from a Hive cursor and any column has a decimal type, we should throw an error.
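The create_table check described above could look roughly like the sketch below. This is not pyblazing code: `reject_decimal_columns` and `UnsupportedDecimalError` are hypothetical names, and the sketch assumes the column types are already available as strings (for example a dtype name from a cuDF table or a type string such as `decimal(10,2)` from a Hive schema).

```python
import re
from typing import Mapping

# Assumption: decimal types can be recognised by a type name that starts
# with "decimal", in any letter case (e.g. "decimal64", "DECIMAL(10,2)").
_DECIMAL_PATTERN = re.compile(r"^\s*decimal", re.IGNORECASE)


class UnsupportedDecimalError(TypeError):
    """Raised when a table contains a column of decimal type (hypothetical)."""


def reject_decimal_columns(schema: Mapping[str, str]) -> None:
    """Raise if any column in a {column name: type name} mapping is decimal."""
    offending = [
        name
        for name, type_name in schema.items()
        if _DECIMAL_PATTERN.match(str(type_name))
    ]
    if offending:
        raise UnsupportedDecimalError(
            "DECIMAL columns are not supported: " + ", ".join(offending)
        )


# Passes silently: no decimal columns.
reject_decimal_columns({"id": "int64", "price": "float64"})

# Raises UnsupportedDecimalError naming the "price" column.
try:
    reject_decimal_columns({"id": "int", "price": "DECIMAL(10,2)"})
except UnsupportedDecimalError as err:
    print(err)
```

Running the check once at table creation keeps the failure early and the message specific, instead of surfacing as a less obvious error during query execution.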

We need to double-check whether and how we handle Parquet and ORC metadata for DECIMAL.

For CSV and JSON, we should ensure a user can't specify a DECIMAL column.