BlazingDB / blazingsql

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
https://blazingsql.com
Apache License 2.0

[FEA] Support LCASE alias for LOWER #1176

Open beckernick opened 3 years ago

beckernick commented 3 years ago

I'd like to be able to call LCASE on a string column to convert it to lowercase, like in MySQL. LCASE is an alias for LOWER, which is noted in #1135. The Calcite reference lists it as a supported operation on string columns, but it does not appear to be available currently and may need some changes, based on the following:

from pyspark.sql import SparkSession
from blazingsql import BlazingContext
import pandas as pd


spark = SparkSession.builder \
    .master("local") \
    .getOrCreate()

bc = BlazingContext()


df = pd.DataFrame({
    "a": ["Felipe", "William", "Rodrigo"],
    "b": [10, 9, 8],
})

bc.create_table("df", df)
sdf = spark.createDataFrame(df)
sdf.createOrReplaceTempView("df")

query = """
SELECT
    a,
    LCASE(a)
FROM df
"""

spark.sql(query).show()

print(bc.explain(query))
# print(bc.sql(query)) # fails

+-------+--------+
|      a|lcase(a)|
+-------+--------+
| Felipe|  felipe|
|William| william|
|Rodrigo| rodrigo|
+-------+--------+

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/raid/nicholasb/miniconda3/envs/rapids-tpcxbb-20201118/lib/python3.7/site-packages/_jpype.cpython-37m-x86_64-linux-gnu.so in com.blazingdb.calcite.application.RelationalAlgebraGenerator.getRelationalAlgebraString()

/raid/nicholasb/miniconda3/envs/rapids-tpcxbb-20201118/lib/python3.7/site-packages/_jpype.cpython-37m-x86_64-linux-gnu.so in com.blazingdb.calcite.application.RelationalAlgebraGenerator.getRelationalAlgebra()

/raid/nicholasb/miniconda3/envs/rapids-tpcxbb-20201118/lib/python3.7/site-packages/_jpype.cpython-37m-x86_64-linux-gnu.so in com.blazingdb.calcite.application.RelationalAlgebraGenerator.getNonOptimizedRelationalAlgebra()

/raid/nicholasb/miniconda3/envs/rapids-tpcxbb-20201118/lib/python3.7/site-packages/_jpype.cpython-37m-x86_64-linux-gnu.so in com.blazingdb.calcite.application.RelationalAlgebraGenerator.validateQuery()

Exception: Java Exception

The above exception was the direct cause of the following exception:

com.blazingdb.calcite.application.SqlValidationException Traceback (most recent call last)
/raid/nicholasb/miniconda3/envs/rapids-tpcxbb-20201118/lib/python3.7/site-packages/pyblazing/apiv2/context.py in explain(self, sql)
   1693         try:
-> 1694             algebra = self.generator.getRelationalAlgebraString(sql)
   1695 

com.blazingdb.calcite.application.SqlValidationException: com.blazingdb.calcite.application.SqlValidationException: No match found for function signature LCASE(<CHARACTER>)

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-3-52fd61155fce> in <module>
     29 spark.sql(query).show()
     30 
---> 31 print(bc.explain(query))
     32 # print(bc.sql(query)) # fails

/raid/nicholasb/miniconda3/envs/rapids-tpcxbb-20201118/lib/python3.7/site-packages/pyblazing/apiv2/context.py in explain(self, sql)
   1696         except SqlValidationExceptionClass as exception:
   1697             # jpype.JException as exception:
-> 1698             raise Exception(exception.message())
   1699             # algebra = ""
   1700             # print("SQL Parsing Error")

Exception: No match found for function signature LCASE(<CHARACTER>)
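
Until the alias is supported, one possible stopgap is to rewrite the query text before handing it to BlazingSQL. The sketch below is only an illustration: `rewrite_case_aliases` is a hypothetical helper, not part of BlazingSQL or Calcite, and it assumes that LOWER and UPPER themselves are accepted by the installed BlazingSQL build. It performs a plain textual substitution rather than parsing SQL, so it would also rewrite matches inside string literals or comments.

```python
import re

# Hypothetical helper (not a BlazingSQL API): maps MySQL-style aliases to the
# standard SQL function names.
_ALIASES = {"LCASE": "LOWER", "UCASE": "UPPER"}

# Match the alias name only when it is a whole word followed by "(".
_ALIAS_CALL = re.compile(r"\b(LCASE|UCASE)\s*\(", re.IGNORECASE)


def rewrite_case_aliases(query: str) -> str:
    """Replace LCASE(...)/UCASE(...) calls with LOWER(...)/UPPER(...).

    This is a naive textual rewrite; it does not parse the SQL.
    """
    return _ALIAS_CALL.sub(lambda m: _ALIASES[m.group(1).upper()] + "(", query)


query = """
SELECT
    a,
    LCASE(a)
FROM df
"""

rewritten = rewrite_case_aliases(query)
print(rewritten)
```

With a BlazingContext named `bc` as in the snippet above, the rewritten text could then be passed along as `bc.sql(rewrite_case_aliases(query))`. A proper fix would still register the alias on the Calcite side so that validation accepts LCASE directly.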

derekmorr commented 3 years ago

We should probably also support UCASE as a synonym for UPPER.

derekmorr commented 3 years ago

I can work on this.