BlazingDB / blazingsql

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
https://blazingsql.com
Apache License 2.0
1.92k stars, 181 forks

[BUG]SHA hashing function not working in Blazing SQL #1587

Closed rarajamani closed 2 years ago

rarajamani commented 2 years ago

Hi All

I have a customer account number that I want to hash with SHA-2 before loading it into a GCS bucket. An equivalent function works fine in Hive. I am trying to implement the same logic in BlazingSQL, but I get an error saying "No match found for function signature sha2()". Is there any workaround for implementing a hashing function in BlazingSQL?

Hive - sha2(concat('xyz_',account_number)) as hash_acct_number

My code

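import cudf
import dask_cudf
from blazingsql import BlazingContext

# `client` is an existing Dask distributed client (its setup is not shown)
bc = BlazingContext(dask_client=client)

# Read the CSV on the GPU, then split it into 3 partitions
custDF = cudf.read_csv('/root/document/data/customer_acct.csv', index=False)
custDF = dask_cudf.from_cudf(custDF, npartitions=3)

# Register the table and run the query that fails on sha2()
bc.create_table('customer_acct', custDF)
res = bc.sql('''select acct_number,sha2(concat('xyz_',account_number)) as hash_acct_number from customer_acct''')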

Error : Exception: No match found for function signature sha2(<CHARACTER>)

Sample Customer Acct record from CSV file.
customer_acct
0   12345678901
1   11212345678
2   99988887775
3   55544433322
4   21222334351
5   31323234590

felipeblazing commented 2 years ago

SHA2 is not a supported function in BlazingSQL.

rarajamani commented 2 years ago

OK, thanks. Is there any plan to implement this function in the near future?

felipeblazing commented 2 years ago

Not at the moment.
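
Since sha2() is not available in the SQL layer, one possible workaround (not proposed in the thread) is to compute the hash in plain Python outside the query. The sketch below uses only the standard library's hashlib. The helper name `hash_account` is hypothetical, and SHA-256 is an assumption, because the Hive expression above does not say which SHA-2 variant is wanted.

```python
import hashlib


def hash_account(account_number, prefix="xyz_"):
    """Return the SHA-256 hex digest of prefix + account_number.

    Hypothetical helper; assumes SHA-256 is the intended SHA-2 variant.
    """
    text = f"{prefix}{account_number}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# A few account numbers taken from the sample CSV rows in the question
accounts = [12345678901, 11212345678, 99988887775]

# Map each account number to its 64-character hex digest
hashed = {acct: hash_account(acct) for acct in accounts}

for acct, digest in hashed.items():
    print(acct, digest)
```

This runs on the CPU, so with a cuDF DataFrame the column would first have to be brought to the host (for example with `to_pandas()`), hashed there, and the result registered again with `bc.create_table`. That round trip gives up GPU acceleration for the hashing step, which may or may not be acceptable depending on data size.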