BlazingDB / blazingsql

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
https://blazingsql.com
Apache License 2.0

[BUG] Unable to create table with Hive Cursor #1562

Closed: lucharo closed this issue 3 years ago

lucharo commented 3 years ago

Describe the bug I am trying out the code to create tables in the BlazingContext with a pyhive cursor from here, to no avail. The code below runs, but the table does not get created within the BlazingContext:

bc.create_table('my_table', cursor, hive_table_name = 'my_table1', hive_database_name = 'my_schema')
bc.list_tables()
# returns []

No error or warning is raised either, and the same BlazingContext works when passing HDFS locations.

Steps/Code to reproduce bug

from getpass import getuser  # needed for getuser() below

from blazingsql import BlazingContext
from pyhive import hive

con = hive.Connection(
        host="{hive_edge_node_url}",
        username = getuser(),
        auth='KERBEROS',
        kerberos_service_name="hive",
        configuration = {'hive.execution.engine': "tez", 'tez.queue.name': "group1"}
    )

bc = BlazingContext()

bc.create_table('bliblu',
                con,
                hive_table_name = 'my_table1',
                hive_database_name = 'my_schema')

bc.list_tables()
# returns []

Expected behavior Either an error if the table was not created successfully, or the table being created properly.
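
Until the library reports this itself, a caller-side check can turn the silent failure into an error. The following is a minimal sketch, not part of BlazingSQL: `create_table_checked` is a hypothetical helper, and it assumes only the `create_table` and `list_tables` methods already used above.

```python
def create_table_checked(bc, table_name, source, **kwargs):
    """Create a table and raise if it is not listed afterwards.

    Hypothetical helper, not part of BlazingSQL. `bc` is assumed to be a
    BlazingContext-like object exposing create_table() and list_tables().
    """
    bc.create_table(table_name, source, **kwargs)
    # list_tables() is assumed to return the registered table names.
    if table_name not in bc.list_tables():
        raise RuntimeError(
            f"table {table_name!r} was not registered; "
            f"source was of type {type(source).__name__}"
        )
```

Including the type of `source` in the message makes it easier to spot that something other than a cursor was passed in.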

Environment overview (please complete the following information)

BlazingSQL version (git hash): ff4ece0366a4d76bf533baeb03dd03bdfc5232be
BlazingSQL branch name: HEAD
BlazingSQL branch tag: v0.19.0
BlazingSQL build id: 0
BlazingSQL compiler version: GNU /usr/bin/c++ 7.5.0
BlazingSQL cuda flags: -Xcompiler -Wno-parentheses -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 --expt-extended-lambda --expt-relaxed-constexpr -Werror=cross-execution-space-call -Xcompiler -Wall,-Wno-error=deprecated-declarations --default-stream=per-thread -DHT_DEFAULT_ALLOCATOR
BlazingSQL Operating system kernel: Linux-5.4.0-1038-aws
BlazingSQL Operating system architecture: x86_64
BlazingSQL Linux Operating system release: NAME=Ubuntu|VERSION=16.04.7 LTS (Xenial Xerus)|ID=ubuntu|ID_LIKE=debian|PRETTY_NAME=Ubuntu 16.04.7 LTS|VERSION_ID=16.04|HOME_URL=http://www.ubuntu.com/|SUPPORT_URL=http://help.ubuntu.com/|BUG_REPORT_URL=http://bugs.launchpad.net/ubuntu/|VERSION_CODENAME=xenial|UBUNTU_CODENAME=xenial
None

lucharo commented 3 years ago

My bad. I was using hive.Connection, which returns an object of type pyhive.hive.Connection, and passing it directly, whereas the documentation passes an object of type pyhive.hive.Cursor. The cursor can be fetched from the connection by running con.cursor().
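
The fix described above amounts to passing `con.cursor()` instead of `con`. A small guard can catch the mix-up before `create_table` is called. This is a sketch under stated assumptions: `ensure_cursor` is a hypothetical helper, and it uses duck typing (a DB-API cursor exposes `execute`, a connection exposes `cursor()`) rather than importing pyhive.

```python
def ensure_cursor(conn_or_cursor):
    """Return a DB-API style cursor, given either a cursor or a connection.

    Hypothetical helper for illustration; it only inspects attribute names.
    """
    # A cursor can execute statements directly, so pass it through.
    if hasattr(conn_or_cursor, "execute"):
        return conn_or_cursor
    # A connection cannot execute, but can hand out a cursor.
    if hasattr(conn_or_cursor, "cursor"):
        return conn_or_cursor.cursor()
    raise TypeError(
        f"expected a connection or cursor, got {type(conn_or_cursor).__name__}"
    )
```

With this in place, the call from the report would read `bc.create_table('my_table', ensure_cursor(con), hive_table_name = 'my_table1', hive_database_name = 'my_schema')`.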