heavyai / pymapd

Python client for OmniSci GPU-accelerated SQL engine and analytics platform
https://pymapd.readthedocs.io/en/latest/
Apache License 2.0

ValueError: Invalid shared memory key 0 when trying to create dataframe from arrow::Buffer of size 0. #262

Open wamsiv opened 5 years ago

wamsiv commented 5 years ago

Describe the bug

I am trying to project columns from an empty table through a select statement. Because the table is empty, I should get a data frame with the columns from the schema and zero rows.

Steps/Code to reproduce bug

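import pymapd

# Connect to a local OmniSci server
con = pymapd.connect(user="admin", dbname="omnisci", password="HyperInteractive", port=6274, host="localhost")

# Recreate the table so that it is guaranteed to be empty
con.execute("Drop table if exists chelsea;")
con.execute("Create table chelsea(a int)")

empty_select_query = "select a from chelsea;"

# Fails with the ValueError shown in the traceback that follows
df = con.select_ipc(empty_select_query)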
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-ff87db76b0ba> in <module>
----> 1 df = con.select_ipc(empty_select_query)

~/miniconda3/envs/cudf/lib/python3.6/site-packages/pymapd/connection.py in select_ipc(self, operation, parameters, first_n, release_memory)
    380 
    381         sm_buf = load_buffer(tdf.sm_handle, tdf.sm_size)
--> 382         df_buf = load_buffer(tdf.df_handle, tdf.df_size)
    383 
    384         schema = _load_schema(sm_buf[0])

~/miniconda3/envs/cudf/lib/python3.6/site-packages/pymapd/ipc.py in load_buffer(handle, size)
     46     shmid = shmget(shmkey, size, 0)
     47     if shmid == -1:
---> 48         raise ValueError("Invalid shared memory key {}".format(shmkey))
     49 
     50     # With id of shared memory segment, attach to Python process

ValueError: Invalid shared memory key 0

When a user selects from an empty table, we do not send any records. Instead, we initialize an Arrow buffer of size 0 and pass its pointers along with the schema:

ARROW_THROW_NOT_OK(arrow::AllocateBuffer(0, &serialized_records));
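
One possible client-side guard, as a minimal sketch only: skip the shared-memory lookup whenever the reported size is 0 and hand back an empty buffer instead. The names `load_buffer_guarded`, `attach`, and `failing_attach` are hypothetical and not part of pymapd; `attach` stands in for whatever shmget/shmat logic the real `load_buffer` performs.

```python
def load_buffer_guarded(handle, size, attach):
    """Return the bytes behind a shared-memory handle, or b"" when size is 0.

    `attach` is a hypothetical callable standing in for the real
    shared-memory lookup; it is only invoked for non-empty buffers.
    """
    if size == 0:
        # Assumption based on the report above: a zero-size result has no
        # usable segment, so looking up its key can only fail.
        return b""
    return attach(handle, size)


def failing_attach(handle, size):
    # Mimics the error path from the traceback for a key without a segment.
    raise ValueError("Invalid shared memory key {}".format(handle))


# The zero-size case never reaches the lookup, so no error is raised.
empty = load_buffer_guarded(0, 0, failing_attach)
```

With a guard of this kind, the caller would still need to build the result from the schema buffer alone, since the data buffer is empty.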

Expected behavior

The user should get an empty data frame with the columns from the schema, like:

> print(df)
a
---
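
For illustration, the expected result could be built with pandas roughly as follows. This is a sketch under assumptions: `empty_frame` is a hypothetical helper, and the schema is given here as plain (column name, dtype) pairs rather than the Arrow schema that pymapd actually receives.

```python
import pandas as pd


def empty_frame(schema):
    """Build a zero-row DataFrame from (column_name, dtype) pairs."""
    # Each column is an empty Series that still carries its dtype.
    return pd.DataFrame({name: pd.Series(dtype=dtype) for name, dtype in schema})


# Schema of the `chelsea` table from the reproduction: one int column `a`.
df = empty_frame([("a", "int32")])
print(df.columns.tolist(), len(df))
```

The frame keeps the column name and dtype even though it has no rows, which is the behavior requested here.
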
randyzwitch commented 5 years ago

@wamsiv Is this something you can take a crack at fixing? I understand the problem you are highlighting; it's just not immediately obvious to me how to fix it.

wamsiv commented 5 years ago

Sure, I can take it.