Describe the bug
When I create a DataFrame from a pyarrow.Table that has one or more columns but zero rows, ctx.from_arrow_table panics at src/context.rs:294.
To Reproduce
>>> import datafusion as df
>>> import pyarrow as pa
>>> ctx = df.SessionContext()
>>> import pandas as pd
>>> pandas_df = pd.DataFrame({'col': []})
>>> emptyTable = pa.Table.from_pandas(pandas_df)
>>> emptyTable
pyarrow.Table
col: double
----
col: [[]]
>>> ctx.from_arrow_table(emptyTable)
thread '<unnamed>' panicked at src/context.rs:294:37:
index out of bounds: the len is 0 but the index is 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
pyo3_runtime.PanicException: index out of bounds: the len is 0 but the index is 0
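The panic message says that element 0 of an empty collection was indexed. My guess, which I have not confirmed against the source, is that the schema is being read from the first record batch, and a zero-row table yields no batches. The sketch below models that failure mode in plain Python. `FakeTable`, `schema_from_first_batch`, and `schema_from_table` are hypothetical names for illustration only and are not taken from the datafusion-python code.

```python
# Hypothetical model of the suspected failure. None of these names come
# from datafusion-python; they only illustrate the indexing pattern.
from dataclasses import dataclass, field


@dataclass
class FakeTable:
    """Stand-in for a table: a schema plus a list of row batches."""
    schema: tuple
    batches: list = field(default_factory=list)


def schema_from_first_batch(table):
    # Fragile: assumes at least one batch exists.
    return table.batches[0]["schema"]


def schema_from_table(table):
    # Robust: the schema is known even when there are no batches.
    return table.schema


# A table with one column and zero rows, hence zero batches.
empty = FakeTable(schema=(("col", "double"),))

try:
    schema_from_first_batch(empty)
except IndexError as exc:
    print("fragile lookup failed:", exc)

print("robust lookup:", schema_from_table(empty))
```

If the real cause matches this guess, taking the schema from the table itself rather than from its first batch would avoid the out-of-bounds access.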
Expected behavior
I expect this to create a DataFrame with zero rows, such as the following (created via .limit(0) from a non-empty DataFrame):
>>> empty
DataFrame()
++
++
>>> empty.describe()
DataFrame()
+------------+-----+
| describe   | col |
+------------+-----+
| count      | 0.0 |
| null_count | 0.0 |
| mean       |     |
| std        |     |
| min        |     |
| max        |     |
| median     |     |
+------------+-----+