Closed jychen7 closed 2 years ago
I believe this can work using PyO3 class inheritance (https://pyo3.rs/v0.15.1/class.html?highlight=inheri#inheritance), so I am closing this for now.
I tried both the inheritance and the non-inheritance approach. Both compile, but pytest still reports an error.
With inheritance, ctx.register_table("weather_balloons", bigtable_table)
raises TypeError: argument 'table': 'BigtableTable' object cannot be converted to 'Table'
https://github.com/datafusion-contrib/datafusion-bigtable/blob/014d02f26800402d37638113948d07197fb7b201/python/src/datasource.rs#L11-L12
Without inheritance, ctx.register_table("weather_balloons", bigtable_table.to_pytable())
raises TypeError: argument 'table': 'Table' object cannot be converted to 'Table'
https://github.com/datafusion-contrib/datafusion-bigtable/blob/fb2c794a33b5ee9234f7a9e24f2afebc7e17a7fb/python/src/datasource.rs#L56-L58
I tried register_csv and then passed the resulting PyTable to register_table as t1, and that works. The odd thing is that, in the following log, t1 and t2 appear to have the same class/type, but register_table fails for t2.
(Pdb) ctx.register_csv("temp", "/path/to/temp.csv")
(Pdb) t1 = ctx.catalog().database("public").table("temp")
(Pdb) t1
<datafusion.Table object at 0x1055086f0>
(Pdb) ctx.register_table("t1", t1)
(Pdb) ctx.tables()
{'t1', 'temp'}
(Pdb) t2 = bigtable_table.to_pytable()
(Pdb) t2
<datafusion.Table object at 0x1055085a0>
(Pdb) ctx.register_table("t2", t2)
*** TypeError: argument 'table': 'Table' object cannot be converted to 'Table'
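One way to check whether t1 and t2 really share a type is to compare the type objects themselves rather than their printed names. The following is a minimal sketch, not part of datafusion-python: type_identity and same_type_object are hypothetical helper names, and the two look-alike Table classes in the usage note are stand-ins created only for illustration.

```python
def type_identity(obj):
    """Return details that tell apart type objects sharing the same name."""
    t = type(obj)
    return {
        "module": t.__module__,      # where the type claims to be defined
        "qualname": t.__qualname__,  # the name shown in the repr
        "id": id(t),                 # identity of the type object itself
    }


def same_type_object(a, b):
    """True only if both objects are instances of the very same type object."""
    return type(a) is type(b)
```

For example, with TableA = type("Table", (), {}) and TableB = type("Table", (), {}), both instances print as Table objects, yet same_type_object(TableA(), TableB()) is False. In the session above, the equivalent check would be type(t1) is type(t2).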
@Jimexist, sorry to bother you. I just wonder whether you have any idea how to resolve the type conversion error in https://github.com/datafusion-contrib/datafusion-python/issues/45#issuecomment-1087051568. (I am not sure whether it is a limitation of pyo3 or whether I am missing something; it seems almost there.)
It looks like this is not supported in pyo3. According to https://github.com/PyO3/pyo3/issues/1444, even though datafusion-bigtable uses PyTable from datafusion-python, after compilation pyo3 treats the two PyTable types as different types:
The key issue is that #[pyclass] stores the pyclass type object in static storage. This means that (if Rust's usual rlib linkage is used) packages A and B will have their own copies of the MyClass type object, and Python will think that they're actually different types coming from the two packages.
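The effect described in the quote can be imitated in pure Python. This is only an analogy under stated assumptions, not how pyo3 is implemented: make_extension_module is a hypothetical stand-in for one compiled extension that carries its own copy of the class, and register_table here is a hypothetical function that mimics a type check on its argument.

```python
def make_extension_module():
    """Stand-in for one compiled extension: each call creates its own Table type."""
    class Table:
        pass
    return Table


# Two "packages", each with a separate copy of the same class definition.
DatafusionTable = make_extension_module()
BigtableCopyOfTable = make_extension_module()


def register_table(name, table):
    """Hypothetical registration that accepts only the first package's Table."""
    if not isinstance(table, DatafusionTable):
        raise TypeError(
            f"argument 'table': '{type(table).__name__}' object "
            f"cannot be converted to '{DatafusionTable.__name__}'"
        )
    return name
```

Calling register_table("t1", DatafusionTable()) succeeds, while register_table("t2", BigtableCopyOfTable()) raises a TypeError whose message names 'Table' on both sides, because the two classes share a name but are distinct type objects.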
Background
I would like to use datafusion-python to query Bigtable. In Rust, datafusion-bigtable has implemented BigtableDataSource as a custom TableProvider.
Problem
I tried to add register_table in https://github.com/datafusion-contrib/datafusion-python/pull/46 and expose a Python BigtableTable in datafusion-bigtable at https://github.com/datafusion-contrib/datafusion-bigtable/pull/3.
The problem is how to convert a Python BigtableTable to a Python Table. Alternatively, how can a Rust TableProvider be serialized/deserialized to some Python object?
The following is a non-working example, because bigtable.table() is a TableProvider (Rust) and has no corresponding Python object