Closed SebAlbert closed 6 days ago
As the import error mentions, pyarrow is required for the DBAPI-compatible interface, but you can use the lower-level interface without pyarrow. That is why it is not listed as a required dependency by default, although this leads to the suboptimal user experience you had, since most users will actually want pyarrow available so they can use the DBAPI interface.
But precisely because pyarrow is quite a heavy dependency, as you mention, we want to avoid forcing everyone to pull it in. At the moment there is no more minimal way to install pyarrow using pip (there is work in progress to remove the numpy dependency, and to split the wheel so that a more minimal set of pyarrow functionality can be installed, but that will not happen in the short term).
Depending on your use case / what you want to do with the resulting Arrow data from your query, you could look into other Arrow implementations with Python bindings, such as nanoarrow.
It's also possible that we could provide the DBAPI layer using just nanoarrow, either now or soon.
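To make the distinction above concrete, here is a minimal sketch of how a script could check up front whether pyarrow is importable and decide which interface to use, instead of hitting the import error at runtime. The helper names `has_module` and `choose_adbc_interface` are hypothetical and not part of any ADBC package; only the standard-library `importlib.util.find_spec` is used.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


def choose_adbc_interface() -> str:
    """Hypothetical helper: report which interface this environment supports.

    Returns "missing-driver" if adbc_driver_postgresql is not installed,
    "dbapi" if pyarrow is also available, and "low-level" otherwise.
    """
    if not has_module("adbc_driver_postgresql"):
        return "missing-driver"
    if has_module("pyarrow"):
        # The DBAPI-compatible interface needs pyarrow.
        return "dbapi"
    # Without pyarrow, only the lower-level interface is usable.
    return "low-level"


print(choose_adbc_interface())
```

The check only looks up the module specs, so it does not import pyarrow itself and stays cheap even when pyarrow is installed.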
What would you like help with?
When installing (via `pip`) the package `adbc_driver_postgresql`, I get a runtime error from an import that suggests (and is indeed fixed by) also installing `pyarrow` via pip:
Should this not be a declared requirement of the Python package in the first place?
On the other hand, is there a more minimal way than installing `pyarrow` with 40 MB, which in turn ties in `numpy` with another 18 MB? It "feels" quite heavy.