Open keller-mark opened 3 months ago
The problem is that DuckDB loads its own instance of Arrow and then does instanceof checks against those symbols. If you load Arrow from a different URL, it becomes a separate module and those checks will fail.
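To make the failure mode concrete, here is a minimal sketch that simulates loading the same module twice. The names loadModule and Table are hypothetical stand-ins rather than the real Apache Arrow exports, and the factory function only approximates what happens when one library is imported from two different URLs.

```javascript
// Stand-in for evaluating the same module source twice (e.g. from two URLs).
// `loadModule` and `Table` are hypothetical names, not the Apache Arrow API.
function loadModule() {
  class Table {
    constructor(rows) {
      this.rows = rows;
    }
  }
  return { Table };
}

const arrowA = loadModule(); // the copy the consumer checks against
const arrowB = loadModule(); // the copy loaded from a different URL

const table = new arrowB.Table([{ x: 1 }]);

console.log(table instanceof arrowB.Table); // true
console.log(table instanceof arrowA.Table); // false: same source, different class identity
```

Each evaluation creates its own class object, so an instance from one copy never passes an instanceof check against the other.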
The workaround (for now) is to load the exact same module:
arrow = import('https://cdn.observableusercontent.com/npm/apache-arrow@11.0.0/+esm')
For what it’s worth, this isn’t an issue with Observable Framework because we expressly override dependency resolution to ensure a consistent version of Apache Arrow. It would be better if DuckDB used duck testing instead of instanceof, though. (And c’mon, you’d think DuckDB would know to use “duck” testing… 🦆)
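For illustration, a duck test could look roughly like the sketch below. The helper name looksLikeArrowTable and the choice of properties to probe (schema.fields, getChild, toArray) are assumptions for this example; this is not code taken from DuckDB or Arrow.

```javascript
// Hypothetical structural ("duck") check: it inspects the shape of the value
// instead of its class identity, so it does not care which copy of Arrow
// created the object. The probed properties are an assumption for this sketch.
function looksLikeArrowTable(value) {
  return (
    value != null &&
    typeof value.getChild === "function" &&
    typeof value.toArray === "function" &&
    value.schema != null &&
    Array.isArray(value.schema.fields)
  );
}

// A plain object with the right shape passes, whatever its constructor is.
const tableLike = {
  schema: { fields: [{ name: "x" }] },
  getChild: (name) => null,
  toArray: () => []
};

console.log(looksLikeArrowTable(tableLike)); // true
console.log(looksLikeArrowTable([1, 2, 3])); // false
```

The trade-off is that a structural check can be fooled by any object with the same shape, which is usually acceptable when the goal is to tolerate multiple copies of one library.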
Though, there is a separate issue with Observable Framework which is that db.describeTables is currently broken because we’ve switched to returning Arrow tables from queries for performance. But I have a fix for that latter issue up at https://github.com/observablehq/framework/pull/1068.
Thanks for the info and the workaround! It seems the instanceof checks are happening within the Arrow source code (possibly one of these lines) and not the DuckDB source code (insertArrowTable source), so another workaround is to run arrow.tableToIPC (followed by conn.insertArrowFromIPCStream) using the same Arrow library instance that was used to run arrow.tableFromArrays (example in notebook). Since buffer is a Uint8Array, there are no instanceof issues (though at the same time, it could potentially be any Uint8Array).
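A rough sketch of that IPC workaround, written as a helper that takes the Arrow module and the connection as arguments. The helper name insertViaIPC and the table name in the usage note are hypothetical; it assumes arrow is the same module instance that built the table and conn is a DuckDB-Wasm connection.

```javascript
// Sketch of the IPC round trip. `insertViaIPC` is a hypothetical helper name.
// Assumptions: `arrow` is the same Apache Arrow module instance that created
// `table`, and `conn` is a DuckDB-Wasm connection.
async function insertViaIPC(arrow, conn, table, name) {
  // Serialize with the library instance that created the table, so any
  // instanceof checks inside Arrow see matching classes.
  const buffer = arrow.tableToIPC(table, "stream");
  // The buffer is a plain Uint8Array, so no cross-module class check applies.
  await conn.insertArrowFromIPCStream(buffer, { name });
  return buffer.byteLength;
}
```

Usage would be along the lines of await insertViaIPC(arrow, conn, arrow.tableFromArrays({x: [1, 2, 3]}), "my_table"), where my_table is a placeholder name.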
Is your feature request related to a problem? Please describe.
Is DuckDBClient.of() intended to work with arrow Table objects?
The following code snippet returns an empty list of tables.
All examples I can find use FileAttachments. The stdlib source code seems to indicate this is possible, but I cannot find examples or tests to reference.
Describe the solution you'd like
DuckDBClient.of with Arrow table objects directly (rather than FileAttachments), specifically whether a CREATE TABLE step is required vs. implicit based on the Arrow table schema.
Describe alternatives you've considered
Additional context
Minimal reproducer: https://observablehq.com/d/e21c08e832074f40