observablehq / stdlib

The Observable standard library.
https://observablehq.com/@observablehq/standard-library
ISC License
957 stars 83 forks source link

duckdb-wasm@1.24 depends on arrow11 #359

Closed Fil closed 1 year ago

Fil commented 1 year ago

duckdb-wasm@1.24 depends on arrow11 (after https://github.com/duckdb/duckdb-wasm/commit/18dd6c7096fe888dbdfb7e9a1cdd559560d53c97); because of this, with #356 loadDataSource does not create tables from arrays any more.

In this tentative PR I add full support for arrow11 on top of the existing arrow9 and arrow4. It might be enough to support arrow11 only in the case of duckdb, but supporting it as a filetype felt the right thing to do. (Not tested though.)

DEMO: https://observablehq.com/@observablehq/duckdb-1-24-breaks-insertion-in-sql-mode

Here's a unit test I wanted to add, unfortunately it fails because loadArrow() wants an http-resource.

--- a/test/query-test.js
+++ b/test/query-test.js
@@ -1,5 +1,5 @@
 import assert from "assert";
-import {__query} from "../src/table.js";
+import {__query, loadDataSource} from "../src/table.js";
 import it, {invalidator} from "./invalidation.js";

@@ -109,3 +109,12 @@ describe("__query.sql", () => {
+
+describe("loadDataSource", () => {
+  it("loads an array as a sql table", async () => {
+    const [invalidation] = invalidator();
+    const data = [{x: 1}, {x: 2}];
+    const db = await loadDataSource(data, "sql", "data");
+    assert.strictEqual(await __query.sql(db, invalidation)`SELECT * FROM data`, data); // TODO
+  });
+});
mootari commented 1 year ago

Listed below are the JS related changes in the latest apache-arrow releases.

10.0.0:

10.0.1:

11.0.0:

annie commented 1 year ago

tested locally and this looks good from a Data Table & SQL cell perspective. i'm not sure what else needs to be tested though – what parts are you unsure about @Fil?

Fil commented 1 year ago

The code works in all my tests too— the parts that I'm not sure of are:

  1. case 11: { … }: where would that 11 come from? A fileAttachement.arrow({version: 11}) call but on what file? I haven't been able to test this, and not sure it does anything.

Testing now… I have an (old) arrow file on my hard drive, adding it as a FileAttachment and calling:

both seem to return the same thing. The only difference I can spot is (of course) that they have different prototypes, because they're coming from different code bases.

Maybe using arrow11 for version: 9 works too, and we could remove the dependency on version 9. We would need to just default case 9 to version 11.

  1. How to create unit tests so that this does not happen again (see the tentative test in the issue description)
mbostock commented 1 year ago

Maybe using arrow11 for version: 9 works too, and we could remove the dependency on version 9.

Please don’t do this; it breaks backwards compatibility. Just add Arrow 11 as a new dependency.

annie commented 1 year ago

re: tests – i'm working on adding an integration test for arrays in a SQL cell. that should catch any issues at the point where we upgrade stdlib in the monorepo, although a unit test here would catch it earlier, so that would be great too.

Fil commented 1 year ago

OK. So I guess that standing question is, is there any added value of adding {version: 11} as an option to FileAttachment.arrow()? It felt "wrong" not to add it (in the sense that we would be using 11 internally, but not offer it as an option), but I don't know what concrete difference it makes.

mbostock commented 1 year ago

Yes, we should add it; it is needed by the compiler so we’ll need a corresponding PR over there.

annie commented 1 year ago

are we good to re-upgrade stdlib in the monorepo, or do we need the compiler update to go in first?

mbostock commented 1 year ago

Upgrade stdlib first to add Arrow 11, then followup with the compiler update to use it. 👍

annie commented 1 year ago

@mbostock sorry, could you say more about what we need to update in the compiler? i tested this upgrade locally and SQL cells seem to be working as expected. FileAttachment.arrow({version: 11}) works too. is there something else that needs to be updated?

mbostock commented 1 year ago

@annie My mistake. I was thinking of chart cells, but we don’t support Arrow tables as data sources for chart cells yet, so no compiler changes are necessary. This change should stand on its own! 👍

annie commented 1 year ago

ok awesome, thanks for clarifying!