serialize DuckDBClient.sql

observablehq / framework

A static site generator for data apps, dashboards, reports, and more. Observable Framework combines JavaScript on the front-end for interactive graphics with any language on the back-end for data analysis.

https://observablehq.com/framework/

ISC License

2.57k stars, 122 forks

serialize DuckDBClient.sql #1728

Closed. mbostock closed this 1 week ago.

mbostock commented 1 month ago

Ref #1469. This uses a WeakMap keyed by the template literal’s strings array, effectively creating one queue per SQL cell and per sql tagged template literal. I’d prefer to implement a more general solution that also helps with e.g. concurrent fetch (maybe in the Observable Runtime?), but this is an easy fix for what is probably the most common instance of this problem. The performance of the mag slider on the SQL page is dramatically improved under fast interaction.
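For readers following along, here is a minimal sketch of the general idea of a per-call-site queue, assuming a plain promise chain stored in a WeakMap. The names queues, enqueue, and cell are hypothetical, the sql function is a stand-in rather than DuckDBClient.sql, and the actual change in this PR may differ (for example in how it treats superseded queries).

```javascript
// Hypothetical sketch, not the PR's code: one promise chain per strings array.
const queues = new WeakMap(); // strings array -> promise of the most recently queued task

function enqueue(strings, task) {
  const previous = queues.get(strings) ?? Promise.resolve();
  const run = () => task();
  // Start the new task only once the previous one from this call site has settled.
  const result = previous.then(run, run);
  queues.set(strings, result);
  return result;
}

// Stand-in for a query function; it only records the order in which tasks start.
const started = [];
function sql(strings, ...params) {
  return enqueue(strings, async () => {
    started.push(params[0]);
    return params;
  });
}

// A "cell" that reruns with new input always reuses the same call site,
// so its queries line up in the same queue.
function cell(value) {
  return sql`SELECT ${value}`;
}

const first = cell(1);
const second = cell(2);
Promise.all([first, second]).then(() => console.log(started));
```

Because the key is the strings array itself, the map entry can be garbage collected together with the code that owns the template literal.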

Fil commented 1 month ago

The crucial thing here is that even if another cell uses the same query (SELECT * FROM x WHERE y), its strings will not be === to those of the first query, because a different array will have been created; so no queuing happens between the two independent queries. 👏
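A small sketch of that identity property, using a hypothetical tag function that simply returns the strings array it receives. Two separate call sites with identical query text get distinct arrays, so a WeakMap keyed by the array treats them as unrelated.

```javascript
// Hypothetical tag that returns the strings array it was called with.
const tag = (strings) => strings;

// Two independent call sites with the same query text.
const a = tag`SELECT * FROM x WHERE y`;
const b = tag`SELECT * FROM x WHERE y`;

console.log(a[0] === b[0]); // true: the text is identical
console.log(a === b); // false: each call site has its own array

// A WeakMap keyed by the array therefore keeps the two sites apart.
const queues = new WeakMap();
queues.set(a, "queue for the first cell");
console.log(queues.has(b)); // false
```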

mbostock commented 1 month ago

Unfortunately I think this approach has a fundamental flaw as exhibited in this example code:

Promise.all([1, 2, 3].map((i) => sql`SELECT ${i}`))

This code will throw Error: invalidated because the i = 1 and i = 2 cases are “invalidated” by the subsequent i = 3. In other words, the approach in this PR works only if the sql template literal is invoked only once per code block, and any case in which the sql template literal exists within a loop will cause problems.
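The failure can be reproduced with a toy model. The sketch below assumes a "latest call wins" policy per strings array, which matches the behavior described here but is not the PR's actual code; latest, tokens, and this sql function are hypothetical stand-ins.

```javascript
// Hypothetical model: a newer call from the same call site invalidates the older one.
const latest = new WeakMap(); // strings array -> token of the most recent call
const tokens = []; // kept only so the outcome can be inspected

function sql(strings, ...params) {
  const previous = latest.get(strings);
  if (previous) previous.invalidated = true; // superseded by this call
  const token = {invalidated: false};
  latest.set(strings, token);
  tokens.push(token);
  // The "query" runs in a later microtask, after the loop has finished.
  return Promise.resolve().then(() => {
    if (token.invalidated) throw new Error("invalidated");
    return params; // stand-in for a query result
  });
}

// All three calls share one strings array, so i = 1 and i = 2 are invalidated.
Promise.all([1, 2, 3].map((i) => sql`SELECT ${i}`)).then(
  (rows) => console.log(rows),
  (error) => console.log(error.message) // logs "invalidated"
);
```

Only the i = 3 call survives, and Promise.all rejects as soon as the first invalidated promise does.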

We could still adapt this fix for SQL cells specifically (by changing the generated code to sql.queue or something), but perhaps I should just look into making the more general runtime-level fix.

Fil commented 1 month ago

Ah… bummer.

If you write Promise.all([1, 2, 3].map((i) => sql(["SELECT", ""], i))) instead it works, because it's a new array each time. MDN explains the difference: “For any particular tagged template literal expression, the tag function will always be called with the exact same literal array, no matter how many times the literal is evaluated.”
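The difference is easy to check with a hypothetical tag function, track, that collects the distinct strings arrays it receives. A tagged template evaluated in a loop reuses one array, while an array literal passed to a plain function call is created anew on each iteration.

```javascript
// Hypothetical tag that records every distinct strings array it is called with.
const seen = new Set();
const track = (strings, ...values) => {
  seen.add(strings);
  return values[0];
};

// Tagged template in a loop: one call site, evaluated three times.
[1, 2, 3].forEach((i) => track`SELECT ${i}`);
const taggedCount = seen.size;
console.log(taggedCount); // 1

// Plain function call: the array literal is re-created on each iteration.
seen.clear();
[1, 2, 3].forEach((i) => track(["SELECT ", ""], i));
const plainCount = seen.size;
console.log(plainCount); // 3
```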

Could we salvage this approach if we somehow passed the variable's v._version to know if we are running in the same call or in a subsequent one?

Fil commented 1 week ago

I believe this was superseded by #1748