observablehq / stdlib

The Observable standard library.
https://observablehq.com/@observablehq/standard-library
ISC License

Add DuckDBClient as a recommended library #310

Closed mkfreeman closed 1 year ago

mkfreeman commented 1 year ago

Resolves #308

Outstanding issues:

domoritz commented 1 year ago

This is great. Thanks for putting this together. Any recommendations for how I can encourage people to get off of the client from https://observablehq.com/@cmudig/duckdb when this pull request is merged? I can add a message at the top but was wondering whether you have some recommendations beyond that.

mootari commented 1 year ago

@domoritz You can prefix the client's cell with a console.warn that issues a deprecation notice. The warning will only be issued the first time the cell is referenced.

You can see an example for this by adding the following two cells to a notebook:

import {signature} from '@mootari/toolbox'
signature
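The idea of a notice that fires only on first reference can be sketched in plain JavaScript. This is a minimal model, not Observable's runtime: `defineCell`, `LegacyClient`, and `legacyClientCell` are hypothetical names, and the warning text is a placeholder. It assumes only that a cell body runs once and its value is reused afterwards.

```javascript
// Hypothetical helper: models a lazily evaluated notebook cell whose body
// runs once, on first reference. Not part of any Observable API.
function defineCell(body) {
  let evaluated = false;
  let value;
  return () => {
    if (!evaluated) {
      value = body();
      evaluated = true;
    }
    return value;
  };
}

// Placeholder for the older client class that is being retired.
class LegacyClient {}

// The "cell": the warning sits in front of the value the cell returns.
const legacyClientCell = defineCell(() => {
  console.warn(
    "This DuckDB client is deprecated; please switch to Observable's DuckDBClient."
  );
  return LegacyClient;
});
```

Calling `legacyClientCell()` repeatedly returns the same class each time, but the warning is logged only on the first call, because the body is never re-run.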

mbostock commented 1 year ago

@domoritz Thanks! I’ll put up a notebook once it’s released and yes, it’d be great if you could add a banner at the top of your notebook that directs people to Observable’s new DuckDBClient. I don’t have any other recommendations at this time, but I’m open to suggestions. Maybe it’d be nice if you could notify people importing your notebook somehow…

domoritz commented 1 year ago

Before you publish the library, would you mind making a pre-release that I can try in a notebook just to see whether there are any issues I find that way?

mbostock commented 1 year ago

Here, try this:

import {DuckDBClient} from "1710adacec0d31ba"

https://observablehq.com/d/1710adacec0d31ba

(Though note that you won’t be able to test it with Arrow file attachments yet, since this PR depends on changes to FileAttachment that allow us to use Apache Arrow v9.)

Here’s an example:

https://observablehq.com/d/c07e0c90b33fb7f9

domoritz commented 1 year ago

c = new DuckDBClient()
c.query(`SELECT
  v::INT AS x,
  (sin(v/50.0) * 100 + 100)::INT AS y
FROM generate_series(0, 1000) AS t(v)`)

doesn't work anymore (TypeError: undefined is not an object (evaluating 'this._db.connect')). Would it make sense to init the db in the constructor?

mbostock commented 1 year ago

c = new DuckDBClient()

You have to pass in an existing AsyncDuckDB instance to the constructor now. The recommended way to create a DuckDBClient is to call DuckDBClient.of(tables), passing in a tables object whose keys correspond to table names and whose values represent tabular data. If you want an empty database, you can say:

c = DuckDBClient.of()

domoritz commented 1 year ago

Yep, that works. I wonder whether it would still be good to just init a db if you don't get one in the constructor. Ignore my suggestion if the common pattern for observable DBs is to not do that.

mbostock commented 1 year ago

I wonder whether it would still be good to just init a db if you don't get one in the constructor.

It requires loading duckdb-wasm, which is async, and constructors have to be synchronous.
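The constraint can be illustrated with a small sketch of the static async factory pattern. This is not the stdlib implementation: `SketchClient`, `loadEngine`, `insert`, `select`, and `rows` are hypothetical names, and the in-memory engine merely stands in for the asynchronous duckdb-wasm load.

```javascript
// Hypothetical stand-in for loading duckdb-wasm: resolves asynchronously
// to a tiny in-memory "engine" that stores rows by table name.
async function loadEngine() {
  const store = new Map();
  return {
    insert: (name, rows) => { store.set(name, rows); },
    select: (name) => (store.has(name) ? store.get(name) : [])
  };
}

class SketchClient {
  // Constructors cannot await, so this only stores an already-loaded engine.
  constructor(db) {
    this._db = db;
  }

  // Async factory: awaits the engine, registers the tables, then constructs.
  static async of(tables = {}) {
    const db = await loadEngine();
    for (const [name, rows] of Object.entries(tables)) {
      db.insert(name, rows);
    }
    return new SketchClient(db);
  }

  // Returns the rows stored under a table name (no SQL parsing in this sketch).
  async rows(name) {
    return this._db.select(name);
  }
}
```

In this sketch, `SketchClient.of({points: [{x: 1}]})` resolves to a usable client, while `new SketchClient()` leaves `_db` undefined, so the first `rows` call rejects with a TypeError, much like the error reported earlier in the thread.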