simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0
9.59k stars 691 forks source link

Design plugin hook for extras #1720

Closed simonw closed 2 years ago

simonw commented 2 years ago

Refs:

I realized that this is a really natural plugin hook - and if I design it as a hook I can implement Datasette's core extras as default plugins.

simonw commented 2 years ago

Places this plugin hook (or hooks?) should be able to affect:

I'm going to combine those last two, which means there are three places. But maybe I can combine the table one and the row one as well?

simonw commented 2 years ago

I'm going to keep table and row separate. So I think I need to add three new plugin hooks:

simonw commented 2 years ago

There are four existing plugin hooks that include the word "extra" but use it to mean something else - to mean additional CSS/JS/variables to be injected into the page:

I think extra_* and *_extras are different enough that they won't be confused with each other.

simonw commented 2 years ago

Actually I'm going to imitate the existing register_* hooks:

So I'm going to call the new hooks:

They'll return a list of async def functions. The names of those functions will become the names of the extras.

simonw commented 2 years ago

What would the existing https://latest.datasette.io/fixtures/simple_primary_key/1.json?_extras=foreign_key_tables feature look like if it was re-imagined as a register_row_extras() plugin?

Rough sketch, copying most of the code from https://github.com/simonw/datasette/blob/579f59dcec43a91dd7d404e00b87a00afd8515f2/datasette/views/row.py#L98

from datasette import hookimpl

@hookimpl
def register_row_extras(datasette):
    return [foreign_key_tables]

async def foreign_key_tables(datasette, database, table, pk_values):
    if len(pk_values) != 1:
        return []
    db = datasette.get_database(database)
    all_foreign_keys = await db.get_all_foreign_keys()
    foreign_keys = all_foreign_keys[table]["incoming"]
    if len(foreign_keys) == 0:
        return []

    sql = "select " + ", ".join(
        [
            "(select count(*) from {table} where {column}=:id)".format(
                table=escape_sqlite(fk["other_table"]),
                column=escape_sqlite(fk["other_column"]),
            )
            for fk in foreign_keys
        ]
    )
    try:
        rows = list(await db.execute(sql, {"id": pk_values[0]}))
    except QueryInterrupted:
        # Almost certainly hit the timeout
        return []

    foreign_table_counts = dict(
        zip(
            [(fk["other_table"], fk["other_column"]) for fk in foreign_keys],
            list(rows[0]),
        )
    )
    foreign_key_tables = []
    for fk in foreign_keys:
        count = (
            foreign_table_counts.get((fk["other_table"], fk["other_column"])) or 0
        )
        key = fk["other_column"]
        if key.startswith("_"):
            key += "__exact"
        link = "{}?{}={}".format(
            self.ds.urls.table(database, fk["other_table"]),
            key,
            ",".join(pk_values),
        )
        foreign_key_tables.append({**fk, **{"count": count, "link": link}})
    return foreign_key_tables
simonw commented 2 years ago

Passing pk_values to the plugin hook feels odd. I think I'd pass a row object instead and let the code look up the primary key values on that row (by introspecting the primary keys for the table).

simonw commented 2 years ago

Let's try sketching out a register_table_extras plugin for something new.

The first idea I came up with suggests adding new fields to the individual row records that come back - my mental model for extras so far has been that they add new keys to the root object.

So if a table result looked like this:

{
  "rows": [
    {"id": 1, "name": "Cleo"},
    {"id": 2, "name": "Suna"}
  ],
  "next_url": null
}

I was initially thinking that ?_extra=facets would add a "facets": {...} key to that root object.

Here's a plugin idea I came up with that would probably justify adding to the individual row objects instead:

This could also work by adding a "check404s": {"url-here": 200} key to the root object though.

I think I need some better plugin concepts before committing to this new hook. There's overlap between this and how I want the enrichments mechanism (see here) to work.

simonw commented 2 years ago

Some of the things I'd like to use ?_extra= for, that may or not make sense as plugins:

Looking at https://github-to-sqlite.dogsheep.net/github/commits.json?_labels=on&_shape=objects for inspiration.

I think there's a separate potential mechanism in the future that lets you add custom columns to a table. This would affect .csv and the HTML presentation too, which makes it a different concept from the ?_extra= hook that affects the JSON export (and the context that is fed to the HTML templates).

simonw commented 2 years ago

Sketching out a ?_extra=statistics table plugin:

from datasette import hookimpl

@hookimpl
def register_table_extras(datasette):
    return [statistics]

async def statistics(datasette, query, columns, sql):
    # ... need to figure out which columns are integer/floats
    # then build and execute a SQL query that calculates sum/avg/etc for each column
simonw commented 2 years ago

Had a thought: if a custom HTML template is going to make use of stuff generated using these extras, it will need a way to tell Datasette to execute those extras even in the absence of the ?_extra=... URL parameters.

Is that necessary? Or should those kinds of plugins use the existing extra_template_vars hook instead?

Or maybe the extra_template_vars hook gets redesigned so it can depend on other extras in some way?

simonw commented 2 years ago

I bet there's all kinds of interesting potential extras that could be calculated by loading the results of the query into a Pandas DataFrame.

simonw commented 2 years ago

The proposed plugin for annotations - allowing users to attach comments to database tables, columns and rows - would be a great application for all three of those ?_extra= plugin hooks.

simonw commented 2 years ago

I think the rough shape of the three plugin hooks is right. The detailed decisions that are needed concern what the parameters should be, which I think will mainly happen as part of:

simonw commented 2 years ago

Closing this because I have a good enough idea of the design for now - the details of the parameters can be figured out when I implement this.