Open asg017 opened 1 year ago
We discussed this in person this morning and these notes reflect what we talked about perfectly.
I've had so many bugs with plugins that I've written myself that have forgotten to special-case the `_internal` database when looping through `datasette.databases.keys()` - removing it from there entirely would help a lot.
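A minimal sketch of the special-casing described above. A plain dict stands in for `datasette.databases` here, and the helper name `user_facing_database_names` is hypothetical, not part of Datasette:

```python
# A plain dict standing in for datasette.databases (assumption for illustration:
# only the keys matter here, the values are placeholders).
databases = {"_internal": object(), "fixtures": object(), "foia": object()}


def user_facing_database_names(databases):
    """Return database names, skipping the internal one.

    Every plugin that loops over the databases currently needs a filter
    like this; forgetting it is the class of bug described above.
    """
    return [name for name in databases.keys() if name != "_internal"]


names = user_facing_database_names(databases)
```

If `_internal` were removed from `datasette.databases`, the filter would no longer be needed.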
Just one tiny disagreement: for `datasette-comments` I think having it store things in `_internal` could be an option, but in most cases I expect users to choose NOT to do that, because being able to join against those tables for more advanced queries is going to be super useful.
"Show me all rows in `foia_requests` with at least one associated comment in `datasette_comments.comments`" kind of thing.
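That kind of query could look roughly like the sketch below, using Python's `sqlite3` module with the comments database attached under the name `datasette_comments`. The column names (`target_table`, `target_row_id`, `body`, `title`) and the sample rows are assumptions made up for illustration; the real plugin schema may differ.

```python
import sqlite3

# Main database holds foia_requests; the comments live in a second,
# attached database so the two can be joined in one query.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS datasette_comments")
conn.executescript(
    """
    CREATE TABLE foia_requests (id INTEGER PRIMARY KEY, title TEXT);
    -- Hypothetical schema for the comments table
    CREATE TABLE datasette_comments.comments (
        id INTEGER PRIMARY KEY,
        target_table TEXT,
        target_row_id INTEGER,
        body TEXT
    );
    INSERT INTO foia_requests (id, title) VALUES
        (1, 'Request A'), (2, 'Request B'), (3, 'Request C');
    INSERT INTO datasette_comments.comments (target_table, target_row_id, body) VALUES
        ('foia_requests', 1, 'Any update?'),
        ('foia_requests', 1, 'Still waiting'),
        ('foia_requests', 3, 'Received');
    """
)

# Rows in foia_requests that have at least one associated comment.
COMMENTED_ROWS_SQL = """
SELECT r.id, r.title
FROM foia_requests AS r
WHERE EXISTS (
    SELECT 1
    FROM datasette_comments.comments AS c
    WHERE c.target_table = 'foia_requests'
      AND c.target_row_id = r.id
)
ORDER BY r.id
"""
commented_rows = conn.execute(COMMENTED_ROWS_SQL).fetchall()
```

This only works when the comments tables are reachable from the same connection as the data, which is the argument for not hiding them in `_internal` by default.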
But yes, I'm a big +1 on this whole plan.
@simonw what do you think about adding a `DATASETTE_INTERNAL_DB_PATH` env variable that, when defined, is the default location of the internal DB? This means when the `--internal` flag is NOT provided, Datasette would check to see if `DATASETTE_INTERNAL_DB_PATH` exists, and if so, use that as the internal database (otherwise falling back to an ephemeral in-memory database).
My rationale: some plugins may require, or strongly encourage, a persistent internal database (`datasette-comments`, `datasette-bookmarks`, `datasette-link-shortener`, etc.). However, for users that have a global installation of Datasette (say from `brew install` or a global `pip install`), it would be annoying to have to specify `--internal` every time. So instead, they can just add `export DATASETTE_INTERNAL_DB_PATH="/path/to/internal.db"` to their bashrc/zshrc/wherever and not have to worry about `--internal`.
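The lookup order suggested above could be sketched like this. The function name `resolve_internal_db_path` is hypothetical and this is not how Datasette currently behaves; it only illustrates the proposed precedence (flag first, then environment variable, then in-memory).

```python
import os
from typing import Mapping, Optional

ENV_VAR = "DATASETTE_INTERNAL_DB_PATH"


def resolve_internal_db_path(
    cli_internal: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the internal database path, or None for an in-memory database.

    Precedence (as proposed in this comment, not an existing feature):
    1. the value passed to --internal, if any
    2. the DATASETTE_INTERNAL_DB_PATH environment variable, if set
    3. None, meaning an ephemeral in-memory database
    """
    if cli_internal:
        return cli_internal
    env_path = environ.get(ENV_VAR)
    if env_path:
        return env_path
    return None
```

For example, `resolve_internal_db_path(None, {"DATASETTE_INTERNAL_DB_PATH": "/path/to/internal.db"})` would return the path from the environment, while an explicit `--internal` value always wins.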
The current `_internal` database is used by Datasette core to cache info about the databases, tables, columns, and foreign keys in a Datasette instance. It's a temporary database created at startup that can only be seen by the root user. See an example `_internal` DB here, after logging in as root.

The current `_internal` database has a few rough edges:

- It is included in `datasette.databases`, so many plugins have to specifically exclude `_internal` from their queries (examples here).

Additionally, it would be really nice if plugins could use this `_internal` database to store their own configuration, secrets, and settings. For example:

- `datasette-auth-tokens` creates a `_datasette_auth_tokens` table to store auth token metadata. This could be moved into the `_internal` database to avoid writing to the guest database.
- `datasette-socrata` creates a `socrata_imports` table, which can also be in `_internal`.
- `datasette-upload-csvs` creates a `_csv_progress_` table, which can be in `_internal`.
- `datasette-write-ui` wants users to be able to toggle whether a table appears editable, which can be done either in `datasette.yaml` or on the fly by storing config in `_internal`.
In general, these are specific features that Datasette plugins would have access to if there was a central internal database they could read/write to:
- Configuring plugins through the `datasette.yaml` file works, but restarting the server for every change can be tedious. Plugins can define their own configuration table in `_internal` and read/write to it to store configuration based on user actions (cell menu click, API access, etc.).
- Plugins can cache data in `_internal` (possibly as a temporary table) instead of managing their own caching solution.
- Plugins can log activity to `_internal` for others to audit later.
- Some plugins (`datasette-upload-csvs`, `datasette-litestream`, `datasette-socrata`) perform tasks that run for a really long time and want to give continuous status updates to the user. They can store this info inside `_internal`.
_internal
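As a rough illustration of the configuration-table idea, here is a sketch using Python's `sqlite3` directly, with an in-memory connection standing in for the internal database. The table name `plugin_table_config` and the helper functions are hypothetical; a real plugin would go through whatever API Datasette ends up exposing for the internal database.

```python
import sqlite3


def ensure_config_table(internal: sqlite3.Connection) -> None:
    """Create the (hypothetical) per-table config table if it is missing."""
    internal.execute(
        """
        CREATE TABLE IF NOT EXISTS plugin_table_config (
            database_name TEXT NOT NULL,
            table_name TEXT NOT NULL,
            editable INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (database_name, table_name)
        )
        """
    )
    internal.commit()


def set_editable(internal, database_name: str, table_name: str, editable: bool) -> None:
    """Record a user's toggle without touching datasette.yaml or restarting."""
    internal.execute(
        "INSERT OR REPLACE INTO plugin_table_config "
        "(database_name, table_name, editable) VALUES (?, ?, ?)",
        (database_name, table_name, int(editable)),
    )
    internal.commit()


def is_editable(internal, database_name: str, table_name: str) -> bool:
    """Read the stored setting, defaulting to not editable."""
    row = internal.execute(
        "SELECT editable FROM plugin_table_config "
        "WHERE database_name = ? AND table_name = ?",
        (database_name, table_name),
    ).fetchone()
    return bool(row[0]) if row else False


# In-memory connection standing in for the internal database.
internal = sqlite3.connect(":memory:")
ensure_config_table(internal)
set_editable(internal, "fixtures", "foia_requests", True)
```

The same pattern (one small table owned by the plugin, created on demand) would also fit progress tracking for long-running tasks.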
Proposal
- Remove `_internal` from the `datasette.databases` property.
- Add a `datasette.get_internal_db()` method that returns the `_internal` database, for plugins to use.
- Add an `--internal internal.db` flag. If provided, the `_internal` DB will be sourced from that file, and further updates will be persisted to that file (instead of an in-memory database).
- Add a `_datasette_internal` table to mark it as a "datasette internal database".
- In `datasette serve`, we check for the existence of the `_datasette_internal` table. If it exists, we assume the user provided that file in error and raise an error. This is to limit the chance that someone accidentally publishes their internal database to the internet. We could optionally add an `--unsafe-allow-internal` flag (or database plugin) that allows someone to do this if they really want to.

New features unlocked with this

These features don't really need a standardized `_internal` table per se (plugins could currently configure their own long-term storage features if they really wanted to), but it would make it much simpler to create these kinds of features with a persistent application database.

- `datasette-comments`: A plugin for commenting on rows or specific values in a database. Comment contents + threads + email notification info can be stored in `_internal`.
_internal
, or a URL link shortener_internal
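The `_datasette_internal` marker-table check from the proposal could be sketched as follows, again with plain `sqlite3`. The function names, the exception class, the marker table's single column, and the `allow_internal` parameter (standing in for the suggested `--unsafe-allow-internal` flag) are all assumptions for illustration, not an existing Datasette API.

```python
import sqlite3

MARKER_TABLE = "_datasette_internal"


class InternalDatabaseError(Exception):
    """Raised when an internal database is about to be served by mistake."""


def mark_as_internal(conn: sqlite3.Connection) -> None:
    """Create the marker table that flags a file as an internal database."""
    conn.execute(f"CREATE TABLE IF NOT EXISTS {MARKER_TABLE} (id INTEGER PRIMARY KEY)")
    conn.commit()


def is_internal_database(conn: sqlite3.Connection) -> bool:
    """Return True if the marker table exists in this database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (MARKER_TABLE,),
    ).fetchone()
    return row is not None


def check_served_database(conn: sqlite3.Connection, allow_internal: bool = False) -> None:
    """Refuse to serve a marked database unless explicitly allowed."""
    if is_internal_database(conn) and not allow_internal:
        raise InternalDatabaseError(
            "This file looks like a Datasette internal database; refusing to serve it."
        )
```

Checking `sqlite_master` keeps the guard cheap: it needs no knowledge of what plugins have stored, only whether the marker table is present.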