datasette / datasette-acl

Advanced permission management for Datasette
Apache License 2.0
1 stars 0 forks source link

Initial plugin design #1

Closed simonw closed 2 weeks ago

simonw commented 2 weeks ago

Goal is to support:

For the moment I'm going to ignore the execute-sql loophole, where users who can execute SQL can view tables that they do not otherwise have permission to view. We could potentially close that hole later with a mechanism based around SQLite authorizers.

simonw commented 2 weeks ago

The permissions to consider for tables are:

I got Claude to build this UI prototype: https://claude.site/artifacts/0d04e81a-c146-4716-9fad-e0f9aa9982dc

CleanShot 2024-08-29 at 14 58 05@2x

Transcript: https://gist.github.com/simonw/cdeace315f0a8e0bd4b03a845b299cee

simonw commented 2 weeks ago

Also considered if bitmap columns would make sense in SQLite, and got Claude to knock out this demo that uses those to store permissions but can still edit and display them using JSON arrays of descriptive strings:

-- Create a users table with a permissions column
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    permissions INTEGER NOT NULL DEFAULT 0
);

-- Create a permissions lookup table
CREATE TABLE permission_lookup (
    name TEXT PRIMARY KEY,
    value INTEGER
);

-- Insert permission values
INSERT INTO permission_lookup (name, value) VALUES
    ('read', 1),
    ('write', 2),
    ('execute', 4);

-- Insert users with permissions specified as strings
INSERT INTO users (username, permissions)
SELECT 
    'david',
    (SELECT SUM(value) 
     FROM permission_lookup 
     WHERE name IN ('read', 'execute'))
UNION ALL
SELECT 
    'emma',
    (SELECT SUM(value) 
     FROM permission_lookup 
     WHERE name IN ('write', 'execute'))
UNION ALL
SELECT 
    'frank',
    (SELECT SUM(value) 
     FROM permission_lookup 
     WHERE name IN ('read', 'write', 'execute'));

-- Query to show the results
SELECT 
    u.username,
    u.permissions,
    (SELECT json_group_array(pl.name)
     FROM permission_lookup pl
     WHERE (u.permissions & pl.value) != 0) AS permissions_array
FROM users u;
simonw commented 2 weeks ago

Part of this work may inform changes to Datasette core - in particular the need for an efficient way to say "list all users who can do this thing" and "list all tables this user can view". Those may require new plugin hooks which this plugin could then implement.

simonw commented 2 weeks ago

Also relevant: this Claude conversation a while ago about ACLs as a SQL schema: https://gist.github.com/simonw/20b2e8c4d9d9d8d6dee327c221e57205

That one ended up with this schema:

-- Existing tables (simplified for brevity)
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT NOT NULL UNIQUE
);

CREATE TABLE resources (
    resource_id INTEGER PRIMARY KEY AUTOINCREMENT,
    resource_name TEXT NOT NULL UNIQUE
);

CREATE TABLE permissions (
    permission_id INTEGER PRIMARY KEY AUTOINCREMENT,
    permission_name TEXT NOT NULL UNIQUE
);

-- New table for groups
CREATE TABLE groups (
    group_id INTEGER PRIMARY KEY AUTOINCREMENT,
    group_name TEXT NOT NULL UNIQUE
);

-- New table for user-group relationships
CREATE TABLE user_groups (
    user_id INTEGER,
    group_id INTEGER,
    PRIMARY KEY (user_id, group_id),
    FOREIGN KEY (user_id) REFERENCES users(user_id),
    FOREIGN KEY (group_id) REFERENCES groups(group_id)
);

-- Modified ACL table to include group_id
CREATE TABLE acl (
    acl_id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER,
    group_id INTEGER,
    resource_id INTEGER,
    permission_id INTEGER,
    FOREIGN KEY (user_id) REFERENCES users(user_id),
    FOREIGN KEY (group_id) REFERENCES groups(group_id),
    FOREIGN KEY (resource_id) REFERENCES resources(resource_id),
    FOREIGN KEY (permission_id) REFERENCES permissions(permission_id),
    CHECK ((user_id IS NULL) != (group_id IS NULL)),
    UNIQUE(user_id, group_id, resource_id, permission_id)
);
simonw commented 2 weeks ago

As a reminder, the relevant pieces of the Datasette internal DB schema currently look like this: https://docs.datasette.io/en/latest/internals.html#internal-database-schema

CREATE TABLE catalog_databases (
    database_name TEXT PRIMARY KEY,
    path TEXT,
    is_memory INTEGER,
    schema_version INTEGER
);
CREATE TABLE catalog_tables (
    database_name TEXT,
    table_name TEXT,
    rootpage INTEGER,
    sql TEXT,
    PRIMARY KEY (database_name, table_name),
    FOREIGN KEY (database_name) REFERENCES databases(database_name)
);
simonw commented 2 weeks ago

Not sure if I should foreign key directly against those, since there isn't currently a guarantee that (one, two) for a permission check is actually (database, table) - consider views and canned queries and maybe other weird things plugins might do with the system.

Also compound primary keys used as foreign keys are weird, Datasette doesn't know how to render them!

simonw commented 2 weeks ago

I'm tempted to have a ACL table that's just for table permissions, where we know we have a database name and a table name - then separate ACL tables for other types of thing.

For the first version of this I really only need to support permissions on databases and tables I think... though maybe also on canned queries? Should probably cover those too.

simonw commented 2 weeks ago

For the very first version of this just tables would be fine.

simonw commented 2 weeks ago

OK, MVP for this is going to be the ability to assign table permissions to specific users (no groups, nothing other than tables). Accessed via table actions menu. SQL schema as close to the final version as possible.

simonw commented 2 weeks ago

Could the table look like this?

CREATE TABLE acl (
    acl_id INTEGER PRIMARY KEY AUTOINCREMENT,
    actor_id TEXT,
    group_id INTEGER,
    resource_id INTEGER,
    permission_id INTEGER,
    FOREIGN KEY (group_id) REFERENCES groups(group_id),
    FOREIGN KEY (resource_id) REFERENCES resources(resource_id),
    FOREIGN KEY (permission_id) REFERENCES permissions(permission_id),
    CHECK ((actor_id IS NULL) != (group_id IS NULL)),
    UNIQUE(actor_id, group_id, resource_id, permission_id)
);

Then I insert rows to grant people permissions, and that's the whole thing.

simonw commented 2 weeks ago

Schema, using Datasette terminology action_id instead of permission_id and with a acl_resources table that can then key against database and optionally resource.

CREATE TABLE acl_resources (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    database TEXT NOT NULL,
    resource TEXT
);

CREATE TABLE acl_actions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE
);

-- New table for groups
CREATE TABLE acl_groups (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE
);

-- New table for user-group relationships
CREATE TABLE acl_user_groups (
    user_id TEXT,
    group_id INTEGER,
    PRIMARY KEY (user_id, group_id),
    FOREIGN KEY (group_id) REFERENCES groups(id)
);

CREATE TABLE acl (
    acl_id INTEGER PRIMARY KEY AUTOINCREMENT,
    actor_id TEXT,
    group_id INTEGER,
    resource_id INTEGER,
    action_id INTEGER,
    FOREIGN KEY (group_id) REFERENCES acl_groups(id),
    FOREIGN KEY (resource_id) REFERENCES acl_resources(id),
    FOREIGN KEY (action_id) REFERENCES acl_actions(id),
    CHECK ((actor_id IS NULL) != (group_id IS NULL)),
    UNIQUE(actor_id, group_id, resource_id, action_id)
);
simonw commented 2 weeks ago

Got Claude to write me this query (I asked for it with subselects and CTEs instead of joins for readability):

with user_groups as (
  select group_id
  from acl_user_groups
  where user_id = 'simon'
),
target_resource as (
  select id
  from acl_resources
  where database = 'fixtures' and resource = 'facetable'
),
target_action as (
  select id
  from acl_actions
  where name = 'insert-row'
),
user_permissions as (
    select resource_id, action_id
    from acl
    where actor_id = 'simon'
  union
    select resource_id, action_id
    from acl
    where group_id in (select group_id from user_groups)
)
select count(*)
  from user_permissions
  where resource_id = (select id from target_resource)
  and action_id = (select id from target_action)
simonw commented 2 weeks ago

CleanShot 2024-08-29 at 22 13 47@2x

The use of covering indexes looks good there. The scan acl is worth keeping an eye on if that table gets really huge, but the good thing about user group support is it should help keep that table small too.

simonw commented 2 weeks ago

This project may be the kick I need to say that Datasette actors must have a unque id field.

simonw commented 2 weeks ago

This prototype works! I set my database up with this content:

BEGIN TRANSACTION;
CREATE TABLE acl_resources (
    id integer primary key autoincrement,
    database text not null,
    resource text
);
INSERT INTO acl_resources VALUES(1,'fixtures','facetable');
CREATE TABLE acl_actions (
    id integer primary key autoincrement,
    name text not null unique
);
INSERT INTO acl_actions VALUES(1,'view-instance');
INSERT INTO acl_actions VALUES(2,'view-database');
INSERT INTO acl_actions VALUES(3,'view-database-download');
INSERT INTO acl_actions VALUES(4,'view-table');
INSERT INTO acl_actions VALUES(5,'view-query');
INSERT INTO acl_actions VALUES(6,'execute-sql');
INSERT INTO acl_actions VALUES(7,'permissions-debug');
INSERT INTO acl_actions VALUES(8,'debug-menu');
INSERT INTO acl_actions VALUES(9,'insert-row');
INSERT INTO acl_actions VALUES(10,'delete-row');
INSERT INTO acl_actions VALUES(11,'update-row');
INSERT INTO acl_actions VALUES(12,'create-table');
INSERT INTO acl_actions VALUES(13,'alter-table');
INSERT INTO acl_actions VALUES(14,'drop-table');
CREATE TABLE acl_groups (
    id integer primary key autoincrement,
    name text not null unique
);
CREATE TABLE acl_actor_groups (
    actor_id text,
    group_id integer,
    primary key (actor_id, group_id),
    foreign key (group_id) references groups(id)
);
CREATE TABLE acl (
    acl_id integer primary key autoincrement,
    actor_id text,
    group_id integer,
    resource_id integer,
    action_id integer,
    foreign key (group_id) references acl_groups(id),
    foreign key (resource_id) references acl_resources(id),
    foreign key (action_id) references acl_actions(id),
    check ((actor_id is null) != (group_id is null)),
    unique(actor_id, group_id, resource_id, action_id)
);
INSERT INTO acl VALUES(1,'simon',NULL,1,9);

Then when I used https://github.com/datasette/datasette-unsafe-actor-debug to sign in as user simon I got the insert-row button at the bottom of fixtures/facetable using the datasette-write-ui plugin - which vanished when I removed that acl record.

simonw commented 2 weeks ago

Next steps:

Open question: should there be a way for us to define things with this tool that affect e.g. {"is_admin": true} style actors? Might be possible to do that with a special kind of group, where the group has a JSON definition on it - not sure how that would work though.

simonw commented 2 weeks ago

If there's no special mechanism for matching JSON on actors we could instead have code that dynamically adds or removes the actor from a group based on their JSON, matched against something in config? Then you could configure the plugin like this:

plugins:
  datasette-acl:
    dynamic-groups:
      admin:
        is_admin: true

Now any time we're about to run a permission check on an actor we first confirm that they are a member or NOT a member of the admin group based on matching them against that rule.

We can use actor_matches_allow(actor, allow) for that: https://github.com/simonw/datasette/blob/dc288056b81a3635bdb02a6d0121887db2720e5e/datasette/utils/__init__.py#L1025C5-L1026C1

simonw commented 2 weeks ago

http://127.0.0.1:8002/-/write?database=internal&sql=insert+into+acl_actor_groups+%28actor_id%2C+group_id%29+values+%28%3Aactor_id%2C+%28select+id+from+acl_groups+where+name+%3D+%3Agroup%29%29 is useful for inserting rows.

simonw commented 2 weeks ago

Here's a query that takes an expected list of groups and identifies which groups in the DB need to be added/removed:

with expected_groups as (
  select value as group_name
  from json_each(:expected_groups_json)
),
actual_groups as (
  select g.name as group_name
  from acl_groups g
  join acl_actor_groups ug on g.id = ug.group_id
  where ug.actor_id = :actor_id
)
select 
  'should-add' as status,
  eg.group_name
from expected_groups eg
where eg.group_name not in (select group_name from actual_groups)
  union all
select
  'should-remove' as status,
  ag.group_name
from actual_groups ag
where ag.group_name not in (select group_name from expected_groups)
  union all
select
  'current' as status,
  group_name
from actual_groups

Example (simon is in sales and staff, we tell the query that we would like him to be in ["sales", "bob", "cat"]]):

CleanShot 2024-08-29 at 22 45 31@2x

I could use this to implement dynamic groups, where first the group rules from the configuration are used to determine the groups the user should be in, then we use this query to figure out what we should change.

simonw commented 2 weeks ago

Now that I have dynamic groups I won’t bother with the group management UI for the first version.

simonw commented 2 weeks ago

I'm going to populate the README and ship this as an alpha.

simonw commented 2 weeks ago

Now on PyPI: https://pypi.org/project/datasette-acl/