os-climate / data-platform-demo

Apache License 2.0

Trino Vault demo needs to show row-level transparency/blocking based on masks #36

Open MichaelTiemannOSC opened 2 years ago

MichaelTiemannOSC commented 2 years ago

Table-level access permissions are part of the standard Trino configuration definitions. Users without table-level access cannot see any rows or columns in a table, regardless of any row-level or column-level permissions they have been granted.

For tables that are visible to users, the Trino Vault demo (https://github.com/os-climate/data-platform-demo/blob/master/notebooks/trino-data-vault-demo.ipynb) demonstrates how column-level permissions can be granted and/or enforced.

A next step is to demonstrate granting and enforcing row-level permissions. The row-level permissions include:

Imagine a data provider maintains a table with 3500 rows. For those with "customer" credentials, all rows are marked "transparent". For those with "evaluation" credentials, 15 rows are marked "transparent", 285 rows are marked "default", and 3200 rows are marked "blocked". For those with "public" credentials, 15 rows are marked "default" and 3485 rows are marked "blocked". The public user can then evaluate applying their own column-level visibility schemes using 15 rows of data from the provider. These numbers are for example purposes only, and may be changed to make the example more representative of consensus.
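To make the arithmetic in this example concrete, here is a minimal Python sketch of the three credential classes. It is not the demo notebook's implementation and does not talk to Trino. The names `MARKINGS`, `build_mask`, and `visible_rows` are hypothetical, and it assumes that only "blocked" hides a row, while "transparent" and "default" rows both stay visible (with "default" rows presumably still subject to column-level rules).

```python
TOTAL_ROWS = 3500

# Hypothetical per-credential row markings, using the counts from the example above.
MARKINGS = {
    "customer": {"transparent": 3500},
    "evaluation": {"transparent": 15, "default": 285, "blocked": 3200},
    "public": {"default": 15, "blocked": 3485},
}


def build_mask(counts):
    """Expand {marking: count} into one marking per row (row order is arbitrary here)."""
    mask = []
    for marking, n in counts.items():
        mask.extend([marking] * n)
    return mask


def visible_rows(rows, mask):
    """Assumption: a row is hidden only when its marking is 'blocked'."""
    return [row for row, marking in zip(rows, mask) if marking != "blocked"]


rows = list(range(TOTAL_ROWS))  # stand-in for the provider's 3500 rows
masks = {cred: build_mask(counts) for cred, counts in MARKINGS.items()}

for cred, mask in masks.items():
    # Every credential's mask must cover the whole table exactly once.
    assert len(mask) == TOTAL_ROWS
    print(cred, len(visible_rows(rows, mask)))
```

Under these assumptions the customer sees all 3500 rows, the evaluation user sees 300 (15 + 285), and the public user sees 15.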

MichaelTiemannOSC commented 2 years ago

We talked this morning about making the demo use a special Trino instance that's not connected to osc_datacommons_dev. As we migrate to that instance, it would be good to create roles for provider1, provider2, and provider3 so that we can model how data providers make their data available and define who has access to what. Then the dev, quant, and user profiles can be used to test what can be accessed from each of these providers (individually, as part of a waterfall, etc.).

erikerlandson commented 2 years ago

Do you think it will work to use our 3 demo-users for both providers and dev/quant/public?

MichaelTiemannOSC commented 2 years ago

I think it's fine for our three users to be dev, quant, and public respectively.

We need three more users allocated: provider1, provider2, and provider3.

erikerlandson commented 2 years ago

I have created 3 more demo users, so we now have 6 GitHub demo users: os-climate-user1 through os-climate-user6.