Permissions issues when creating dynamic table that selects across databases

apd-bbaker commented 1 month ago

Describe the bug administratorb_role has ownership of both db_aowner_d_role and db_b__owner_d_role. The role permissions do not seem to bubble up to the b_role level when it comes to dynamic tables.

When creating a dynamic table in db_a that selects from a table in db_b, an error is thrown that there's insufficient access to the database to create the dynamic table. After some testing, it seems that the db_aownerd_role doesn't have necessary permissions to access objects owned by db_aownerd_role.

After granting the access manually, the next error that occurs is on the warehouse with a similar permissions issue. Manually granting access does finally allow the dynamic table to be created.

Expected behavior SnowDDL automatically grants necessary access to necessary objects for dynamic tables' unique permission structure of ownership and warehouse dependency. This may be (probably) an additional config item that I overlooked.

Attach log N/A

Attach YAML config (if applicable) N/A

littleK0i commented 1 month ago

Dynamic tables are running with object owner privileges. It is how Snowflake designed it.

Take a look at large block of owner_* grant parameters in SCHEMA config: https://docs.snowddl.com/basic/yaml-configs/schema

owner_schema_read - list schemas which should be available to "owner" role of this schema;
owner_warehouse_usage - list warehouses which should available to "owner" role of this schema;

We considered automated grants at some point, but decided against it. Currently it is not possible to reliably parse SQL and identify all dependent objects. Also, it is possible to query dynamically named objects via FUNCTION or PROCEDURE.

apd-bbaker commented 4 weeks ago

@littleK0i We're using database owner permissions model rather than schema due to the use of some replication platforms and dbt that need to be able to create and drop their own schemas. I believe database level is the correct way to approach this, but let me know if I've misunderstood the intent behind that permission model.

The custom permission model we have in place that has been working well with Fivetran + dbt is below. Since we're following a medallion data lakehouse architecture with bronze/silver/gold, we need the database owners of higher layers to be able to have full read access to the layers below it to transform the data using dynamic tables.

What I think I need is a owner_database_read option on the database config that allows me to grant additional permission to the lower databases as the database owner of a higher layer. I tried doing this with owner_global_roles, but since that's assumed to be unmanaged by SnowDDL it doesn't allow me to parameterize the database owner role names with ${env_prefix}.

Thoughts on how to go about this?

custom_permission_model:
  ruleset: database_owner

  owner_create_grants:
    - alert
    - dynamic_table
    - event_table
    - external_table
    - file_format
    - function
    - materialized_view
    - pipe
    - procedure
    - sequence
    - stage
    - stream
    - table
    - task
    - view

  owner_future_grants:
    alert: [ownership]
    dynamic_table: [ownership]
    event_table: [ownership]
    external_table: [ownership]
    file_format: [ownership]
    function: [ownership]
    materialized_view: [ownership]
    pipe: [ownership]
    procedure: [ownership]
    sequence: [ownership]
    stage: [ownership]
    stream: [ownership]
    table: [ownership]
    task: [ownership]
    view: [ownership]

  read_future_grants:
    dynamic_table: [select]
    external_table: [select, references]
    file_format: [usage]
    function: [usage]
    materialized_view: [select, references]
    procedure: [usage]
    sequence: [usage]
    stage: [read, usage]
    stream: [select]
    table: [select, references]
    view: [select, references]

littleK0i commented 4 weeks ago

@apd-bbaker , the tool originally comes from a place where ALL objects are defined in config, and entire DEV environment could be fully re-created from scratch: every database, every schema, every table. Naturally, it makes schema-level permissions and roles the best choice.

Unfortunately, popular tools like Fivetran / Airbyte / DBT are trying to create their own schemas. It is a terrible approach, which is going it change in the next few years, I guess. But currently it exists nonetheless.

So the current intention is to use schema_owner permission models for everything we can control. And database_owner permission models from external tools with hardcoded CREATE SCHEMA.

On the other hand... from technical perspective there is nothing stopping us from granting database_read roles to database_owner roles of other database. I'll consider adding this option.

The only issue is... maybe such grant would be too permissive. Not sure if you really want to grant access from "gold" layer to literally everything in "silver" layer. Maybe doing it on per-schema basis would be better and safer at the end of the day.

What you can do now with the current version.

Define all known schemas in config by creating empty directory with params.yaml file. It would lead to creation of "schema roles" in addition to "database roles".
Use owner_schema_read to grant read access from specific schemas to specific schemas only. If schema is defined in config, it should work.

For example, financial "silver" layer can have access only to Fivetran schemas FIVETRAN, BRAINTREE, REVENUECAT. Marketing silver layer can have access only to Fivetran FACEBOOK, GOOGLE_ADS. Etc.

Give it a try, maybe it would be sufficient.

Dropping schemas in SnowDDL controlled databases by anything other than SnowDDL is currently not supported, and it would be difficulty to support.

In general, it should be possible to pre-create all schemas required for all transformation layers and avoid having temporary schemas altogether. It improves user experience, simplifies investigations and permissions.

littleK0i commented 3 weeks ago

@apd-bbaker , parameters owner_database_read and owner_database_write are now available (starting from 0.34.2).

It works only if both database have permission model with database_owner ruleset.

apd-bbaker commented 3 weeks ago

@littleK0i Awesome, thank you! I’ll backlog this to test it out.

littleK0i / SnowDDL

Permissions issues when creating dynamic table that selects across databases #134