dbt-labs / dbt-snowflake

dbt-snowflake contains all of the code enabling dbt to work with Snowflake
https://getdbt.com
Apache License 2.0
280 stars 172 forks source link

[ADAP-742] [Feature] Grants should handle database and schema level access. #715

Open jaysobel opened 1 year ago

jaysobel commented 1 year ago

Is this your first time submitting a feature request?

Describe the feature

The dbt grants config presents as a declarative means to grant select on the output of a model. However, it does not cover database and schema level grants, resulting in incomplete behavior, relative to the current documentation:

You can use the grants field to set permissions or grants for a resource. When you run a model, seed or seed, or snapshot a snapshot, dbt will run grant and/or revoke statements to ensure that the permissions on the database object match the grants you have configured on the resource.

Currently dbt can write a model into a schema and grant to a role when the role does not have usage on the schema, resulting in a 'successful run' that cannot actually be used as described. The docs above are true in only a narrow sense that the "permissions on the database object match the grants", but not the practical sense that the role can now use the database object.

As an example,

{{
    config(
        materialized = 'table',
        schema = 'int',
        grants = {'select': ['role_a', 'role_b']}
    )
}}

select ...

If role_b does not have usage on the int schema, then they will not be able to select from the table. dbt will have granted select access on the table object, but not usage on the schema object.

use role role_b;
select * from int.int_user_orders;
---> Schema 'DEV.INT' does not exist or not authorized.

This feels like an unexpected outcome given the declarative(-ish?) nature of other configs.

This issue suggests that if a model is configured with grants = {'select': ['role_b']} then role_b should be able to select from it after it is run, regardless of the database and schema, and presumably by granting usage on both resources to the role.

use role role_a;
select * from int.int_user_orders;
---> return data

Fun fact: "schemas" and "schemata" are both valid plurals of schema.

Describe alternatives you've considered

A job for grant select on future tables ?

Covered in the Appendix here.

grant select on future tables in <database> can only be invoked by SECURITYADMIN and ACCOUNTADMIN roles (docs) or delegated as manage grants, but it's a powerful global privilege (the role can grant anything to itself). It's unclear why "future stuff in one schema" is the same bucket as "current stuff, everywhere".

grant usage on schema is more than just table access!

Yes, and maybe there should be some call-out, or additional opt-in flag to make it clear that schema usage is being granted.

From my experience struggling with grants, and what I've seen in a few large projects, a common reaction to insufficient privilege errors is to sling more and broader grants around 'until it works'. In practice, this feature might have the opposite effect and actually improve the average level of grant specificity by averting rage-grants.

Current art

dbt docs Blog: Updating our permissioning guidelines: grants as configs in dbt Core v1.2

This issue basically suggests Option B happen automatically.

Revoking Usage?

The current feature grants and revokes. If the feature is extended to grant usage will it also revoke usage from schemas, for example, when a role has no further objects within the schema from which it can select?

Make this Snowflake's Problem

As Michael Urrutia points out in this Slack thread this is kind of Snowflake's fault for grant select on db.schema.table not doing what it says on the label (and creating yet another opportunity for dbt to be a superior abstraction).

I will write a letter to the North Pole suggesting something from the Snowfolk; perhaps a more nuanced form of manage grants that allows the owner of a schema to grant future select within said schema?

Who will this benefit?

Primary applications are not mission critical (as far as I know). Some examples:

  1. Writing models into novel schemas without worrying about grants! dbt will make a novel schema for you for free. why not grant usage on it too?
  2. Advanced cases of the above - like CI/CD for such changes. There may be parallels here to dbt clone in 1.6

More broadly, Snowflake's database and schema concepts feel like a filing systems with just two level, like the fixed feature_1/2/3 columns of Activity Schema™️. dbt bakes in plenty of reliance on this addressing system. How many steps are left before a schema and a database are dbt things? Adding +schema and +grants configs in dbt_project.yml folder sections feels dangerously close to just treating these things as things. And this is dbt-snowflake after all.

I'm interested in hearing others' takes on whether dbt should do more Snowflake-y thing in general. Grants seem like a step along a path. At this point, my thinking is that because it is unavoidable, it should at least work well.

Are you interested in contributing this feature?

probably not

Anything else?

No response

dataders commented 1 year ago

two questions

  1. is usage on the schema required for all model-level privileges? or just SELECT?
  2. does my restatement of desired behavior make sense to you? perhaps there's a more optimal order of operations?

usecase

if a dbt model is defined with the following grants configs

{{
    config(
        grants = {'select': ['role_a']}
    )
}}
select ...

current behavior

  1. checks if role_a has SELECT privilege on model
  2. if it doesn't already have the privilege
    1. grants SELECT to role_a

desired behavior

  1. checks if role_a has SELECT privilege on model
  2. if not
    1. check if role_a has USAGE privilege on schema
    2. if not grant USAGE to role_a on schema
    3. grant SELECT to role_a a on model

potential solution

dbt-snowflake currently relies directly on the grants logic implemented in dbt-core's global project via all the default__ macros.

The first place I thought where this new logic could be injected is within default__get_dcl_statement_list. We might add a snowflake__get_dcl_statement_list macro to this adapter and modify the logic so that when privilege = "SELECT", an additional get_dcl_macro call can be added and appended to list of grant statements to be executed, namely get_dcl_macro(relation.schema, 'USAGE', grantee).

jaysobel commented 1 year ago

1: I'm not an expert, but I believe yes.

2: Maybe the indentation is messed up, but I believe you can be in a state where you have select but not usage which I don't see in the flow.

dataders commented 1 year ago

just came across https://github.com/dbt-labs/dbt-snowflake/issues/527#issuecomment-1476862791 -- @jaysobel any instinct on what should be done in the case of managed schemas?

jaysobel commented 1 year ago

just came across #527 (comment) -- @jaysobel any instinct on what should be done in the case of managed schemas?

@dataders No intuition

sfc-gh-afedorov commented 6 months ago

a more nuanced form of manage grants that allows the owner of a schema to grant future select within said schema

iirc this is called discretionary access control and is possible (and my preferred way to do future grants) if you create the schema with managed access. the security / account admins can do it on any schema in the account but it's not immediately obvious how to make that work at scale.

sfc-gh-afedorov commented 6 months ago

the way we handle it is to ignore any grant requests on managed schemas (details in #527). in the native implementation, a dbt warning or error both seems appropriate, but with multiple deployments with different RBAC models it would be nice to have a way to customize what role names are present in a given target, e.g. via a generate_role_name which can return None to suppress the grant.

github-actions[bot] commented 2 weeks ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

jeff-skoldberg-gmds commented 2 weeks ago

@dataders , I'm commenting to keep this from auto-closing. I'm actually surprised dbt doesn't already do this.

jeff-skoldberg-gmds commented 2 weeks ago

I think the fix is simpler than what was described above. At the end of the run, dbt can inspect the manifest come up with a list of roles + schemas required. Then just grant usage on those schemas to those roles. There's no harm in repeating the usage grant on every run.

Either way you have to do something on every run. Either you are checking existing grants and doing "if else" logic... or, you just repeat the grant in a post-hook-like fashion.

Implement it how you see best, but in order for "dbt grants" to be the single place we manage grants, we need this!

(The same problem exists in other adaptors that I've used; Redshift for example)