dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[CT-1346] [Feature] Extend grants config to include usage grants #6063

Closed AlexFrid closed 1 year ago

AlexFrid commented 2 years ago

Is this your first time submitting a feature request?

Describe the feature

#5189 and #5263 introduced grants as node configs.

This feature request seeks to extend the grants config to include usage grants as well.

Currently, we can do this:

models:
  +grants:
    select: ['reporter', 'bi']

The extended version could look very similar:

models:
  +grants:
    select: ['reporter', 'bi']
    warehouse: ['transform']
    database: ['prod']
    schema: ['analytics']

or

models:
  +grants:
    select: ['reporter', 'bi']
    usage:
      - warehouse: ['transform']
      - database: ['prod']
      - schema: ['analytics']
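The two proposed shapes carry the same information, which a short Python sketch can make concrete. This is only an illustration, not dbt's implementation: `flatten_usage` is a hypothetical helper, and it assumes the YAML has already been parsed into plain dicts and lists.

```python
from typing import Any, Dict, List


def flatten_usage(grants: Dict[str, Any]) -> Dict[str, List[str]]:
    """Convert the nested `usage:` shape into the flat shape (hypothetical helper)."""
    flat: Dict[str, List[str]] = {}
    for key, value in grants.items():
        if key == "usage":
            # Each list entry is a single-key mapping such as {"schema": [...]}.
            for entry in value:
                for object_type, names in entry.items():
                    flat[object_type] = list(names)
        else:
            # Keys such as `select` are copied through unchanged.
            flat[key] = list(value)
    return flat


# The nested proposal from above, as it would look after YAML parsing.
nested = {
    "select": ["reporter", "bi"],
    "usage": [
        {"warehouse": ["transform"]},
        {"database": ["prod"]},
        {"schema": ["analytics"]},
    ],
}
flat = flatten_usage(nested)
```

Here `flat` has the same keys and values as the first proposed shape, so the choice between the two is mostly about readability and about whether a `usage` key should group the object types.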

Describe alternatives you've considered

The currently recommended way to apply usage grants is an on-run-end hook. Example:

on-run-end:
  - "{% for schema in schemas %}grant usage on schema {{ schema }} to group reporter; {% endfor %}"

A similar approach is also used in the dbt Labs partner engineering team demos.
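To show what the hook above produces, here is a minimal Python sketch of the same loop. In the on-run-end context, `schemas` is the list of schemas dbt built into during the run. The function name and the example schema names are hypothetical, and the `to group` syntax is copied from the hook as written rather than checked against any particular adapter.

```python
from typing import Iterable, List


def usage_grant_statements(schemas: Iterable[str], grantee: str = "reporter") -> List[str]:
    """Build one usage grant per schema, mirroring the Jinja for-loop in the hook."""
    return [f"grant usage on schema {schema} to group {grantee};" for schema in schemas]


# Hypothetical schema names standing in for the `schemas` context variable.
statements = usage_grant_statements(["analytics", "staging"])
for statement in statements:
    print(statement)
```

Because the hook runs after every invocation, the grants are reissued for each schema every time, whether or not anything changed.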

Who will this benefit?

This will benefit the same people who appreciate grants as node configs, and it pushes the envelope further in terms of how dbt can enable people to configure grants.

As select grants are often used in conjunction with usage grants, this will allow people to consolidate their workflow to fully use the grants config for standard things, instead of having it fragmented between the grants config for select grants and on-run-end for usage grants.

The dbt Labs partner engineering team noticed that this would be useful as we updated our demos and enablement recommendations to use the grants config, but we couldn't fully replace our current macro because we rely on usage grants.

@ernestoongaro has also noticed people asking about this.

Are you interested in contributing this feature?

Yes

Anything else?

No response

amychen1776 commented 2 years ago

I can see why granting usage on the database/schema would be very helpful (especially for blanket configs on sub-directories), but I don't think warehouse usage should be tied into it. Our best practices have the transformers using their own warehouse, while users accessing the objects downstream, like a BI service user, have their own. Also, the concept of a warehouse and permissions at that level are adapter-specific (i.e., Snowflake only at this point).

stevenkoppenol commented 2 years ago

Hi there, let me shed some light on this from a customer's point of view. We are on BigQuery and we need to secure data access at the individual user level. So we have different data marts (schemas) on which we grant roles/bigquery.dataViewer.

The result is... nothing. Because users cannot connect until we give them a schema-level role.

In our point of view, as DBT creates the schemas it is also responsible for granting roles to them. We now fall back to an on-run-end macro just like the alternative mentioned above.

grant `projects/{{ project_id }}/roles/DataPlatformUser` on schema `{{ project_id }}.{{ schema_name }}` to '{{ member }}';

This then results in too wide access for all users, because they can connect and view the structure of all data marts (schemas) including the ones they cannot query.

amychen1776 commented 2 years ago

@stevenkoppenol I'm curious: should this be run on every dbt run, or should there be a separate project or a higher level to manage it? To me, roles should only be granted once rather than continuously, whereas usage might need to be granted multiple times because the object gets dropped.

To me, database management can 100% live in the dbt project, but it is unclear to me at what granularity things should be applied, because some things only need to be run once or twice.

stevenkoppenol commented 2 years ago

Well, every run there is a chance of (a) a new data mart appearing as a result of a branch being merged to main or (b) manually assigned permissions that need to be revoked (exceptional, I admit). So in my view it should run every time, comparing "as is" to "to be" and applying changes as needed.
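The "as is" versus "to be" comparison described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not dbt's implementation: `diff_grants` and the grantee names are hypothetical, and grants are modelled as a mapping from privilege to a set of grantees.

```python
from typing import Dict, Set, Tuple

Grants = Dict[str, Set[str]]  # privilege -> grantees


def diff_grants(current: Grants, desired: Grants) -> Tuple[Grants, Grants]:
    """Return (to_grant, to_revoke) needed to move from `current` to `desired`."""
    to_grant: Grants = {}
    to_revoke: Grants = {}
    for privilege in set(current) | set(desired):
        have = current.get(privilege, set())
        want = desired.get(privilege, set())
        if want - have:
            to_grant[privilege] = want - have   # configured but not yet granted
        if have - want:
            to_revoke[privilege] = have - want  # granted (e.g. manually) but not configured
    return to_grant, to_revoke


# Hypothetical state: "intern" was granted manually, "bi" is newly configured.
as_is = {"usage": {"reporter", "intern"}}
to_be = {"usage": {"reporter", "bi"}}
to_grant, to_revoke = diff_grants(as_is, to_be)
```

With this shape, a run that finds no differences issues no statements at all, which addresses the concern about granting continuously while still catching new data marts and manual changes.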

jtcohen6 commented 2 years ago

@AlexFrid Thanks for opening, and kicking off the discussion!

I agree that this is functionality we should look to add, and I also agree with this statement:

In our point of view, as DBT creates the schemas it is also responsible for granting roles to them.

IMO grants should be configured at the level of the object to be granted on, and we're missing a place in the dbt project to configure schemas. dbt is already in the business of creating schemas if they don't already exist. Today those schemas are defined implicitly, via models, with options for complex env-specific behavior. There should be a way to define and configure them more explicitly, too.

Check out the discussion over in https://github.com/dbt-labs/dbt-core/discussions/5781, which includes some concept art for what a grants config defined on schemas might want to look like. And there's a draft PR (https://github.com/dbt-labs/dbt-core/pull/5392) that may take us a first step on the way there.

github-actions[bot] commented 1 year ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 1 year ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.