dagster-io / hooli-data-eng-pipelines

Example Dagster Cloud code for the Hooli Data Engineering organization.

Add model groups and tags #73

Closed cnolanminich closed 5 months ago

cnolanminich commented 5 months ago

This PR adds two pieces of useful metadata to hooli:

owners using the dbt group functionality

The PR creates 3 groups for the hooli project:

It applies those groups at the dbt project level and shows that you can override the owner for a specific folder path with the dagster meta property. Note that I moved the views to use @dbt_assets because the deprecated load_assets_from_dbt_project() did not propagate owners properly.
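
To make the override behavior concrete, here is a minimal, self-contained sketch of the precedence described above. It is not the dagster-dbt implementation. It assumes that an explicit `meta.dagster.owners` entry on a node wins over the owner email of the node's dbt group. The group name, model names, and the `resolve_owners` helper are hypothetical and exist only for illustration.

```python
from typing import Any, Mapping, Sequence


def resolve_owners(
    node: Mapping[str, Any],
    groups: Mapping[str, Mapping[str, Any]],
) -> Sequence[str]:
    """Return the owners for one dbt node.

    Assumed precedence (illustrative only): an explicit
    meta.dagster.owners list first, otherwise the owner email
    of the node's dbt group, otherwise no owners.
    """
    explicit = node.get("meta", {}).get("dagster", {}).get("owners")
    if explicit:
        return list(explicit)
    group = groups.get(node.get("group"), {})
    email = group.get("owner", {}).get("email")
    return [email] if email else []


# Hypothetical group definition, shaped like a dbt group with an owner email.
groups = {"data": {"owner": {"email": "data@hooli.com"}}}

# A model that only inherits its group's owner.
grouped_model = {"name": "orders", "group": "data"}

# A model whose folder-level meta overrides the group owner.
overridden_model = {
    "name": "kpis",
    "group": "data",
    "meta": {"dagster": {"owners": ["team:programmers"]}},
}

print(resolve_owners(grouped_model, groups))
print(resolve_owners(overridden_model, groups))
```

In this sketch the first model falls back to the group email, while the second reports only the overriding owner.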

tags

I added a "core_kpi" tag to a set of dbt assets, as well as to upstream and downstream assets, to show how cool it is.

image

This can't be merged until 1.7 is live, but I did test it locally from master.

github-actions[bot] commented 5 months ago

Your pull request is automatically being deployed to Dagster Cloud.

Location Status Link Updated
batch_enrichment View in Cloud Apr 05, 2024 at 12:41 AM (UTC)
data-eng-pipeline View in Cloud Apr 05, 2024 at 12:41 AM (UTC)
snowflake_insights View in Cloud Apr 05, 2024 at 12:41 AM (UTC)
basics View in Cloud Apr 05, 2024 at 12:41 AM (UTC)
demo_assets View in Cloud Apr 05, 2024 at 12:41 AM (UTC)
slopp commented 5 months ago

Thank you for taking this!!

It is a little confusing to me that we are creating and mapping users to groups in the dbt metadata. Do these overlap with the groups/teams we have defined in Dagster Cloud (via Okta)?

If the dbt metadata field can support supplying owners as a team name instead of an individual email, that might make more sense, but I'm not sure if that's possible.

E.g., in regular asset code it'd be:

@asset(
    owners=["a_dagster_cloud_team_name"]
)
def...
cnolanminich commented 5 months ago

So dbt groups probably map closer to our team concept than a user. I think it would be more like data@hooli.com or whatever -- I can change the fake owners to reflect "shared email" addresses vs. users (since I agree that is confusing).

See here for more: https://docs.getdbt.com/docs/build/groups

slopp commented 5 months ago

The way our owners tag works is that it is a team name, not a team email alias. I think we need to ask the devs if we can adjust how we incorporate the dbt group metadata and map it to asset owners, since it won't always make sense to map the email address to the asset owner. Would you be able to follow up with Rex?

cnolanminich commented 5 months ago

@slopp added an example of what it can look like to override the owner and add the "teams:" flag. For the "team", are you envisioning that it would map 1:1 with the "Teams" in Hooli (like "Executives", "Programmers", and "Stakeholders")?

image
slopp commented 5 months ago

This is looking great!

are you envisioning that it would map 1:1 with the "Teams" in Hooli (like "Executives", "Programmers", and "Stakeholders")?

Yes, I think it needs to map 1:1 for the asset owner alerts to work. (That said, if you would prefer we can definitely update the teams in Dagster Cloud, those 3 groups were chosen arbitrarily without much thought).
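
As a rough illustration of the 1:1 requirement, the sketch below checks that every team-style owner string names one of the three teams mentioned in this thread. It assumes team owners are written with a `team:` prefix and that names must match exactly, including case. The `unmatched_team_owners` helper and the sample owner values are hypothetical.

```python
from typing import Iterable, List, Set

# The three team names mentioned in this thread.
KNOWN_TEAMS: Set[str] = {"Executives", "Programmers", "Stakeholders"}

# Assumed prefix that marks an owner string as a team rather than an email.
TEAM_PREFIX = "team:"


def unmatched_team_owners(
    owners: Iterable[str],
    known_teams: Set[str] = KNOWN_TEAMS,
) -> List[str]:
    """Return team owners that do not name a known team.

    Email-style owners are ignored; only strings with the
    assumed team prefix are checked, using an exact match.
    """
    missing = []
    for owner in owners:
        if owner.startswith(TEAM_PREFIX):
            team_name = owner[len(TEAM_PREFIX):]
            if team_name not in known_teams:
                missing.append(owner)
    return missing


# Hypothetical owners collected from asset definitions.
owners = ["team:Programmers", "data@hooli.com", "team:Analysts"]
print(unmatched_team_owners(owners))
```

A check like this could run in CI so that an owner pointing at a team that does not exist is caught before an alert silently goes nowhere.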

cnolanminich commented 5 months ago

@slopp ok cool, done.

At some point we can change the names of the teams / add more, but I just wanted to get this in good shape