dbt-labs / dbt-docs

Auto-generated data documentation site for dbt projects

[Feature] Display the model-level constraints in the documentation generated by dbt #507

Open salimmoulouel opened 4 months ago

salimmoulouel commented 4 months ago

Is this your first time submitting a feature request?

Describe the feature

When I run dbt docs generate, it fails to include constraints defined at the model level in the documentation. For instance, if I establish a primary key at the model level, this information isn't reflected in the documentation. Instead, it only appears when I define the primary key at the column constraint level. However, because I'm utilizing dbt-bigquery, I'm restricted to declaring only one primary key field at the column level. This necessitates the declaration of primary keys at the model level, thereby creating a discrepancy in the documentation.

Describe alternatives you've considered

To enhance the documentation process in dbt for dbt-bigquery, consider implementing the ability to define multiple primary key constraints at the column level while also ensuring that primary key constraints established at the model level are accurately reported in the documentation. This improvement would streamline the documentation process and provide comprehensive information about the primary keys used in the data models.

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

dbeatty10 commented 4 months ago

Thanks for reaching out, @salimmoulouel!

It sounds like you have two asks (please correct me if I'm misunderstanding!):

  1. Display the model-level constraints in the documentation generated by dbt
  2. Ability to define multiple primary key constraints at the column level

Assuming that complete documentation is your main ask here, I'm going to transfer this issue to dbt-docs for further consideration.

1. Display the model-level constraints in the documentation generated by dbt

You want the documentation website (dbt docs generate && dbt docs serve) to display a "PK" label for each of those columns, possibly like this?

image

2. Multiple primary key constraints at the column level

I don't remember off the top of my head, but I think there are technical reasons for us not supporting this. Feel free to open up a separate issue for this one if you'd like, but I'm guessing we'd close as "won't do".

### Reprex

`models/my_model.sql`

```sql
select 1 as pk_1, 2 as pk_2
```

`models/_models.yml`

```yaml
models:
  - name: my_model
    config:
      materialized: table
      contract:
        enforced: true

    # model-level constraints
    constraints:
      - type: primary_key
        columns: [pk_1, pk_2]

    # column-level constraints
    columns:
      - name: pk_1
        data_type: int
        constraints:
          - type: not_null
      - name: pk_2
        data_type: int
        constraints:
          - type: not_null
```

Build and launch the dbt project docs:

```shell
dbt docs generate && dbt docs serve
```

salimmoulouel commented 4 months ago

For the initial request: as of now, a primary key specified at the column level is displayed accordingly. However, if it is declared as a model-level constraint, it doesn't appear at either level. I wouldn't oppose consistently displaying it at the column level. What's essential to me is that the primary key is visible, especially when it spans multiple fields in BigQuery, which currently doesn't work when declared as column-level constraints.

dbeatty10 commented 4 months ago

Currently, the docs have five main sections:

  1. Details
  2. Description
  3. Columns (which contains column-level constraints)
  4. Depends on
  5. Code
image

And none of those sections includes model-level constraints like these:

image

Does that sound right? If so, would adding a new section for model-level constraints solve this for you?
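In the meantime, one way to check which model-level constraints dbt has recorded is to read the manifest directly. This is only a minimal sketch, and it rests on assumptions that may not hold for every dbt version: that `dbt docs generate` has written `target/manifest.json`, and that each model node carries a `constraints` list whose entries have `type` and `columns` keys. The helper name `model_level_constraints` is hypothetical.

```python
import json
from pathlib import Path


def model_level_constraints(manifest: dict) -> dict:
    """Map each model name to its (type, columns) model-level constraints.

    Assumes the manifest layout described above; models without any
    model-level constraints are left out of the result.
    """
    found = {}
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue
        constraints = node.get("constraints") or []
        if constraints:
            found[node["name"]] = [
                (c.get("type"), c.get("columns") or []) for c in constraints
            ]
    return found


if __name__ == "__main__":
    # Assumed default artifact location; adjust if your target path differs.
    manifest_path = Path("target/manifest.json")
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
        for name, constraints in model_level_constraints(manifest).items():
            print(name, constraints)
```

If the assumptions hold, running this after `dbt docs generate` should list models such as `my_model` together with their `primary_key` columns, even though the site itself does not show them.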

dbeatty10 commented 4 months ago

Here are the files and commands I'm using to see how this currently behaves:

### Reprex

`models/dual.sql`

```sql
{{ config(materialized="ephemeral") }}

select 'X' as dummy
```

`models/my_other_model.sql`

```sql
select
    3 as other_pk_1,
    4 as other_pk_2
from {{ ref("dual") }}
```

`models/my_model.sql`

```sql
select
    1 as pk_1,
    2 as pk_2,
    3 as fk_1,
    4 as fk_2,
    5 as check_1,
    6 as check_2
from {{ ref("dual") }}
```

`models/_models.yaml`

```yaml
models:
  - name: my_other_model
    config:
      materialized: table
      contract:
        enforced: true

    # model-level
    constraints:
      - type: primary_key
        columns: [other_pk_1, other_pk_2]
      - type: unique
        columns: [other_pk_1, other_pk_2]

    # column-level
    columns:
      - name: other_pk_1
        data_type: int
      - name: other_pk_2
        data_type: int

  - name: my_model
    config:
      materialized: table
      contract:
        enforced: true

    # model-level
    constraints:
      - type: primary_key
        columns: [pk_1, pk_2]
      - type: unique
        columns: [pk_1, pk_2]
      - type: foreign_key
        columns: [fk_1, fk_2]
        expression: "YOUR_SCHEMA_HERE.my_other_model (other_pk_1, other_pk_2)"
      - type: check
        columns: [check_1, check_2]
        expression: "check_1 != check_2"
        name: human_friendly_name

    # column-level
    columns:
      - name: pk_1
        data_type: int
      - name: pk_2
        data_type: int
      - name: fk_1
        data_type: int
      - name: fk_2
        data_type: int
      - name: check_1
        data_type: int
      - name: check_2
        data_type: int
```

```shell
dbt build --full-refresh
dbt docs generate && dbt docs serve
```

salimmoulouel commented 4 months ago

Sorry for the late answer. I think yes, that would be perfect, thank you.

salimmoulouel commented 4 months ago

Do you believe this is a challenge we can resolve in the coming days? This matter pertains to a proof of concept (POC) that, if successful, will guide several teams within our company toward adopting dbt.

dbeatty10 commented 4 months ago

Summary

We'd be open to adding model-level constraints to the generated documentation website.

I've labeled this as refinement for us to determine how exactly the user experience would look for this and to determine acceptance criteria.

Timeline

This isn't a time-sensitive priority for us. So no, this is not something that would be available on the timeline you mentioned.

Workaround

However, we do have other capabilities that would allow you to add your own free-form content to the documentation website.

The description field can contain Markdown which provides some flexibility for how you want to format things.

Example

```yaml
models:
  - name: my_other_model
    config:
      materialized: table
      contract:
        enforced: true

    description: >
      ### Model-level constraints

      - primary_key: [`other_pk_1`, `other_pk_2`]

      - unique: [`other_pk_1`, `other_pk_2`]

    # model-level
    constraints:
      - type: primary_key
        columns: [other_pk_1, other_pk_2]
      - type: unique
        columns: [other_pk_1, other_pk_2]
```

```shell
dbt docs generate && dbt docs serve
```

Screenshot

image
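
If hand-writing that Markdown for every model gets tedious, the description text could be generated from the same constraint definitions. The following is a rough sketch rather than anything dbt provides: `constraints_to_markdown` is a hypothetical helper, and it assumes each constraint is a plain dict with a `type` key and a `columns` list, mirroring the YAML in the workaround example.

```python
def constraints_to_markdown(constraints: list) -> str:
    """Render model-level constraints as a Markdown snippet for `description`.

    Each constraint is assumed to be a dict with a `type` key and a
    `columns` list, as in the YAML example.
    """
    lines = ["### Model-level constraints", ""]
    for constraint in constraints:
        # Wrap each column name in backticks so it renders as inline code.
        cols = ", ".join(f"`{col}`" for col in constraint.get("columns", []))
        lines.append(f"- {constraint['type']}: [{cols}]")
        # A blank line between bullets survives YAML folding (`>`).
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


example = [
    {"type": "primary_key", "columns": ["other_pk_1", "other_pk_2"]},
    {"type": "unique", "columns": ["other_pk_1", "other_pk_2"]},
]
print(constraints_to_markdown(example))
```

The printed text can be pasted under `description: >` for the model. Keeping the blank lines between bullets matters there, because the folded scalar would otherwise join the bullets into one line.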