dbt-labs / dbt-codegen

Macros that generate dbt code
https://hub.getdbt.com/dbt-labs/codegen/latest/
Apache License 2.0

Expected result of nested struct in BigQuery #105

Closed: dbeatty10 closed this 1 year ago

dbeatty10 commented 1 year ago

resolves #98

Zatte commented 1 year ago

Adding this as a note, since it is more of a feature/discussion than a bugfix.

I've seen that you can document nested columns in multiple ways.

One way is a flat list of dotted names:

models:
  - name: model_struct
    description: ""
    columns:
      - name: analytics
      - name: analytics.source
      - name: analytics.medium
      - name: analytics.source_medium

The other is to nest the key "columns" inside a column:

models:
  - name: model_struct
    columns:
      - name: analytics
        columns:
          - name: source
          - name: medium
          - name: source_medium

Since this macro favors one form over the other, it would be nice to add a short note about that decision. I believe the two are equivalent, but I haven't looked into it closely enough to be sure.

A feature request could be to support both, or to deprecate one method upstream (dbt-core) to encourage consistency. At the very least, the official docs should state the canonical way to document nested fields (this may already exist, but my google-fu wasn't strong enough to find it).

dbeatty10 commented 1 year ago

> I've seen that you can document nested columns in multiple ways.

Good point!

In my experiments, the two variants behaved differently when adding descriptions and then generating the documentation:

dbt docs generate
dbt docs serve

In my quick tests, your first example rendered the descriptions, but the second didn't.

Are you able to get it to include the descriptions when generating and serving the docs?
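Besides looking at the served docs site, one way to cross-check is to inspect which column names and descriptions dbt recorded in its manifest. The following is a minimal sketch, assuming the default artifact path target/manifest.json and the usual manifest layout (nodes, then columns, then description). The function names described_columns and load_manifest are illustrative and not part of dbt or dbt-codegen.

```python
import json
from pathlib import Path


def described_columns(manifest: dict, model_name: str) -> dict:
    """Return {column_name: description} for one model, skipping empty descriptions."""
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") == "model" and node.get("name") == model_name:
            return {
                name: col.get("description", "")
                for name, col in node.get("columns", {}).items()
                if col.get("description")
            }
    raise KeyError(f"model {model_name!r} not found in manifest")


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Read the manifest from disk (assumed default location)."""
    return json.loads(Path(path).read_text())
```

After running dbt docs generate, calling described_columns(load_manifest(), "model_struct") lists the names that carry a description, which can then be compared against the dotted names (analytics.source and so on) from the first example.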

Here's the exact code I used:

models/model_struct.sql

select 
   struct(
      "a1" as source, 
      "b1" as medium, 
      "c1" as source_medium
   ) as analytics

Worked: models/_models.yml

version: 2

models:
  - name: model_struct
    columns:
      - name: analytics
        description: "This is the name of the STRUCT"
      - name: analytics.source
        description: "This is the first attribute in the STRUCT"
      - name: analytics.medium
        description: "This is the second attribute in the STRUCT"
      - name: analytics.source_medium
        description: "This is the third attribute in the STRUCT"

Didn't work: models/_models.yml

version: 2

models:
  - name: model_struct
    columns:
      - name: analytics
        description: "This is the name of the STRUCT"
        columns:
          - name: source
            description: "This is the first attribute in the STRUCT"
          - name: medium
            description: "This is the second attribute in the STRUCT"
          - name: source_medium
            description: "This is the third attribute in the STRUCT"
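For a project that already uses the nested form, the entries can be rewritten mechanically into the dotted form that worked. The sketch below operates on the already-parsed list under a model's columns key (plain dicts and lists, however the YAML was loaded). The helper name flatten_columns is hypothetical and not part of dbt or dbt-codegen.

```python
def flatten_columns(columns, prefix=""):
    """Rewrite nested `columns` entries as a flat list with dotted names."""
    flat = []
    for col in columns:
        full_name = f"{prefix}{col['name']}"
        # Copy every key except the nested `columns` list.
        entry = {k: v for k, v in col.items() if k != "columns"}
        entry["name"] = full_name
        flat.append(entry)
        # Recurse into children, prefixing them with the parent's dotted name.
        flat.extend(flatten_columns(col.get("columns", []), prefix=f"{full_name}."))
    return flat


nested = [
    {
        "name": "analytics",
        "description": "This is the name of the STRUCT",
        "columns": [
            {"name": "source", "description": "This is the first attribute in the STRUCT"},
            {"name": "medium", "description": "This is the second attribute in the STRUCT"},
        ],
    }
]
flat = flatten_columns(nested)
```

Here flat holds three entries named analytics, analytics.source, and analytics.medium, each keeping its original description.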
Zatte commented 1 year ago

> Are you able to get it to include the descriptions when generating and serving the docs?

No, you are correct: the two forms are not the same, and the approach the macro currently uses works better. Thanks for clarifying, and for the great velocity on this issue/PR 💯

dbeatty10 commented 1 year ago

Thanks for reporting this, and for all the detailed information, @Zatte! This wouldn't have happened without you 🏅