dbt-labs / dbt-utils

Utility functions for dbt projects.
https://hub.getdbt.com/dbt-labs/dbt_utils/latest/
Apache License 2.0
1.33k stars 487 forks source link

Add a sort parameter to star.sql #937

Open gfasano opened 1 month ago

gfasano commented 1 month ago

Describe the feature

Add a sort paramater to the dbt_utils.star() that will sort in alphabetical order the return of the column name

Describe alternatives you've considered

My use case was to do UNION ALL on severall intermediary queries (not materialized data) with large amount of columns (arround 130) not ordered.

I tried to use the dbt_utils.union_relation but it work only on materialized data on I founded that overkilled to stock in database intermediary data just to do an UNION.

What I am currently doing to avoid that is as follow :

{% set column_list = dbt_utils.get_filtered_columns_in_relation( ref('original_data'))|sort %}

with partial_table_transformed as (
SELECT 
      *,
      [transformation to compute new_column_A and new_column_B, that changed the order of all the original column such as join or group by]
FROM {{ ref('original_data') }}
WHERE need_to_be_transformed = 1
),

partial_table_not_transformed as (
SELECT 
      *
FROM {{ ref('original_data') }}
WHERE need_to_be_transformed = 0
),

SELECT
    {{ column_list | join(', ') }}, 
    new_column_A, 
    new_column_B
FROM partial_table_transformed
UNION ALL
SELECT
    {{ column_list | join(', ') }},
    NULL as new_column_A, 
    NULL as new_column_B, 
FROM partial_table_not_transformed

It would be great to do as such :

SELECT
    {{ dbt_utils.star(ref('original_data'), sort = True) }}
    new_column_A, 
    new_column_B
FROM partial_table_transformed
UNION ALL
SELECT
    {{ dbt_utils.star(ref('original_data'), sort = True) }}
    NULL as new_column_A, 
    NULL as new_column_B, 
FROM partial_table_not_transformed

Who will this benefit?

People dealing with huge amount of columns in very large dbt models. It could help:

Are you interested in contributing this feature?

Yes, I have already a branch adding this feature here

gfasano commented 1 month ago

I created a pull request here #939 Add a sort parameter to star.sql