dbt-labs / dbt-codegen

Macros that generate dbt code
https://hub.getdbt.com/dbt-labs/codegen/latest/
Apache License 2.0

Sorted column support in generate_source #157

Closed abupp closed 2 weeks ago

abupp commented 7 months ago

Describe the feature

As a user I would like a new optional argument to the generate_source operation to sort columns.

Describe alternatives you've considered

Currently, I generate the source YAML and then copy and paste it into a spreadsheet if I want to sort it.

Additional context

I do not think this feature would be database-specific, though I have only tested it against Snowflake.

Who will this benefit?

This feature would be particularly useful for tables that contain hundreds of columns, making them much easier to work with and maintain. I work for a company that manages data models related to Salesforce and Jira tables. Both systems support the creation and deletion of custom fields, so they are moving targets that require regular maintenance of dbt models.

Are you interested in contributing this feature?

Sure, here's some suggested code:

In https://github.com/dbt-labs/dbt-codegen/blob/main/macros/generate_source.sql change line 62 from:

    {% set columns=adapter.get_columns_in_relation(table_relation) %}

to:

    {% set columns=adapter.get_columns_in_relation(table_relation) | sort(attribute='column') %}

This could be done conditionally if the macro accepted an additional "sort_columns" argument (defaulting to False):

    {% if sort_columns %}
        {% set columns=adapter.get_columns_in_relation(table_relation) | sort(attribute='column') %}
    {% else %}
        {% set columns=adapter.get_columns_in_relation(table_relation) %}
    {% endif %}        
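
A more compact variant of the conditional above, as a minimal sketch only: it fetches the columns once and re-sorts them when the flag is set. `sort_columns` is a hypothetical argument that does not exist in the current macro, and this relies on Jinja `if` blocks not introducing a new variable scope.

```jinja
{# Hypothetical: sort_columns is a proposed argument, not part of generate_source today. #}
{% set columns = adapter.get_columns_in_relation(table_relation) %}
{% if sort_columns %}
    {# Jinja's sort filter ignores case by default; pass case_sensitive=true to change that. #}
    {% set columns = columns | sort(attribute='column') %}
{% endif %}
```

If such an argument were added, it could presumably be passed alongside the existing ones, for example `--args '{"schema_name": "my_schema", "generate_columns": true, "sort_columns": true}'`, where `my_schema` is a placeholder.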

dave-connors-3 commented 6 months ago

hey @abupp ! thanks for the issue -- i think this would be a great enhancement. Can you think of a scenario where someone would not want sorted columns? trying to decide if this should be the default or configurable

abupp commented 6 months ago

Greetings @dave-connors-3, you're quite welcome. Yes, I think there are likely to be scenarios where sorting would not be preferable, but for tables that have large column sets, I would probably not agree with the rationale. I have implemented sorting in our own dbt project, mostly because it was not hard to do and because I found the branch test setup instructions were not very good for the Windows system that I run on. I could send you the code. I also added sorting to the generate_base_model macro.

github-actions[bot] commented 3 weeks ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 2 weeks ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.