dbt-msft / dbt-sqlserver

dbt adapter for SQL Server and Azure SQL
MIT License
216 stars 101 forks source link

1.7.2 does not include clustered columnstore indexes on SQL Server 2019 tables #473

Closed CamronBorealis closed 3 months ago

CamronBorealis commented 9 months ago

We upgraded to the latest version of dbt-sqlserver, tables are not being created with clustered columnstore indexes anymore

jjquist commented 9 months ago

Same issue here. Was working in v1.4.*.

vvvito commented 4 months ago

We upgraded to 1.7 and also found this issue, caused our pipelines to grind to a halt trying to process the models which were now effectively heaps.

We developed a workaround - in the interim whilst we wait for the PR resolving this issue is merged. It's a macro that's based on the original code that was running on 1.4, but we just extracted it from the source and added it to our dbt project.

[!WARNING] Only tested on SQL Server 2016 and 2022. You might have to tweak the query to suit your environment if it's running older SQL version

{% macro create_clustered_columnstore_index() -%}
    {%- set cci_name = (this.schema ~ '_' ~ this.identifier ~ '_cci') | replace(".", "") | replace(" ", "") -%}
    {%- set relation_name = this.schema ~ '_' ~ this.identifier -%}
    {%- set full_relation = '[' ~ this.schema ~ '].[' ~ this.identifier ~ ']' -%}
    {%- set as_columnstore = config.get('as_columnstore', default = true) -%}
    {%- set materialized_as = config.get('materialized') -%}
    {% if materialized_as != 'view' and as_columnstore %}
    use [{{ this.database }}];
        -- If there is no existing clustered index or columnstore, then create it.
        IF NOT EXISTS (
            SELECT 1 
            FROM sys.indexes (NOLOCK)
            WHERE object_id = OBJECT_ID( '{{this.schema}}.{{this.identifier}}' ) 
                AND (
                        type in ( 1, 5 )
                    OR  name = '{{cci_name}}'
                )
        )
        BEGIN
            --DROP INDEX IF EXISTS {{full_relation}}.{{cci_name}}
            CREATE CLUSTERED COLUMNSTORE INDEX {{cci_name}} ON {{full_relation}}
        END
    {% endif%}
{% endmacro %}
. . .
models:
  my_project
    +materialized: table
    core:
      +schema: dim
      +post-hook: 
        - "{{ create_clustered_columnstore_index () }}"
        - . . .
CamronBorealis commented 3 months ago

This has been fixed in 1.8.0rc1