sqlfluff / sqlfluff

A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
https://www.sqlfluff.com
MIT License
7.86k stars 722 forks source link

sqlfluff fix with 2.0.7 duplicates macro #4808

Open gergelylendvai opened 1 year ago

gergelylendvai commented 1 year ago

Search before asking

What Happened

I tried to use the pre-commit hook with sqlfluff fix and change the following model

SELECT
    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},
    _IS_DELETED as IS_DELETED,
    _SYNCED_AT as SYNCED_AT
FROM {{ ref('my_super_long_name_src_for_reproducing_the_issue') }}

however it duplicated the first line with the dbt_utils.star macro

SELECT
    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }}        {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},
    _IS_DELETED AS IS_DELETED,
    _SYNCED_AT AS SYNCED_AT
FROM {{ ref('my_super_long_name_src_for_reproducing_the_issue') }}

It's pretty similar to this issue, however the solution seems to not cover this case.

Expected Behaviour

The expected behaviour would be not to duplicate the line containing the macro.

Observed Behaviour

Here are the logs I've collected:

[WARNING] Unstaged files detected.
[INFO] Stashing unstaged files to /Users/gergelylendvai/.cache/pre-commit/patch1682353553-43296.
sqlfluff-fix.............................................................Failed
- hook id: sqlfluff-fix
- files were modified by this hook

==== sqlfluff ====
sqlfluff:                2.0.7 python:                 3.8.16
implementation:        cpython verbosity:                   4
dialect:             snowflake templater:                 dbt
dbt:                    =1.4.6
rules:                                all
== Raw Config:
core:                                   
    dialect:            snowflake           
    disable_noqa:       False               
    encoding:           autodetect          
    fix_even_unparsable:False               
    ignore:             []                  
    ignore_templated_areas:True                
    large_file_skip_byte_limit:20000               
    large_file_skip_char_limit:0                   
    max_line_length:    80                  
    nocolor:            False               
    output_line_length: 80                  
    processes:          1                   
    recurse:            True                
    rules:              all                 
    runaway_limit:      10                  
    sql_file_exts:      .sql,.sql.j2,.dml,.ddl
    templater:          dbt                 
    verbose:            4                   
    warnings:           []                  
indentation:                            
    allow_implicit_indents:False               
    indent_unit:        space               
    indented_ctes:      False               
    indented_joins:     False               
    indented_on_contents:True                
    indented_then:      True                
    indented_using_on:  True                
    skip_indentation_in:script_content      
    tab_space_size:     4                   
    template_blocks_indent:True                
    trailing_comments:  before              
layout:                                 
    type:                                   
        array_accessor:                         
            spacing_before:     touch:inline        
        array_type:                             
            spacing_within:     touch:inline        
        assignment_operator:                    
            line_position:      leading             
            spacing_within:     touch               
        binary_operator:                        
            line_position:      leading             
            spacing_within:     touch               
        bracketed_arguments:                    
            spacing_before:     touch:inline        
        casting_operator:                       
            spacing_after:      touch:inline        
            spacing_before:     touch               
        colon:                                  
            spacing_before:     touch               
        comma:                                  
            line_position:      trailing            
            spacing_before:     touch               
        comment:                                
            spacing_after:      any                 
            spacing_before:     any                 
        common_table_expression:                    
            spacing_within:     single:inline       
        comparison_operator:                    
            line_position:      leading             
            spacing_within:     touch               
        end_angle_bracket:                      
            spacing_before:     touch               
        end_bracket:                            
            spacing_before:     touch               
        end_of_file:                            
            spacing_before:     touch               
        end_square_bracket:                     
            spacing_before:     touch               
        from_clause:                            
            line_position:      alone               
        function_name:                          
            spacing_after:      touch:inline        
            spacing_within:     touch:inline        
        groupby_clause:                         
            line_position:      alone               
        having_clause:                          
            line_position:      alone               
        join_clause:                            
            line_position:      alone               
        limit_clause:                           
            line_position:      alone               
        numeric_literal:                        
            spacing_within:     touch:inline        
        object_reference:                       
            spacing_within:     touch:inline        
        orderby_clause:                         
            line_position:      leading             
        placeholder:                            
            spacing_after:      any                 
            spacing_before:     any                 
        select_clause:                          
            line_position:      alone               
        semi_structured_expression:                    
            spacing_before:     touch:inline        
            spacing_within:     touch:inline        
        set_operator:                           
            line_position:      alone:strict        
        sign_indicator:                         
            spacing_after:      touch:inline        
        sized_array_type:                       
            spacing_within:     touch               
        slice:                                  
            spacing_after:      touch               
            spacing_before:     touch               
        start_angle_bracket:                    
            spacing_after:      touch               
        start_bracket:                          
            spacing_after:      touch               
        start_square_bracket:                    
            spacing_after:      touch               
        statement_terminator:                    
            line_position:      trailing            
            spacing_before:     touch               
        struct_type:                            
            spacing_within:     touch:inline        
        template_loop:                          
            spacing_after:      any                 
            spacing_before:     any                 
        tilde:                                  
            spacing_after:      touch:inline        
        typed_array_literal:                    
            spacing_within:     touch               
        typed_struct_literal:                    
            spacing_within:     touch               
        where_clause:                           
            line_position:      alone               
rules:                                  
    allow_scalar:       True                
    single_table_references:consistent          
    unquoted_identifiers_policy:all                 
    aliasing.column:                        
        aliasing:           explicit            
    aliasing.forbid:                        
        force_enable:       False               
    aliasing.length:                        
    aliasing.table:                         
        aliasing:           explicit            
    ambiguous.column_references:                    
        group_by_and_order_by_style:consistent          
    ambiguous.join:                         
        fully_qualify_join_types:inner               
    capitalisation.functions:                    
        extended_capitalisation_policy:consistent          
    capitalisation.identifiers:                    
        extended_capitalisation_policy:consistent          
    capitalisation.keywords:                    
        capitalisation_policy:consistent          
    capitalisation.literals:                    
        capitalisation_policy:consistent          
    capitalisation.types:                    
        extended_capitalisation_policy:consistent          
    convention.blocked_words:                    
        match_source:       False               
    convention.casting_style:                    
        preferred_type_casting_style:consistent          
    convention.count_rows:                    
        prefer_count_0:     False               
        prefer_count_1:     False               
    convention.quoted_literals:                    
        force_enable:       False               
        preferred_quoted_literal_style:consistent          
    convention.select_trailing_comma:                    
        select_clause_trailing_comma:forbid              
    convention.terminator:                    
        multiline_newline:  False               
        require_final_semicolon:False               
    layout.long_lines:                      
        ignore_comment_clauses:False               
        ignore_comment_lines:False               
    layout.select_targets:                    
        wildcard_policy:    single              
    references.consistent:                    
        force_enable:       False               
    references.from:                        
        force_enable:       False               
    references.keywords:                    
        unquoted_identifiers_policy:aliases             
    references.qualification:                    
    references.quoting:                     
        force_enable:       False               
        prefer_quoted_identifiers:False               
        prefer_quoted_keywords:False               
    references.special_chars:                    
        allow_space_in_identifier:False               
        quoted_identifiers_policy:all                 
        unquoted_identifiers_policy:all                 
    structure.subquery:                     
        forbid_subquery_in: join                
templater:                              
    unwrap_wrapped_queries:True                
    dbt:                                    
        apply_dbt_builtins: True                
        profiles_dir:       /Users/gergelylendvai/datalake/
        project_dir:        /Users/gergelylendvai/datalake/
    jinja:                                  
        apply_dbt_builtins: True                

==== finding fixable violations ====
FORCE MODE: Attempting fixes...
effective configured processes:  8
=== [dbt templater] Sorting Nodes...                        
INFO       TEMPLATING RAW [dbt] (datalake/models/my_model.sql)
=== [dbt templater] Compiling dbt project...                
=== [dbt templater] Project Compiled.                       
DEBUG      _find_node for path '/Users/gergelylendvai/datalake/models/my_model.sql' returned object of type <class 'dbt.contracts.graph.nodes.ModelNode'>.
DEBUG          Trailing newline count in source dbt model: 2
DEBUG          Raw SQL before compile: "SELECT\n    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM {{ ref('my_super_long_name_src_for_reproducing_the_issue') }}\n\n"
DEBUG          Node raw SQL: "SELECT\n    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM {{ ref('my_super_long_name_src_for_reproducing_the_issue') }}"
DEBUG          Node compiled SQL: 'SELECT\n    "ID",\n  "DISPLAY",\n  "_IS_DELETED",\n  "_SYNCED_AT",\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue'
INFO       Slicing File Template                            
DEBUG          Raw String: "SELECT\n    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM {{ ref('my_super_long_name_src_for_reproducing_the_issue') }}\n\n"
DEBUG          Templated String: 'SELECT\n    "ID",\n  "DISPLAY",\n  "_IS_DELETED",\n  "_SYNCED_AT",\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue\n\n'
INFO       TEMPLATING RAW [dbt] (datalake/models/my_super_long_name_src_for_reproducing_the_issue.sql)
DEBUG      _find_node for path '/Users/gergelylendvai/models/my_super_long_name_src_for_reproducing_the_issue.sql' returned object of type <class 'dbt.contracts.graph.nodes.ModelNode'>.
DEBUG          Trailing newline count in source dbt model: 1
DEBUG          Raw SQL before compile: "SELECT\n    ID,\n    DISPLAY,\n    _FIVETRAN_DELETED AS _IS_DELETED,\n    _FIVETRAN_SYNCED AS _SYNCED_AT\nFROM {{ source('carriers_and_drivers_tms', 'tms_amplitude_event_type') }}\n"
DEBUG          Node raw SQL: "SELECT\n    ID,\n    DISPLAY,\n    _FIVETRAN_DELETED AS _IS_DELETED,\n    _FIVETRAN_SYNCED AS _SYNCED_AT\nFROM {{ source('carriers_and_drivers_tms', 'tms_amplitude_event_type') }}"
DEBUG          Node compiled SQL: 'SELECT\n    ID,\n    DISPLAY,\n    _FIVETRAN_DELETED AS _IS_DELETED,\n    _FIVETRAN_SYNCED AS _SYNCED_AT\nFROM fivetran.tms_amplitude.event_type'
INFO       Slicing File Template                            
DEBUG          Raw String: "SELECT\n    ID,\n    DISPLAY,\n    _FIVETRAN_DELETED AS _IS_DELETED,\n    _FIVETRAN_SYNCED AS _SYNCED_AT\nFROM {{ source('carriers_and_drivers_tms', 'tms_amplitude_event_type') }}\n"
DEBUG          Templated String: 'SELECT\n    ID,\n    DISPLAY,\n    _FIVETRAN_DELETED AS _IS_DELETED,\n    _FIVETRAN_SYNCED AS _SYNCED_AT\nFROM fivetran.tms_amplitude.event_type\n'
== [datalake/models/my_super_long_name_src_for_reproducing_the_issue.sql] PASS
== [datalake/models/my_super_long_name_src_for_reproducing_the_issue.sql] SKIP
file datalake/models/my_super_long_name_src_for_reproducing_the_issue.sql:  50% 1/2 [00:11<00:11, 11.27s/it]One fix for LT05 not applied, it would re-cause the same error.
== [datalake/models/my_model.sql] FAIL                                                                      
L:   2 | P:   5 | LT05 | Line is too long (87 > 80).
                       | [layout.long_lines]
L:   3 | P:  17 | CP01 | Keywords must be consistently upper case.
                       | [capitalisation.keywords]
L:   4 | P:  16 | CP01 | Keywords must be consistently upper case.
                       | [capitalisation.keywords]
L:   6 | P:   1 | LT12 | Files must end with a single trailing newline.
                       | [layout.end_of_file]
DEBUG      Original Tree: 'SELECT\n    "ID",\n  "DISPLAY",\n  "_IS_DELETED",\n  "_SYNCED_AT",\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue\n\n'
DEBUG      Fixed Tree: 'SELECT\n    "ID",\n          "DISPLAY",\n          "_IS_DELETED",\n          "_SYNCED_AT",\n    _IS_DELETED AS IS_DELETED,\n    _SYNCED_AT AS SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue\n'
DEBUG      ### Templated File.                                                                              
DEBUG          File slice: 0 TemplatedFileSlice(slice_type='literal', source_slice=slice(0, 11, None), templated_slice=slice(0, 11, None)) [invariant]
DEBUG          File slice: 1 TemplatedFileSlice(slice_type='templated', source_slice=slice(11, 93, None), templated_slice=slice(11, 61, None))
DEBUG                           templated: '"ID",\n  "DISPLAY",\n  "_IS_DELETED",\n  "_SYNCED_AT"'      source: "{{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }}"
DEBUG          File slice: 2 TemplatedFileSlice(slice_type='literal', source_slice=slice(93, 159, None), templated_slice=slice(61, 127, None)) [invariant]
DEBUG          File slice: 3 TemplatedFileSlice(slice_type='templated', source_slice=slice(159, 220, None), templated_slice=slice(127, 204, None))
DEBUG                           templated: 'DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue'      source: "{{ ref('my_super_long_name_src_for_reproducing_the_issue') }}"
DEBUG          File slice: 4 TemplatedFileSlice(slice_type='literal', source_slice=slice(220, 222, None), templated_slice=slice(204, 206, None)) [invariant]
DEBUG      ### Beginning Patch Iteration.                                                                   
DEBUG      # Changed Segment Found: FileSegment at slice(0, 206, None): Original: ['SELECT\n    "ID",\n  "DISPLAY",\n  "_IS_DELETED",\n  "_SYNCED_AT",\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue\n\n'] Fixed: ['SELECT\n    "ID",\n          "DISPLAY",\n          "_IS_DELETED",\n          "_SYNCED_AT",\n    _IS_DELETED AS IS_DELETED,\n    _SYNCED_AT AS SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue\n']
DEBUG      # Changed Segment Found: StatementSegment at slice(0, 204, None): Original: ['SELECT\n    "ID",\n  "DISPLAY",\n  "_IS_DELETED",\n  "_SYNCED_AT",\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue'] Fixed: ['SELECT\n    "ID",\n          "DISPLAY",\n          "_IS_DELETED",\n          "_SYNCED_AT",\n    _IS_DELETED AS IS_DELETED,\n    _SYNCED_AT AS SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue']
DEBUG      # Changed Segment Found: SelectStatementSegment at slice(0, 204, None): Original: ['SELECT\n    "ID",\n  "DISPLAY",\n  "_IS_DELETED",\n  "_SYNCED_AT",\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue'] Fixed: ['SELECT\n    "ID",\n          "DISPLAY",\n          "_IS_DELETED",\n          "_SYNCED_AT",\n    _IS_DELETED AS IS_DELETED,\n    _SYNCED_AT AS SYNCED_AT\nFROM DATALAKE_DEV.GERGELY_LENDVAI.my_super_long_name_src_for_reproducing_the_issue']
DEBUG      # Changed Segment Found: SelectClauseSegment at slice(0, 121, None): Original: ['SELECT\n    "ID",\n  "DISPLAY",\n  "_IS_DELETED",\n  "_SYNCED_AT",\n    _IS_DELETED as IS_DELETED,\n    _SYNCED_AT as SYNCED_AT'] Fixed: ['SELECT\n    "ID",\n          "DISPLAY",\n          "_IS_DELETED",\n          "_SYNCED_AT",\n    _IS_DELETED AS IS_DELETED,\n    _SYNCED_AT AS SYNCED_AT']
DEBUG      Appending insertion buffer. '    ' @idx: 17                                                      
DEBUG        0 Yielded patch: FixPatch(templated_slice=slice(17, 17, None), fixed_raw='    ', patch_category='mid_point', source_slice=slice(93, 93, None), templated_str='', source_str='')
DEBUG              Templated Hint: ...'    "ID",\n' <> '  "DISPLAY'...                                      
INFO             * Keeping patch on new or literal-only section.                                            
DEBUG      # Changed Segment Found: WhitespaceSegment at slice(17, 17, None): Original: [''] Fixed: ['    ']
DEBUG        1 Yielded patch: FixPatch(templated_slice=slice(17, 17, None), fixed_raw='    ', patch_category='literal', source_slice=slice(93, 11, None), templated_str='', source_str='')
DEBUG              Templated Hint: ...'    "ID",\n' <> '  "DISPLAY'...                                      
INFO             * Keeping patch on new or literal-only section.                                            
DEBUG      Appending insertion buffer. '    ' @idx: 30                                                      
DEBUG        2 Yielded patch: FixPatch(templated_slice=slice(30, 30, None), fixed_raw='    ', patch_category='mid_point', source_slice=slice(93, 93, None), templated_str='', source_str='')
DEBUG              Templated Hint: ...'DISPLAY",\n' <> '  "_IS_DEL'...                                      
INFO             - Skipping. Source space Duplicate: (slice(93, 93, None), '    ')                          
DEBUG      # Changed Segment Found: WhitespaceSegment at slice(30, 30, None): Original: [''] Fixed: ['    ']
DEBUG        3 Yielded patch: FixPatch(templated_slice=slice(30, 30, None), fixed_raw='    ', patch_category='literal', source_slice=slice(93, 11, None), templated_str='', source_str='')
DEBUG              Templated Hint: ...'DISPLAY",\n' <> '  "_IS_DEL'...                                      
INFO             - Skipping. Source space Duplicate: (slice(93, 11, None), '    ')                          
DEBUG      Appending insertion buffer. '    ' @idx: 47                                                      
DEBUG        4 Yielded patch: FixPatch(templated_slice=slice(47, 47, None), fixed_raw='    ', patch_category='mid_point', source_slice=slice(93, 93, None), templated_str='', source_str='')
DEBUG              Templated Hint: ...'DELETED",\n' <> '  "_SYNCED'...                                      
INFO             - Skipping. Source space Duplicate: (slice(93, 93, None), '    ')                          
DEBUG      # Changed Segment Found: WhitespaceSegment at slice(47, 47, None): Original: [''] Fixed: ['    ']
DEBUG        5 Yielded patch: FixPatch(templated_slice=slice(47, 47, None), fixed_raw='    ', patch_category='literal', source_slice=slice(93, 11, None), templated_str='', source_str='')
DEBUG              Templated Hint: ...'DELETED",\n' <> '  "_SYNCED'...                                      
INFO             - Skipping. Source space Duplicate: (slice(93, 11, None), '    ')                          
DEBUG      # Changed Segment Found: SelectClauseElementSegment at slice(67, 92, None): Original: ['_IS_DELETED as IS_DELETED'] Fixed: ['_IS_DELETED AS IS_DELETED']
DEBUG        6 Yielded patch: FixPatch(templated_slice=slice(67, 92, None), fixed_raw='_IS_DELETED AS IS_DELETED', patch_category='literal', source_slice=slice(99, 124, None), templated_str='_IS_DELETED as IS_DELETED', source_str='_IS_DELETED as IS_DELETED')
DEBUG              Templated Hint: ...'_AT",\n    ' <> ',\n    _SYN'...                                     
INFO             * Keeping patch on new or literal-only section.                                            
DEBUG      # Changed Segment Found: SelectClauseElementSegment at slice(98, 121, None): Original: ['_SYNCED_AT as SYNCED_AT'] Fixed: ['_SYNCED_AT AS SYNCED_AT']
DEBUG        7 Yielded patch: FixPatch(templated_slice=slice(98, 121, None), fixed_raw='_SYNCED_AT AS SYNCED_AT', patch_category='literal', source_slice=slice(130, 153, None), templated_str='_SYNCED_AT as SYNCED_AT', source_str='_SYNCED_AT as SYNCED_AT')
DEBUG              Templated Hint: ...'ETED,\n    ' <> '\nFROM DATA'...                                     
INFO             * Keeping patch on new or literal-only section.                                            
DEBUG        8 Yielded patch: FixPatch(templated_slice=slice(204, 205, None), fixed_raw='', patch_category='mid_point', source_slice=slice(220, 221, None), templated_str='', source_str='')
DEBUG              Templated Hint: ...'_the_issue' <> '\n'...                                               
INFO             * Keeping patch on new or literal-only section.                                            
DEBUG      Filtered source patches:                                                                         
DEBUG          0: FixPatch(templated_slice=slice(17, 17, None), fixed_raw='    ', patch_category='mid_point', source_slice=slice(93, 93, None), templated_str='', source_str='')
DEBUG          1: FixPatch(templated_slice=slice(17, 17, None), fixed_raw='    ', patch_category='literal', source_slice=slice(93, 11, None), templated_str='', source_str='')
DEBUG          2: FixPatch(templated_slice=slice(67, 92, None), fixed_raw='_IS_DELETED AS IS_DELETED', patch_category='literal', source_slice=slice(99, 124, None), templated_str='_IS_DELETED as IS_DELETED', source_str='_IS_DELETED as IS_DELETED')
DEBUG          3: FixPatch(templated_slice=slice(98, 121, None), fixed_raw='_SYNCED_AT AS SYNCED_AT', patch_category='literal', source_slice=slice(130, 153, None), templated_str='_SYNCED_AT as SYNCED_AT', source_str='_SYNCED_AT as SYNCED_AT')
DEBUG          4: FixPatch(templated_slice=slice(204, 205, None), fixed_raw='', patch_category='mid_point', source_slice=slice(220, 221, None), templated_str='', source_str='')
DEBUG      Source-only slices: []                                                                           
DEBUG      Final slice buffer: [slice(0, 93, None), slice(93, 93, None), slice(93, 11, None), slice(11, 99, None), slice(99, 124, None), slice(124, 130, None), slice(130, 153, None), slice(153, 220, None), slice(220, 221, None), slice(221, 222, None)]
DEBUG      Appending Raw:                    slice(0, 93, None)     "SELECT\n    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }}"
DEBUG      Appending mid_point Patch:        slice(93, 93, None)    '' > '    '                             
DEBUG      Appending literal Patch:          slice(93, 11, None)    '' > '    '                             
DEBUG      Appending Raw:                    slice(11, 99, None)     "{{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},\n    "
DEBUG      Appending literal Patch:          slice(99, 124, None)    '_IS_DELETED as IS_DELETED' > '_IS_DELETED AS IS_DELETED'
DEBUG      Appending Raw:                    slice(124, 130, None)     ',\n    '                            
DEBUG      Appending literal Patch:          slice(130, 153, None)    '_SYNCED_AT as SYNCED_AT' > '_SYNCED_AT AS SYNCED_AT'
DEBUG      Appending Raw:                    slice(153, 220, None)     "\nFROM {{ ref('my_super_long_name_src_for_reproducing_the_issue') }}"
DEBUG      Appending mid_point Patch:        slice(220, 221, None)    '' > ''                               
DEBUG      Appending Raw:                    slice(221, 222, None)     '\n'                                 
== [datalake/models/my_model.sql] FIXED                                                                     
4 fixable linting violations found                                                                          
==== lint for unfixable violations ====
== [datalake/models/my_model.sql] PASS
== [datalake/models/my_super_long_name_src_for_reproducing_the_issue.sql] PASS

[INFO] Restored changes from /Users/gergelylendvai/.cache/pre-commit/patch1682353553-43296.

and I think this is the relevant part:

DEBUG      Final slice buffer: [slice(0, 93, None), slice(93, 93, None), slice(93, 11, None), slice(11, 99, None), slice(99, 124, None), slice(124, 130, None), slice(130, 153, None), slice(153, 220, None), slice(220, 221, None), slice(221, 222, None)]
DEBUG      Appending Raw:                    slice(0, 93, None)     "SELECT\n    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }}"
DEBUG      Appending mid_point Patch:        slice(93, 93, None)    '' > '    '                             
DEBUG      Appending literal Patch:          slice(93, 11, None)    '' > '    '                             
DEBUG      Appending Raw:                    slice(11, 99, None)     "{{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},\n    "
DEBUG      Appending literal Patch:          slice(99, 124, None)    '_IS_DELETED as IS_DELETED' > '_IS_DELETED AS IS_DELETED'
DEBUG      Appending Raw:                    slice(124, 130, None)     ',\n    '                            
DEBUG      Appending literal Patch:          slice(130, 153, None)    '_SYNCED_AT as SYNCED_AT' > '_SYNCED_AT AS SYNCED_AT'
DEBUG      Appending Raw:                    slice(153, 220, None)     "\nFROM {{ ref('my_super_long_name_src_for_reproducing_the_issue') }}"
DEBUG      Appending mid_point Patch:        slice(220, 221, None)    '' > ''                               
DEBUG      Appending Raw:                    slice(221, 222, None)     '\n'                                 

How to reproduce

Versions:

pre-commit-config.yaml

repos:
- repo: https://github.com/sqlfluff/sqlfluff
  rev: 2.0.7
  hooks:
    - id: sqlfluff-lint
      # For dbt projects, this installs the dbt "extras".
      # You will need to select the relevant dbt adapter for your dialect
      # (https://docs.getdbt.com/docs/available-adapters):
      # args: [--dialect, snowflake]
      args: [-vvvv]
      additional_dependencies: ['dbt-snowflake==1.4.2', 'sqlfluff-templater-dbt==2.0.7']
    - id: sqlfluff-fix
      # Arbitrary arguments to show an example
      # args: [--rules, "LT02,CP02"]
      # args: [--dialect, snowflake]
      args: [-vvvv]
      additional_dependencies: ['dbt-snowflake==1.4.2', 'sqlfluff-templater-dbt==2.0.7']

.sqlfluff

[sqlfluff]
# Supported dialects https://docs.sqlfluff.com/en/stable/dialects.html
# Or run 'sqlfluff dialects'
dialect = snowflake
templater = dbt

[sqlfluff:templater:dbt]
project_dir = ./datalake/
profiles_dir = ./datalake/
apply_dbt_builtins = true

my_model.sql

SELECT
    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},
    _FIVETRAN_DELETED AS IS_DELETED,
    _FIVETRAN_SYNCED AS FIVETRAN_SYNCED_AT
FROM {{ source('my_super_long_name_src_for_reproducing_the_issue') }}

Run:

brew install poetry

poetry init
poetry add pre-commit
poetry install
poetry shell

pre-commit install
pre-commit run sqlfluff-fix

Dialect

snowflake

Version

2.0.7

Configuration

pre-commit-config.yaml

repos:
- repo: https://github.com/sqlfluff/sqlfluff
  rev: 2.0.7
  hooks:
    - id: sqlfluff-lint
      # For dbt projects, this installs the dbt "extras".
      # You will need to select the relevant dbt adapter for your dialect
      # (https://docs.getdbt.com/docs/available-adapters):
      # args: [--dialect, snowflake]
      args: [-vvvv]
      additional_dependencies: ['dbt-snowflake==1.4.2', 'sqlfluff-templater-dbt==2.0.7']
    - id: sqlfluff-fix
      # Arbitrary arguments to show an example
      # args: [--rules, "LT02,CP02"]
      # args: [--dialect, snowflake]
      args: [-vvvv]
      additional_dependencies: ['dbt-snowflake==1.4.2', 'sqlfluff-templater-dbt==2.0.7']

.sqlfluff

[sqlfluff]
# Supported dialects https://docs.sqlfluff.com/en/stable/dialects.html
# Or run 'sqlfluff dialects'
dialect = snowflake
templater = dbt

[sqlfluff:templater:dbt]
project_dir = ./datalake/
profiles_dir = ./datalake/
apply_dbt_builtins = true

my_model.sql

SELECT
    {{ dbt_utils.star(from=ref('my_super_long_name_src_for_reproducing_the_issue')) }},
    _FIVETRAN_DELETED AS IS_DELETED,
    _FIVETRAN_SYNCED AS FIVETRAN_SYNCED_AT
FROM {{ source('my_super_long_name_src_for_reproducing_the_issue') }}

Are you willing to work on and submit a PR to address the issue?

Code of Conduct

jens-koster commented 1 year ago

Hi, I have found a similar issue, also related to the length of the line. sqlfluff version 2.1.1 dbt-core: 1.3.3 dbt-bigquery: 1.3.0

Sqlfluff fix on the following code:

select
    {{ dbt_utils.generate_surrogate_key(['abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc']) }} as dummy

renders the result:

select

        {{ dbt_utils.generate_surrogate_key(['abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc']) }}
{{ dbt_utils.generate_surrogate_key(['abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc']) }}
        as dummy

it adds a bunch of newlines and duplicates the jinja code. This only happens when the line is long enough to trigger LT05 too long line. If I shorten the line the bug is not triggered, if I add another linting error that gets fixed it still does not trigger the bug. this is the output when the bug is triggered, there is only LT05 being fixed.

== [models/staging/typeform/stg_typeform__logic_action_condition.sql] FAIL                                                                                                                                                                      
L:   2 | P:   5 | LT05 | Line is too long (213 > 140).
                       | [layout.long_lines]
==== fixing violations ====
1 fixable linting violations found
Are you sure you wish to attempt to fix these? [Y/n] ...
awoehrl commented 11 months ago

I think the same is happening in my code. Formatting duplicated the coalesce_columns_by_prefix Jinja macro:

  from {{ ref('snowplow_ecommerce_base_events_this_run') }} as t,
    UNNEST(
      {{ snowplow_ecommerce.coalesce_columns_by_prefix(ref('snowplow_ecommerce_base_events_this_run'), 'contexts_com_snowplowanalytics_snowplow_ecommerce_product_1') }}
{{ snowplow_ecommerce.coalesce_columns_by_prefix(ref('snowplow_ecommerce_base_events_this_run'), 'contexts_com_snowplowanalytics_snowplow_ecommerce_product_1') }}
    )
      as r
    with offset as index

)
grantith commented 7 months ago

I'm encountering this, too, on 3.0.1