sqlfluff / sqlfluff

A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
https://www.sqlfluff.com
MIT License
7.93k stars 729 forks source link

How to make .sqlfluff customizations reflected in git commit run #5136

Open Sugatoshifumi opened 1 year ago

Sugatoshifumi commented 1 year ago

Search before asking

What Happened

Rules that were picked up by local .sqlfluff customization are not picked up during git commit execution. The following bug report may be a little close to my report. https://github.com/sqlfluff/sqlfluff/issues/4304

I've written this at length, but as I mentioned in the title, if I could figure out how to customize the sqlfluff rules that are applied when a git commit is executed, I think I could solve this problem.

Expected Behaviour

The same content that is locally linted can be detected in a git commit.

Observed Behaviour

As stated in the title. Errors in lint displayed when running with .sqlfluff differ from errors when running with commit

Created files

.pre-commit-config.yaml

repos:
- repo: https://github.com/sqlfluff/sqlfluff
  rev: 2.3.0
  hooks:
    - id: sqlfluff-lint
      additional_dependencies : ['dbt-bigquery==1.6.0', 'sqlfluff-templater-dbt']
      log_file : 'true'
    - id: sqlfluff-fix
      additional_dependencies : ['dbt-bigquery==1.6.0', 'sqlfluff-templater-dbt']
      log_file : 'true'

sample sql(orders_payments.sql)

target_date is defined in dbt_project.yml

SELECT 
  c.id, 
  c.first_name
  ,c.last_name, 
  SUM(p.amount) AS total_amount
FROM 
  {{ source('main', 'raw_customers') }} AS c
INNER JOIN 
  {{ source('main', 'raw_orders') }} AS o 
ON 
  c.id = o.user_id
INNER JOIN 
  {{ source('main', 'raw_payments') }} AS p 
ON o.id = p.order_id
{% if  target.name == "duck" %}

WHERE 
    order_date::date = '{{ var('target_date') }}'::date

{% else %}

WHERE order_date = '{{ var('target_date') }}'
{% endif %}
GROUP BY 
    c.id, c.first_name, c.last_name

Verification results

First, let the linter run manually, and the following message will be returned

(analysis_sample) bash-3.2$ sqlfluff lint models/orders_payments.sql --config .sqlfluff
=== [dbt templater] Sorting Nodes...
=== [dbt templater] Compiling dbt project...
=== [dbt templater] Project Compiled.
== [orders_payments.sql] FAIL                                                                        
L:   1 | P:   7 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   2 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:   2 | P:   8 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   3 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:   4 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:   4 | P:   3 | LT04 | Found leading comma ','. Expected only trailing near
                       | line breaks. [layout.commas]
L:   4 | P:   4 | LT01 | Expected single whitespace between comma ',' and naked
                       | identifier. [layout.spacing]
L:   4 | P:  16 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   5 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:   6 | P:   5 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   7 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:   8 | P:  11 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   9 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:   9 | P:  42 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  10 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:  10 | P:   3 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  11 | P:   1 | LT02 | Expected indent of 8 spaces.
                       | [layout.indent]
L:  12 | P:  11 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  13 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:  13 | P:  44 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  14 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:  22 | P:   1 | LT02 | Expected indent of 4 spaces.
                       | [layout.indent]
L:  22 | P:   7 | RF02 | Unqualified reference 'order_date' found in select with
                       | more than one referenced table/view.
                       | [references.qualification]
L:  24 | P:   1 | AM06 | Inconsistent column references in 'GROUP BY/ORDER BY'
                       | clauses. [ambiguous.column_references]
L:  24 | P:   9 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
All Finished 📜 🎉!
root@ea02fd7b61c5:/app/dbt_sample/models# 

Execute add as-is without making any changes, and then execute commit. As a result, a different result is returned than when lint is executed in sqlfluff.

(analysis_sample) bash-3.2$ git add models/orders_payments.sql 
(analysis_sample) bash-3.2$ git status
On branch DAS_1639_test_sqlfluff
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   models/orders_payments.sql

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        ../true

(analysis_sample) bash-3.2$ git commit -m 'test'
dbt_sample/models/orders_payments.sql:1: trailing whitespace.
+SELECT 
dbt_sample/models/orders_payments.sql:2: trailing whitespace.
+  c.id, 
dbt_sample/models/orders_payments.sql:4: trailing whitespace.
+  ,c.last_name, 
dbt_sample/models/orders_payments.sql:6: trailing whitespace.
+FROM 
dbt_sample/models/orders_payments.sql:8: trailing whitespace.
+INNER JOIN 
dbt_sample/models/orders_payments.sql:9: trailing whitespace.
+  {{ source('main', 'raw_orders') }} AS o 
dbt_sample/models/orders_payments.sql:10: trailing whitespace.
+ON 
dbt_sample/models/orders_payments.sql:12: trailing whitespace.
+INNER JOIN 
dbt_sample/models/orders_payments.sql:13: trailing whitespace.
+  {{ source('main', 'raw_payments') }} AS p 
dbt_sample/models/orders_payments.sql:17: trailing whitespace.
+WHERE 
dbt_sample/models/orders_payments.sql:24: trailing whitespace.
+GROUP BY 
(analysis_sample) bash-3.2$ 

How to reproduce

  1. Create the model (at this time, I dare you to catch the error you are setting in .sqlfluff)
  2. Create .sqlfluff and .pre-commit-config.yaml files
  3. Run lint using .sqlfluff
  4. Execute add and commit without making any changes

Additional Verification

Change the contents of .sqlfluff to simple rules

[sqlfluff]
templater = dbt
max_line_length = 100
; rule = core
dialect = bigquery
rules = LT01

[sqlfluff:templater:dbt]
project_dir = .
profiles_dir = .
profile = dbt_sample
target = dev

[sqlfluff:indentation]
# dbt Style Guide
tab_space_size = 4
indent_unit = space
indented_using_on = True

[sqlfluff:layout:type:comma]
line_position = trailing

[sqlfluff:rules:capitalisation.keywords]
# GitLab SQL Style Guide
group = core
capitalisation_policy = upper

[sqlfluff:rules:capitalisation.identifiers]
# dbt Style Guide
group = core
capitalisation_policy = lower
unquoted_identifiers_policy = column_aliases

[sqlfluff:rules:layout.long_lines]
group = core
ignore_comment_lines = True

[sqlfluff:rules:layout.cte_newline]
# dbt Style Guide
group = core
comma_style = trailing

[sqlfluff:rules:capitalisation.functions]
# GitLab SQL Style Guide
group = core
capitalisation_policy = upper

[sqlfluff:rules:capitalisation.literals]
# GitLab SQL Style Guide
group = core
capitalisation_policy = upper

[sqlfluff:rules:ambiguous.column_references]
# dbt Style Guide
group = core
group_by_and_order_by_style = implicit

[sqlfluff:rules:references.special_chars]
# GitLab SQL Style Guide
group = core
allow_space_in_identifier = True
additional_allowed_characters = ['.','(',')','-']

Confirmation that .sqlfluff and pre-commit picked up the same error

(analysis_sample) bash-3.2$ sqlfluff lint models/orders_payments.sql --config .sqlfluff
=== [dbt templater] Sorting Nodes...
=== [dbt templater] Compiling dbt project...
=== [dbt templater] Project Compiled.
== [models/orders_payments.sql] FAIL                                                                 
L:   1 | P:   7 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   2 | P:   8 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   4 | P:   4 | LT01 | Expected single whitespace between comma ',' and naked
                       | identifier. [layout.spacing]
L:   4 | P:  16 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   6 | P:   5 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   8 | P:  11 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   9 | P:  42 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  10 | P:   3 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  12 | P:  11 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  13 | P:  44 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  24 | P:   9 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
All Finished !

[INFO] Restored changes from /Users/sugato/.cache/pre-commit/patch1693214541-37417.

(analysis_sample) bash-3.2$ pre-commit run sqlfluff-lint
sqlfluff-lint............................................................Failed
- hook id: sqlfluff-lint
- exit code: 1

WARNING    Attempt to set templater to dbt failed. Using jinja templater. Templater cannot be set in a .sqlfluff file in a subdirectory of the current working directory. It can be set in a .sqlfluff in the current working directory. See Nesting section of the docs for more details. 
== [dbt_sample/models/orders_payments.sql] FAIL
L:   1 | P:   7 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   2 | P:   8 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   4 | P:   4 | LT01 | Expected single whitespace between comma ',' and naked
                       | identifier. [layout.spacing]
L:   4 | P:  16 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   6 | P:   5 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   8 | P:  11 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   9 | P:  42 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  10 | P:   3 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  12 | P:  11 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  13 | P:  44 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  24 | P:   9 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
All Finished !

(analysis_sample) bash-3.2$ 

Adapting fixes in fixes based on the contents of .sqlfluff

(analysis_sample) bash-3.2$ sqlfluff fix models/orders_payments.sql --config .sqlfluff
==== finding fixable violations ====
=== [dbt templater] Sorting Nodes...
=== [dbt templater] Compiling dbt project...
=== [dbt templater] Project Compiled.
== [models/orders_payments.sql] FAIL                                                                 
L:   1 | P:   7 | LT01 | Unnecessary trailing whitespace.                                            
                       | [layout.spacing]
L:   2 | P:   8 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   4 | P:   4 | LT01 | Expected single whitespace between comma ',' and naked
                       | identifier. [layout.spacing]
L:   4 | P:  16 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   6 | P:   5 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   8 | P:  11 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:   9 | P:  42 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  10 | P:   3 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  12 | P:  11 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  13 | P:  44 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
L:  24 | P:   9 | LT01 | Unnecessary trailing whitespace.
                       | [layout.spacing]
==== fixing violations ====
11 fixable linting violations found
Are you sure you wish to attempt to fix these? [Y/n] ...
Attempting fixes...
Persisting Changes...
== [models/orders_payments.sql] FIXED
Done. Please check your files to confirm.
All Finished !

(analysis_sample) bash-3.2$ sqlfluff lint models/orders_payments.sql --config .sqlfluff
=== [dbt templater] Sorting Nodes...
=== [dbt templater] Compiling dbt project...
=== [dbt templater] Project Compiled.
All Finished !                                                                                  
(analysis_sample) bash-3.2$ 

(analysis_sample) bash-3.2$ sqlfluff fix models/orders_payments.sql --config .sqlfluff
==== finding fixable violations ====
=== [dbt templater] Sorting Nodes...
=== [dbt templater] Compiling dbt project...
=== [dbt templater] Project Compiled.
==== no fixable linting violations found ====                                                        
All Finished !
(analysis_sample) bash-3.2$ 

No particular problem with pass when pre-commit run is executed.

(analysis_sample) bash-3.2$ git add models/orders_payments.sql 
(analysis_sample) bash-3.2$ pre-commit run sqlfluff-lint
sqlfluff-lint............................................................Passed
(analysis_sample) bash-3.2$ pre-commit run sqlfluff-fix
sqlfluff-fix.............................................................Passed
(analysis_sample) bash-3.2$ 

The "trailing whitespace." error that was occurring on line 17 was not picked up and an error was returned.

(analysis_sample) bash-3.2$ git commit -m 'test'
dbt_sample/models/orders_payments.sql:17: trailing whitespace.
+WHERE 
(analysis_sample) bash-3.2$ 

Dialect

BigQuery

Version

dbt: 1.6.0 sqlfluff-templater-dbt: 2.3.0 sqlfluff: 2.3.0 python: 3.10

root@ea02fd7b61c5:/app/dbt_sample# dbt --version
Core:
  - installed: 1.6.0
  - latest:    1.6.1 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - duckdb:   1.6.0 - Up to date!
  - bigquery: 1.6.0 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

root@ea02fd7b61c5:/app/dbt_sample# sqlfluff --version
sqlfluff, version 2.3.0
root@ea02fd7b61c5:/app/dbt_sample# sqlfluff-templater-dbt --version
bash: sqlfluff-templater-dbt: command not found
root@ea02fd7b61c5:/app/dbt_sample# cat ../                         
Pipfile       Pipfile.lock  dbt_sample/   
root@ea02fd7b61c5:/app/dbt_sample# cat ../Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

# パッケージ追加 or バージョン変更時は、pipenv updateを実行ください
[packages]
dbt-osmosis = "==0.11.9"
dbt-bigquery = "==1.6.0"
duckdb = "==0.8.1"
dbt-duckdb = "==1.6.0"
colorama = "==0.4.6"
sqlfluff = "==2.3.0"
sqlfluff-templater-dbt = "==2.3.0"
pre-commit = "==3.3.3"

[dev-packages]

[requires]
python_version = "==3.10"
root@ea02fd7b61c5:/app/dbt_sample# 

Configuration

.sqlfluff

[sqlfluff]
templater = dbt
max_line_length = 100
rule = core
dialect = bigquery
; rules = LT01

[sqlfluff:templater:dbt]
project_dir = .
profiles_dir = .
profile = dbt_sample
target = dev

[sqlfluff:indentation]
# dbt Style Guide
tab_space_size = 4
indent_unit = space
indented_using_on = True

[sqlfluff:layout:type:comma]
line_position = trailing

[sqlfluff:rules:capitalisation.keywords]
# GitLab SQL Style Guide
group = core
capitalisation_policy = upper

[sqlfluff:rules:capitalisation.identifiers]
# dbt Style Guide
group = core
capitalisation_policy = lower
unquoted_identifiers_policy = column_aliases

[sqlfluff:rules:layout.long_lines]
group = core
ignore_comment_lines = True

[sqlfluff:rules:layout.cte_newline]
# dbt Style Guide
group = core
comma_style = trailing

[sqlfluff:rules:capitalisation.functions]
# GitLab SQL Style Guide
group = core
capitalisation_policy = upper

[sqlfluff:rules:capitalisation.literals]
# GitLab SQL Style Guide
group = core
capitalisation_policy = upper

[sqlfluff:rules:ambiguous.column_references]
# dbt Style Guide
group = core
group_by_and_order_by_style = implicit

[sqlfluff:rules:references.special_chars]
# GitLab SQL Style Guide
group = core
allow_space_in_identifier = True
additional_allowed_characters = ['.','(',')','-']

.sqlfluffignore

dbt_packages/
macros/

Are you willing to work on and submit a PR to address the issue?

Code of Conduct

kimfrie commented 1 year ago

I also have the same issue, When running sqlfluff lint from CLI, I get all the warnings. However during pre-commit, the hook passes.

toshifumisuga commented 6 months ago

At that time, I didn't perform the validation specifying .sqlfluff in the argument. However, when I tried it in a different environment, I received an error saying the path could not be found. If I change it to an absolute path, then I get a different error...

Reference description method

repos:
- repo: https://github.com/sqlfluff/sqlfluff
  rev: 3.0.2
  hooks:
    - id: sqlfluff-lint
      additional_dependencies : ['dbt-bigquery==1.7.3', 'sqlfluff-templater-dbt==3.0.2']
      log_file : 'true'
      args: [--config, ".sqlfluff"] 
root@6c325701b789:~/workspace/jobs/dbt_project# pre-commit run --files models/staging/dwh/XXXX.sql 
sqlfluff-lint............................................................Failed
- hook id: sqlfluff-lint
- exit code: 2

Error loading config: Extra config '.sqlfluff' does not exist.

root@6c325701b789:~/workspace/jobs/dbt_project# 
toshifumisuga commented 6 months ago

Additional verification details I was able to solve the problem by using git diff on GitHubActions side, so I implemented it there and achieved something close to what I wanted to do. I'll leave this as a note in case anyone else comes across this page.

Make minor modifications depending on your execution environment and repository configuration.

    - name: Execute sqlfluff
      shell: bash
      working-directory: ${{ inputs.POETRY_LOCK_DIR }}/dbt_project
      run: |
        git fetch origin develop
        MODIFIED_SQL_FILES=$(git diff origin/develop --name-only --diff-filter=ACMRT -- '*.sql' | sed 's|^jobs/dbt_project/||')
        if [[ -z $MODIFIED_SQL_FILES ]]; then
          echo "No changes in the files. Exiting with success."
          exit 0
        else
          echo "Linting files"
          poetry run sqlfluff lint $MODIFIED_SQL_FILES
        fi