sqlfluff / vscode-sqlfluff

An extension to use the sqlfluff linter in vscode.
https://marketplace.visualstudio.com/items?itemName=dorzey.vscode-sqlfluff
MIT License
154 stars 33 forks source link

Use extension in multiple dbt projects inside a single devcontainer #83

Open ggmblr opened 1 year ago

ggmblr commented 1 year ago

Hi, I'm trying to set up a devcontainer in vscode to allow for a reproducible dev environment for our data team. Since we work on more than a single project, the root directory (or workspace) i'm opening contains a number of projects, some of which are dbt projects

See an example of our workspace:

.
|____.devcontainer
|____.vscode
|____airflow-project
|____airflow-project-2
|____infrastructure-project
|____dbt-image
|____dbt-snowflake-1
|____dbt-snowflake-2
|____dbt-snowflake-3
|____dbt-snowflake-4
|____devcontainers
|____python-project

In each of the dbt-snowflake projects we have a .sqlfluff config file that looks more or less like this:

[sqlfluff]
dialect = snowflake
templater = dbt
exclude_rules = L001, L034

[sqlfluff:templater:dbt]
project_dir = ./

[sqlfluff:rules]
max_line_length = 999
tab_space_size = 4
indent_unit = space

[sqlfluff:rules:L010]
# Keywords
capitalisation_policy = lower

[sqlfluff:rules:L030]
# Function names
capitalisation_policy = lower

[sqlfluff:rules:L040]
# Null & Boolean Literals
capitalisation_policy = lower

[sqlfluff:rules:L057]
# Special Characters in Identifiers
allow_space_in_identifier = True
additional_allowed_characters = ['.','(',')','-']

This ensures that in a vacuum sqlfluff works inside each project, but we only get the vscode extension to work on a single project, which is the one for which we configure the "sqlfluff.config"

In the devcontainer.json we have the following config for sqlfluff (as we see here we configure the config path to be the one inside project 1):

        "sqlfluff.config": "${workspaceFolder}/dbt-snowflake-1/.sqlfluff",
        "sqlfluff.dialect": "snowflake",
        "sqlfluff.executablePath": "/usr/local/bin/sqlfluff",
        "sqlfluff.format.enabled": true,
        "sqlfluff.format.workingDirectory": "",
        "sqlfluff.ignoreLocalConfig": true,
        "sqlfluff.ignoreParsing": false,
        "sqlfluff.linter.run": "onSave",
        "sqlfluff.experimental.format.executeInTerminal": true,
        "editor.formatOnSave": false,
        "[sql]": {
            "editor.defaultFormatter": "dorzey.vscode-sqlfluff"
        }

Now i have a couple of questions:

Is it possible for us to get the extension to work on all projects inside a single devcontainer? Meaning, we want to be able to work on a .sql file inside whichever dbt project and get it linted on save.

On the other hand what is the expected performance for the linting? I feel like it consumes a lot of resources and takes a considerable amount of time, probably because or projects are large and the compilation takes a lot of time. Do subsequent lintings use partial parsing?

Feel free to request more details or give us tips on best practices or how to properly configure such a dev environment/container.

Many thanks.

RobertOstermann commented 1 year ago

I think it would be possible to get the extension to work on all projects, but I have not tested this before so I am not sure. Your best bet would be to remove the sqlfluff.config option as you are needing multiple different configs and I don't think there is an easy way for you to look into each folder for the configuration file. I would suggest setting "sqlfluff.workingDirectory" = "${fileDirname}". This sets the directory the sqlfluff command is run from and sqlfluff should recognize those configuration files automatically.

The linting performance is not too good, especially when linting with the dbt templater. I have not done any work in the sqlfluff repository, I just work on this extension. You would have to look at that repository for any performance concerns. This extension basically just runs the sqlfluff command and shows that output in VSCode.

Let me know if you are able to get your project working with my suggestions.

ggmblr commented 1 year ago

Hi @RobertOstermann , i tried your config proposal:


        //"sqlfluff.config": "${workspaceFolder}/dbt-snowflake/.sqlfluff",
        "sqlfluff.workingDirectory": "${fileDirname}",
        "sqlfluff.dialect": "snowflake",
        "sqlfluff.executablePath": "/usr/local/bin/sqlfluff",
        "sqlfluff.format.enabled": true,
        "sqlfluff.format.workingDirectory": "",
        "sqlfluff.ignoreLocalConfig": true,
        "sqlfluff.ignoreParsing": false,
        "sqlfluff.linter.run": "onSave",
        "sqlfluff.experimental.format.executeInTerminal": true,
        "editor.formatOnSave": false,
        "[sql]": {
            "editor.defaultFormatter": "dorzey.vscode-sqlfluff"

But it failed with the following error (taken out of the output terminal for sqlfluff):

------------------------------------------------------------

Reading from file, not stdin

--------------------Executing Command--------------------

/usr/local/bin/sqlfluff lint --format json --dialect snowflake --ignore-local-config ../../../../../../dbt/dbt-snowflake/models/marts/analytics/dim_class.sql

------------------------------------------------------------

Child process threw error
Error: spawn /usr/local/bin/sqlfluff ENOENT
Linting command failed to execute

Any idea how to solve this?

RobertOstermann commented 1 year ago

@ggmblr I think this error occurs whenever the program is unable to find the sqlfluff executable. Though I also would not have expected that file path to be going up directories to find the file.

You should try running the /usr/local/bin/sqlfluff lint --format json --dialect snowflake --ignore-local-config ../../../../../../dbt/dbt-snowflake/models/marts/analytics/dim_class.sql command in the directory that dim_class.sql is located. Essentially this extension just runs that commands in that directory and parses the output to determine where the errors need to be shown in VSCode.

It is possible that you just need to update your sqlfluff.executablePath to match the command you would call to run sqlfluff in the terminal. But if that command works in the terminal then that is not the case.

Did you have this extension working before updating the sqlfluff.workingDirectory?

ggmblr commented 1 year ago

@RobertOstermann I tried running the command in various directories manually and it didn't work. The error message was always

[{"filepath": "../../../../../../dbt/dbt-snowflake/models/marts/analytics/dim_class.sql", "violations": [{"line_no": 1, "line_pos": 1, "code": "TMP", "description": "Unrecoverable failure in Jinja templating: <sqlfluff.core.templaters.jinja.JinjaTemplater.process.<locals>.UndefinedRecorder object at 0x7ffac1ab1c10> is not safely callable. Have you configured your variables? https://docs.sqlfluff.com/en/latest/configuration.html"}]}]

So i tried removing the sqlfluff.workingDirectory . The executed command and error on save is the following:

--------------------Executing Command--------------------

/usr/local/bin/sqlfluff lint --format json --dialect snowflake --ignore-local-config dbt-snowflake/models/marts/analytics/dim_class.sql

------------------------------------------------------------

Received close event, code 1 signal null
Raw stdout output:

------------------------------------------------------------

[{"filepath": "dbt-snowflake/models/marts/analytics/dim_class.sql", "violations": [{"line_no": 1, "line_pos": 1, "code": "TMP", "description": "Unrecoverable failure in Jinja templating: <sqlfluff.core.templaters.jinja.JinjaTemplater.process.<locals>.UndefinedRecorder object at 0x7f7ecd9674c0> is not safely callable. Have you configured your variables? https://docs.sqlfluff.com/en/latest/configuration.html"}]}]

------------------------------------------------------------

Running the execution command inside the terminal either yields the same error message when run in a parent directory, or in this error, when running it in the same directory.

Screenshot 2023-01-24 at 09 27 01

With my previous config (which didn't include the sqlfluff.workingDirectory, but included "sqlfluff.config": "${workspaceFolder}/dbt-snowflake/.sqlfluff" was working fine at least for that specific directory. See here:

--------------------Executing Command--------------------

/usr/local/bin/sqlfluff lint --format json --config /dbt/dbt-snowflake/.sqlfluff --dialect snowflake --ignore-local-config dbt-snowflake/models/marts/analytics/dim_class.sql

------------------------------------------------------------

Received close event, code 1 signal null
Raw stdout output:

------------------------------------------------------------

[{"filepath": "dbt-snowflake/models/marts/analytics/dim_class.sql", "violations": [{"line_no": 27, "line_pos": 1, "code": "L010", "description": "Keywords must be lower case."}]}]

------------------------------------------------------------

My next test is to combine both configs, have sqlfluff.workingDirectory and "sqlfluff.config": "${workspaceFolder}/dbt-snowflake/.sqlfluff". Here it fails telling me in a pop up error

The sqlfluff executable was not found. Use the 'Executable Path' setting to configure the location of the executable, or add it to your PATH.

While the output is the following:

--------------------Executing Command--------------------

/usr/local/bin/sqlfluff lint --format json --config /dbt/dbt-snowflake/.sqlfluff --dialect snowflake --ignore-local-config ../../../../../../dbt/dbt-snowflake/models/marts/analytics/dim_class.sql

------------------------------------------------------------

Child process threw error
Error: spawn /usr/local/bin/sqlfluff ENOENT
Linting command failed to execute

So what you mention about sqlfluff.executablePath seems to be a good starting point. Which path would i need to provide? /usr/local/bin/sqlfluff?

Thanks again for your help.

RobertOstermann commented 1 year ago

@ggmblr Looking at it again I don't think it is an issue with your executable path. I think it was an issue with the variables not working correctly in the workingDirectory setting. Could you upgrade to v2.2.3 and try these settings. Let me know if this works for you.

//"sqlfluff.config": "${workspaceFolder}/dbt-snowflake/.sqlfluff",
"sqlfluff.workingDirectory": "${fileDirname}",
"sqlfluff.dialect": "snowflake",
"sqlfluff.executablePath": "/usr/local/bin/sqlfluff",
"sqlfluff.format.enabled": true,
// I think you need this to be false to pick up your configuration files.
"sqlfluff.ignoreLocalConfig": false,
"sqlfluff.ignoreParsing": false,
"sqlfluff.linter.run": "onSave",
"sqlfluff.experimental.format.executeInTerminal": true,
"editor.formatOnSave": false,
"[sql]": {
    "editor.defaultFormatter": "dorzey.vscode-sqlfluff"
}
ggmblr commented 1 year ago

@RobertOstermann Using the suggested config with v2.2.3 results in the following output:

------------------------------------------------------------

Reading from file, not stdin

--------------------Executing Command--------------------

/usr/local/bin/sqlfluff lint --format json --dialect snowflake dim_class.sql

------------------------------------------------------------

Received close event, code 1 signal null
Raw stdout output:

------------------------------------------------------------

[{"filepath": "dim_class.sql", "violations": [{"line_no": 1, "line_pos": 1, "code": "TMP", "description": "Unrecoverable failure in Jinja templating: <sqlfluff.core.templaters.jinja.JinjaTemplater.process.<locals>.UndefinedRecorder object at 0x7fceee89e9a0> is not safely callable. Have you configured your variables? https://docs.sqlfluff.com/en/latest/configuration.html"}]}]

------------------------------------------------------------

Which suggests that it is not able to read my config for sqlfluff, which is set with

templater = dbt

RobertOstermann commented 1 year ago

Ok, I am about out of ideas. One last thing you could try would be removing the sqlfluff.workingDirectory configuration from the last setup I gave you and hope that SQLFluff can find the correct configuration files on its own. You can look at this page to see how SQLFluff finds configuration files - Configuration.

"sqlfluff.dialect": "snowflake",
"sqlfluff.executablePath": "/usr/local/bin/sqlfluff",
"sqlfluff.format.enabled": true,
"sqlfluff.ignoreLocalConfig": false,
"sqlfluff.ignoreParsing": false,
"sqlfluff.linter.run": "onSave",
"sqlfluff.experimental.format.executeInTerminal": true,
"editor.formatOnSave": false,
"[sql]": {
    "editor.defaultFormatter": "dorzey.vscode-sqlfluff"
}

If that doesn't work you would have to find some consistent way of calling SQLFluff from the terminal that works for your different cases. This extension is fairly configurable so if you are able to find something that works I don't think it would be too difficult to set up with the available settings.