anboralabs / sqlfluff-intellij-community

Sqlfluff plugin integration with intellij
GNU General Public License v3.0
0 stars 1 forks source link

Error when running with dbt #32

Closed JoakimNil closed 5 months ago

JoakimNil commented 1 year ago

When using the context action "sqlfluff fix" on a file in a dbt project, I get the following error:

Executing 'sqlfluff fix'
C:\Users\joakim\misc_programming\dbt_venv\Scripts\python.exe C:\Users\joakim\misc_programming\dbt_venv\Scripts\sqlfluff.exe fix C:\Users\joakim\misc_programming\dbt_recharge\models\utilities\time_dim.sql --force --templater dbt
==== finding fixable violations ====
FORCE MODE: Attempting fixes...
=== [dbt templater] Sorting Nodes...

User Error: DbtProjectError: Runtime Error
  No dbt_project.yml found at expected path C:\Users\joakim\AppData\Local\JetBrains\Toolbox\apps\PyCharm-P\ch-0\232.8660.197\jbr\bin\dbt_project.yml
  Verify that each entry within packages.yml (and their transitive dependencies) contains a file named dbt_project.yml

This happens both when running with sqlfluff global and with sqlfluff manual. In both cases i use the arguments --templater dbt in the settings. Without it, a different error will be output.

To me it looks like this issue might stem from sqlfluff being ran from the PyCharm install location as working dir, rather than the project dir.

dalgarins commented 1 year ago

@JoakimNil could you share a dbt file example?

JoakimNil commented 1 year ago

The error occurs regardless of file contents, but here are some sample content in time_dim.sql which reproduces the error:

select row_number() over (order by temp_sequence) as time_of_day
from table(generator(rowcount => 1440))

By the way, I'm using Sqlfluff Linter (Ultimate Edition) v2023-4-7 with PyCharm (Professional Edition) v2023.2.

I should add that when I don't use any additional arguments in sqlfluff settings, I get the following error both for the "use sqlfluff global" and "use sqlfluff manual" settings:

Executing 'sqlfluff fix'
sqlfluff fix C:\Users\joakim\misc_programming\dbt_recharge\models\utilities\time_dim.sql --force --dialect ansi
==== finding fixable violations ====
FORCE MODE: Attempting fixes...
WARNING    Attempt to set templater to dbt failed. Using jinja templater. Templater cannot be set in a .sqlfluff file in a subdirectory of the current working directory. It can be set in a .sqlfluff in the current working directory. See Nesting section of the docs for more details. 
==== no fixable linting violations found ====
All Finished!

It looks like this error is referring to the .sqlfluff file I have in the project dir, that contains the following initial config:

[sqlfluff]
dialect = snowflake
templater = dbt
runaway_limit = 10
max_line_length = 200
indent_unit = space
sql_file_exts = .sql
JoakimNil commented 1 year ago

I made it work in this fork: https://github.com/JoakimNil/sqlfluff-jetbrains-dbt. I'm not familiar enough with this project or jetbrains plugins in general to know whether my changes would create other issues, but if you would like to add dbt support to this plugin, it might be worth checking out.

dalgarins commented 1 year ago

@JoakimNil thank you, I will checkout your changes.

dalgarins commented 1 year ago

@JoakimNil the new release 2023.5.0 for ultimate or 1.4.1 for community versions, will fix this issue, please let me know if this release fix your issue.

JoakimNil commented 1 year ago

image

I got this error. I had the same issue when modifying your plugin. To resolve it, I changed CustomLinter.kt and GlobalLinter.kt to use the original file rather than a temp file. However, this was one of those things I don't really know the consequences of changing.

dalgarins commented 1 year ago

When you modified a file in intellij, the real file is not modified the changed is save on memory, the line of code of copy the file to a temp directory is to detect the real time change and apply sqlfluff over the new file, even with the last update I injected some issues, I decided to use your solution, the new version has the changes, version: 1.4.2

JoakimNil commented 1 year ago

Thank you for adding this.

I noticed what you mention too, and I think the linting in 1.4.2 (and same in my previous fork) is too slow to be very useful for real-time parsing. I updated my fork with a new commit that keeps using temp files in CustomLinter only, and places these temp files in the same folder as the file being parsed. It seems to work much better. It is a drawback that it has to create files in the project folder, but these are removed very fast, so they aren't really noticable.

dalgarins commented 1 year ago

new version 1.4.3 was released.

dalgarins commented 1 year ago

@JoakimNil everything is working?

JoakimNil commented 1 year ago

so far so good - thank you for following up on this. It was interesting to see how you improved on my hacks with a much better implementation.

I was planning on using for a little while before posting here about the results, so that I could provide a better response on how it is working. But I guess I'll just share what I've been seeing and thinking as of now, and then I can report back if I have any issues. So far I have seen one case of the extension creating many tmp files that stick around for longer than they should. So a few things I've been thinking about are:

As you can see, these are mostly minor points that aren't crucial for the extension to work. So I'm very happy with the extension now as-is :) The issue can be closed unless you'd like to follow up further on some of what I've mentioned.

JoakimNil commented 1 year ago

I don't know what changed, but now I'm only able to get the fix command working - no errors are showing in the file itself. When I run Code -> Inspect Code -> A given file with the default inspection profile, which includes the sqlfluff inspection, I also get no errors. This is true both for your published sqlfluff plugin and my fork, even after multiple re-installs. I can see logs when using my fork version, and it shows that sqlfluff is executed and outputs violations, but they don't seem to show up in the file.

I don't know if it's related, but I see this error in the IDE logs:

2023-09-04 11:56:27,810 [  29442] SEVERE - #c.i.c.d.i.ExternalToolPass - IdeaLoggingEvent[message=ExternalToolPass: , throwable=com.intellij.diagnostic.PluginException: annotator: co.anbora.labs.sqlfluff.ultimate.ide.annotator.LinterExternalAnnotator@562ba376 (class co.anbora.labs.sqlfluff.ultimate.ide.annotator.LinterExternalAnnotator) [Plugin: co.anbora.labs.sqlfluff.ultimate]
    at com.intellij.diagnostic.PluginProblemReporterImpl.createPluginExceptionByClass(PluginProblemReporterImpl.java:23)
    at com.intellij.diagnostic.PluginException.createByClass(PluginException.java:89)
    at com.intellij.codeInsight.daemon.impl.ExternalToolPass.processError(ExternalToolPass.java:251)
    at com.intellij.codeInsight.daemon.impl.ExternalToolPass.doAnnotate(ExternalToolPass.java:210)
    at com.intellij.codeInsight.daemon.impl.ExternalToolPass.doAnnotate(ExternalToolPass.java:201)
    at com.intellij.codeInsight.daemon.impl.ExternalToolPass$1.lambda$run$0(ExternalToolPass.java:164)
    at com.intellij.codeInsight.daemon.impl.ExternalToolPass.runChangeAware(ExternalToolPass.java:266)
    at com.intellij.codeInsight.daemon.impl.ExternalToolPass$1.lambda$run$2(ExternalToolPass.java:164)
    at com.intellij.openapi.progress.impl.CoreProgressManager.lambda$runProcess$2(CoreProgressManager.java:186)
    at com.intellij.openapi.progress.impl.CoreProgressManager.lambda$executeProcessUnderProgress$13(CoreProgressManager.java:604)
    at com.intellij.openapi.progress.impl.CoreProgressManager.registerIndicatorAndRun(CoreProgressManager.java:679)
    at com.intellij.openapi.progress.impl.CoreProgressManager.computeUnderProgress(CoreProgressManager.java:635)
    at com.intellij.openapi.progress.impl.CoreProgressManager.executeProcessUnderProgress(CoreProgressManager.java:603)
    at com.intellij.openapi.progress.impl.ProgressManagerImpl.executeProcessUnderProgress(ProgressManagerImpl.java:61)
    at com.intellij.openapi.progress.impl.CoreProgressManager.runProcess(CoreProgressManager.java:173)
    at com.intellij.openapi.progress.util.BackgroundTaskUtil.runUnderDisposeAwareIndicator(BackgroundTaskUtil.java:360)
    at com.intellij.codeInsight.daemon.impl.ExternalToolPass$1.run(ExternalToolPass.java:162)
    at com.intellij.util.ui.update.Update.runUpdate(Update.java:114)
    at com.intellij.util.ui.update.MergingUpdateQueue.execute(MergingUpdateQueue.java:348)
    at com.intellij.util.ui.update.MergingUpdateQueue.execute(MergingUpdateQueue.java:338)
    at com.intellij.util.ui.update.MergingUpdateQueue.doFlush(MergingUpdateQueue.java:295)
    at com.intellij.util.ui.update.MergingUpdateQueue.flush(MergingUpdateQueue.java:277)
    at com.intellij.util.ui.update.MergingUpdateQueue.run(MergingUpdateQueue.java:246)
    at com.intellij.util.concurrency.QueueProcessor.runSafely(QueueProcessor.java:254)
    at com.intellij.util.Alarm$Request.runSafely(Alarm.java:373)
    at com.intellij.util.Alarm$Request.run(Alarm.java:360)
    at com.intellij.util.concurrency.Propagation.contextAwareCallable$lambda$2(propagation.kt:328)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at com.intellij.util.concurrency.SchedulingWrapper$MyScheduledFutureTask.run(SchedulingWrapper.java:272)
    at com.intellij.util.concurrency.BoundedTaskExecutor.doRun(BoundedTaskExecutor.java:249)
    at com.intellij.util.concurrency.BoundedTaskExecutor.access$200(BoundedTaskExecutor.java:31)
    at com.intellij.util.concurrency.BoundedTaskExecutor$1.executeFirstTaskAndHelpQueue(BoundedTaskExecutor.java:227)
    at com.intellij.util.ConcurrencyUtil.runUnderThreadName(ConcurrencyUtil.java:218)
    at com.intellij.util.concurrency.BoundedTaskExecutor$1.run(BoundedTaskExecutor.java:215)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:702)
    at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:699)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:399)
    at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(Executors.java:699)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'None': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: (String)"None"; line: 1, column: 5]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2477)
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:760)
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:3041)
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:3019)
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._matchToken2(ReaderBasedJsonParser.java:2809)
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._matchToken(ReaderBasedJsonParser.java:2787)
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:2059)
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:808)
    at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:4912)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4818)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3772)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3755)
    at co.anbora.labs.sqlfluff.ultimate.lint.issue.IssueMapper.apply(IssueMapper.kt:12)
    at co.anbora.labs.sqlfluff.ultimate.lint.Linter$errors$1.invoke(Linter.kt:136)
    at co.anbora.labs.sqlfluff.ultimate.lint.Linter$errors$1.invoke(Linter.kt:135)
    at kotlin.sequences.FlatteningSequence$iterator$1.ensureItemIterator(Sequences.kt:315)
    at kotlin.sequences.FlatteningSequence$iterator$1.hasNext(Sequences.kt:303)
    at kotlin.sequences.TransformingSequence$iterator$1.hasNext(Sequences.kt:214)
    at kotlin.sequences.FilteringSequence$iterator$1.calcNext(Sequences.kt:169)
    at kotlin.sequences.FilteringSequence$iterator$1.hasNext(Sequences.kt:194)
    at kotlin.sequences.FlatteningSequence$iterator$1.ensureItemIterator(Sequences.kt:311)
    at kotlin.sequences.FlatteningSequence$iterator$1.hasNext(Sequences.kt:303)
    at kotlin.sequences.TransformingSequence$iterator$1.hasNext(Sequences.kt:214)
    at kotlin.sequences.FilteringSequence$iterator$1.calcNext(Sequences.kt:169)
    at kotlin.sequences.FilteringSequence$iterator$1.hasNext(Sequences.kt:194)
    at kotlin.sequences.SequencesKt___SequencesKt.toCollection(_Sequences.kt:787)
    at kotlin.sequences.SequencesKt___SequencesKt.toMutableList(_Sequences.kt:817)
    at kotlin.sequences.SequencesKt___SequencesKt.toList(_Sequences.kt:808)
    at co.anbora.labs.sqlfluff.ultimate.lint.Linter.errors(Linter.kt:147)
    at co.anbora.labs.sqlfluff.ultimate.lint.Linter.runLinter(Linter.kt:125)
    at co.anbora.labs.sqlfluff.ultimate.lint.Linter.lint(Linter.kt:87)
    at co.anbora.labs.sqlfluff.ultimate.lint.LinterConfig$CUSTOM.lint(LinterConfig.kt:29)
    at co.anbora.labs.sqlfluff.ultimate.ide.annotator.LinterExternalAnnotator.doAnnotate(LinterExternalAnnotator.kt:65)
    at co.anbora.labs.sqlfluff.ultimate.ide.annotator.LinterExternalAnnotator.doAnnotate(LinterExternalAnnotator.kt:19)
    at com.intellij.codeInsight.daemon.impl.ExternalToolPass.doAnnotate(ExternalToolPass.java:207)
    ... 37 more
]

Also, I see that the plugin is creating many tmp files that exists at the same time, which suggests that it either isn't deleting them after linting them, or that the linting is restarted again and again before it gets the chance to complete and remove the files. The files are not removed even after several minutes.

dalgarins commented 1 year ago

@JoakimNil share with me the sql file and your config, to validate what is happening.

JoakimNil commented 1 year ago

sure, here's the sql file:

with

source as (

  select * from {{ source('PS', 'F_SN') }}

)

-- C_IMPORT contains less useful data that we need. However, we use it here because
-- F_SN contains invalid transactions, which are excluded from C_IMPORT,
-- which allows us to filter out the invalid transactions by session_id.
, cdr_import as (

  select * from {{ source('PS', 'C_IMPORT') }}

)

, selection as (

  select
    {{ source_and_id_based_key('PS', 'session_id') }} as cdr_pk
    , session_id as cdr_id
    , chargepoint_id as charger_id
    , {{ source_and_id_based_key('PS', 'connector_id') }} as connector_fk
    , connector_id
    , {{ source_and_id_based_key('PS', 'location_id') }} as location_fk
    , location_id , duration_seconds as session_duration_sec
    , {{ convert_seconds_to_time_type('duration_seconds') }} as session_duration_time
    , energy_value / 1000 as power_consumption_kwh
    , final_price_currency as currency_fk
    , final_price_incl_vat - final_price_vat_amount as net_revenue
    , final_price_vat_amount as tax
    , started_at as session_start_dt
    , to_char(session_start_dt, 'yyyymmdd') as start_dt_fk
    , stop_request_at as stop_received_dt
    , stopped_at as session_stop_dt
    , to_char(session_stop_dt, 'yyyymmdd') as stop_dt_fk
    , 'PS' as source_system
  from source
  where session_id in (select session_id from cdr_import)

)

select * from selection

.sqlfluff:

[sqlfluff]
dialect = snowflake
templater = dbt
runaway_limit = 10
max_line_length = 200
indent_unit = space
sql_file_exts = .sql

# ST06 (structure.column_order) is disabled because it's a controversial rule
# that will flag things such as coalesce(dz_id, ps_id, sf_id) as id as an error
# if it is included as the first column.
exclude_rules = structure.column_order

[sqlfluff:indentation]
tab_space_size = 2

[sqlfluff:layout:type:comma]
# This project follows the leading comma convention.
spacing_before = touch
line_position = leading

[sqlfluff:rules:capitalisation.keywords]
capitalisation_policy = lower

[sqlfluff:rules:aliasing.table]
aliasing = explicit

[sqlfluff:rules:aliasing.column]
aliasing = explicit

[sqlfluff:rules:aliasing.expression]
allow_scalar = False

[sqlfluff:rules:capitalisation.identifiers]
extended_capitalisation_policy = lower

[sqlfluff:rules:capitalisation.functions]
extended_capitalisation_policy = lower

[sqlfluff:rules:references.special_chars]
# The special char rules also runs on sources, therefore certain refrences have to be ignored.
ignore_words_regex = s.hk.hybridin luokka
ignore_words = DATA VALUE

In plugin settings, I use sqlfluff manual, pointing to a python venv with dbt-snowflake v1.6.0 and sqlfluff-templater-dbt v2.3.0 installed. Arguments for both lint and fix: --templater dbt --dialect snowflake.

I just realized that linting does work in other files, such as this one:

with country as (

  select * from {{ ref('country_codes') }}

)select * from country

however, if I switch to the first one and get the exception, then linting stops working even for the second file.

dalgarins commented 1 year ago

@JoakimNil this is what I found for your config file and your scripts:

Error Model 'model.test_dbt_project.joakimnil' (models/example/joakimnil.sql) depends on a source named 'PS.F_SN' which was not found

I have added notifications for those errors in the next version.

Also next version will hide all temp sqlfluff files and IDE won't show you those files.

dalgarins commented 1 year ago

@JoakimNil also it's weird all the logs that you attached, I will recommend you to uninstall both plugins and try to install from marketplace, I don't know if something between those two is causing this crash.

dalgarins commented 1 year ago

@JoakimNil was the issue fixed to you?

JoakimNil commented 1 year ago

The depends on a source named 'PS.F_SN' which was not found error you are getting is because you're missing the source in the dbt project, which isn't the case for me. I created a minimal reproducible example here which you can use to reproduce the issue. Just clone the repo, install requirements.txt, and open and edit the joakimnil.sql file, and you will see it.

I'm still seeing the issue, even though the ultimate edition from marketplace is the only version I've installed.

JoakimNil commented 1 year ago

I just updated to sqlfluff 2023.6.0, and now the temp files are not shown in the IDE. Very nice :) However, I see from file explorer that the issue with the mentioned sql file is still present: the temp files are still generated, and linting fails.

I also ran into another issue: I get this error message when opening and using the database console in pycharm: image

There are two issues here: first, this notification is quite annoying, as it's repeated very often. Second, sqlfluff fails to lint the database console, as it can't work with the dbt templater since the console file isn't part of the dbt project.

The second issue doesn't really apply to me, since I'd never want to lint my database console. But I guess someone else might want that. To get that to work, the plugin would somehow need to override (ignore) the user's "--templater dbt" argument in sqlfluff settings for the console file only (or even better: for all non-project files).

JoakimNil commented 11 months ago

@dalgarins I wanted to give you an update. I tried the current version for a while, but eventually concluded that it doesn't work well enough with temp files, and I don't think it ever will. I therefore created some changes which I think is a big improvement: https://github.com/JoakimNil/sqlfluff-fork-2/tree/notempfile

To address this, I created a version that uses a different approach: it saves the latest changes to the original file before running the sqlfluff command, and runs the sqlfluff command on it rather than a temp file. This approach of course solves all the problems related to the use of temp files. However, it does come with some drawbacks:

First, the error highlighting in linted files disappears too quickly: it appears that saving the file causes it to disappear. This creates a workflow where I:

  1. Wait for the error markings
  2. Look over the file to try and remember them all
  3. Start fixing them, as they all disappear
  4. Repeat from step 1 until all errors are fixed

Perhaps this could be improved somewhat by the plugin, if it applies some clever rules on when and how often to resave and rerun the linting.

The second problem is that global code inspection doesn't seem to work. One example of this is that when I commit code, the IDE seems to run linting on all modified files, as it should, but even though sqlfluff linting takes some time to complete, the IDE never seem to actually report any errors found by sqlfluff.

dalgarins commented 11 months ago

@JoakimNil thanks for your code, but I think a better fix for people could be only for dbt execute the linter when the file is saved to avoid temp files creation.

JoakimNil commented 11 months ago

@dalgarins I'm not sure I understand you correctly, could you elaborate?

dalgarins commented 11 months ago

@JoakimNil sure, currently the linter is being execute when user modified the file but that modification is in memory the reason for temp files is that, the changes are in memory I need the changes in a file to execute the linter, my solution was create temp files, a solution for me is add a settings to decide if execute the linter when the file is modified or when the file is saved, with that option people who don't want temp files in their project won't those files and if someone prefer the "realtime behavior" they will have the temp files.

let me know if it is clear.

JoakimNil commented 10 months ago

I see. I would have to try it to be sure, but I think your solution of running only on save would indeed be better than my latest solution for dbt. Is this something you would be willing to add to the plugin?

dalgarins commented 10 months ago

@JoakimNil yes, I'm going to add that option to avoid many issues related with those temp files.

dalgarins commented 5 months ago

@JoakimNil please update to latest plugin version, let me know if your issue is fixed

JoakimNil commented 5 months ago

based on trying it for a few minutes it seems to work perfectly 🥳 thank you very much for your effort on this, it really makes my work easier!

If you don't mind, I suggest that I test it for another week or two just to make sure I don't run into any issues.

dalgarins commented 5 months ago

Ok, please let me know any issues you find

JoakimNil commented 5 months ago

The extension now works great!

There's potential for a few quality of life improvements:

dalgarins commented 5 months ago

@JoakimNil Great ideas, do you think we could close this issue? and create a new one for these ideas.

JoakimNil commented 5 months ago

definitely :) Should I or you create the new issue(s)?

dalgarins commented 5 months ago

Please, create the new issue @JoakimNil and thanks a lot.

JoakimNil commented 5 months ago

New issues created. #60 and #61. Thank you so much for your help with this. I bought you a few coffees for your efforts.