RobertOstermann / vscode-sqlfluff-extended

An extension to use the sqlfluff linter in vscode.
https://marketplace.visualstudio.com/items?itemName=dorzey.vscode-sqlfluff
MIT License
6 stars 1 forks source link

mojibake (text garbling) may occur when execute fix #3

Open yamap55 opened 2 years ago

yamap55 commented 2 years ago

mojibake occurs when fixing the following file. "sqlfluff" directly does not cause this, so I believe it has something to do with the extension.

sqlfluff command

sqlfluff fix --dialect bigquery --config .sqlfluff work/a.sql

This command does not cause mojibake. ("force" would have done the same.)

target sql file

-- 割引金額
select a from b;

after fixed

-- 割引��額
SELECT a FROM b;

.sqlfluff

[sqlfluff]
exclude_rules = L032, L016

[sqlfluff:rules:L010]
capitalisation_policy = upper

[sqlfluff:rules:L014]
extended_capitalisation_policy = lower
ignore_words = _PARTITIONTIME, _PARTITIONDATE

[sqlfluff:rules:L019]
comma_style = leading

[sqlfluff:rules:L026]
force_enable = False

[sqlfluff:rules:L057]
quoted_identifiers_policy = none

settings.json

    "sqlfluff.dialect": "bigquery",
    "sqlfluff.config": ".sqlfluff",
    "sqlfluff.linter.run": "onSave",

Reference link

https://en.wikipedia.org/wiki/Mojibake

RobertOstermann commented 2 years ago

Hmm, yeah that is probably an encoding issue. I have to run the sqlfluff fix command and pass the output into my extension and then re-write the current file with that output. So either the reading or writing, or maybe both, is a bit broken. Do you have any idea if this happened with the original extension? I think that extension just ran the command and allowed it to directly update the file, but I changed it because I ran into the problem where the file was always showing as unsaved. I will take a closer look at this, but it might take me a week or so until I have the time to do so.

RobertOstermann commented 2 years ago

@yamap55 Ok, I had some trouble reproducing the issue. But I think in general the way I was doing formatting was not good so I decided to re-write the formatting provider in 30fbfac. Hopefully that also solves the Mojibake problems you were seeing. Give version 0.1.8 a try and let me know if it works and if it solves your problem.

yamap55 commented 2 years ago

Thank you for fixing the code. Problem solved.

However, the code modification is very slow. And sometimes a dialog box appears saying that a conflict has occurred. I think the problem is caused by this fix, but the issue itself seems to be different, so should I create another Issue? image

RobertOstermann commented 2 years ago

Hmm, the code formatting should take the same amount of time as it previously did. Though that can take quite long for larger files, that is a problem with sqlfluff itself and not my extension. My recent changes basically made all the fixes be completed by the terminal instead of me reading the output.

As for that dialog box, it is caused whenever you make changes while the fixes are still being made. This is an unintended side-effect of the changes I made, but I haven't been able to figure out a good solution to it.

I decided to revert back to the previous formatting provider and attempt to use "utf-8" encoding on the spawn process. So these previous problems with dialog box should not happen anymore, instead the formatting will just get cancelled if it is not yet complete. But your original Mojibake problem might be occurring again. Can you check version 0.1.9 and let me know if that is still occurring?

yamap55 commented 2 years ago

MOJIBAKE occurred in v0.1.9.

So these previous problems with dialog box should not happen anymore, instead the formatting will just get cancelled if it is not yet complete.

I was unaware that in versions prior to v0.1.7, the reason the fix was not completed was that it was cancelled behind the scenes. I thought it was more explicitly pointed out in the dialog. Therefore, I think v0.1.8 is better.

Thank you for your consideration! You have been very helpful to us.

RobertOstermann commented 2 years ago

Hmm, I had hoped explicitly using utf-8 encoding might have solved the mojibake problem. Do you know what encoding you are using for your file? I'm just curious if it is encoded in utf-8 or something different.

Unfortunately, VSCode does not provide me with the encoding of the open file, so I can't match the encoding in the spawned process and if the file is not utf-8 then that might be the problem. If the file is encoded in utf-8 then I am not sure why mojibake is happening or how to fix it.

I do agree that it is unclear that formatting gets cancelled if the file is changed and v0.1.8 made that a bit more clear. I think there are some positives to both approaches, so I will make some changes and add a setting called something like sqlfluff.execInTerminal and default it to false. You could then set that to true to have the extension behave the same as it did in v0.1.8.

RobertOstermann commented 2 years ago

Ok, I have published v0.2.0 that add the setting sqlfluff.format.execInTerminal. Setting this to true should allow the extension to perform the same as it did in v0.1.8 and solve the Mojibake problem. I also made a final effort to fix the mojibake issue with the current iteration, so could you give it a try with sqlfluff.format.execInTerminal = false and let me know if you still see the text garbling? I doubt my changes fixed that, but maybe we'll get lucky.

yamap55 commented 2 years ago

Thanks for trying. I think I will be able to try it on Monday and will report back with the results.

yamap55 commented 1 year ago

MOJIBAKE occurred in sqlfluff.format.execInTerminal = false .

The behavior of sqlfluff.format.execInTerminal = true is puzzling. The following is displayed when saving.

image

If you press the cancel button, the contents of the file will not be changed, but it will be formatted after a moment. I'm sorry for the support, but this setting cannot be used.

target sql file

select a from b;
RobertOstermann commented 1 year ago

Ok, I didn't realize before but the execInTerminal setting only works if formatOnSave is disabled and you manually format the document. In v0.2.1 I changed the setting name to sqlfluff.experimental.format.execInTerminal and added a note about that bug. I will not have the time this week to look further into that problem, but I will give it a look next week.

RobertOstermann commented 1 year ago

I did quickly attempt some fixes in v0.2.2, but there are still some bugs, mainly with the file contents not matching. You can default that to overwrite and it may help. Let me know if that works better for you or you still see issues in 0.2.2

RobertOstermann commented 1 year ago

@yamap55 Have you had a chance to try those changes? Are you still having the Mojibake issue with sqlfluff.experimental.format.execInTerminal = true?