REditorSupport / languageserver

An implementation of the Language Server Protocol for R

code formatting encoding problem #388

Open xinstein opened 3 years ago

xinstein commented 3 years ago

I'm using VS Code on Windows 10. When I format code, Chinese characters inside it (including comments) are transformed into a <U+1231>-like form. I tried another computer and the problem does not occur there. I've played around with quite a few environment variables without any effect. I'm wondering which config options or environment variables are related to this issue. Thank you very much!
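
For readers unfamiliar with the <U+XXXX> form: base R can produce this kind of escape when a character cannot be represented in the encoding it is being converted to. The snippet below is a minimal sketch that reproduces the shape with `iconv()`. The sample string is an assumption chosen for illustration, and this is not necessarily the code path the formatter takes.

```r
# Hypothetical sample text: two Chinese characters, written with \u escapes
# so that this script itself stays pure ASCII.
x <- enc2utf8("\u4e2d\u6587")

# ASCII cannot represent these characters. With sub = "Unicode", iconv()
# substitutes each one with a <U+XXXX> escape instead of returning NA.
escaped <- iconv(x, from = "UTF-8", to = "ASCII", sub = "Unicode")
print(escaped)
```

If formatted code shows the same pattern, some step between the editor and R has likely converted UTF-8 text into an encoding that could not hold it.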

renkun-ken commented 3 years ago

Looks like your system locale is not consistent with your encoding in R?

What happens if you use the following:

styler::style_file("your-file.R")

xinstein commented 3 years ago

That works fine; no <U+> is produced. The same file gets messed up by the VS Code formatting command.

xinstein commented 3 years ago

Any suggestions? How should I configure the environment to make it right? Thank you~

randy3k commented 3 years ago

We only support UTF-8 files. It does sound like the R environment that the language server is running in has an incorrect locale. Could you report the output of the following?

Rscript -e "Sys.getlocale()"

xinstein commented 3 years ago

My file is certainly UTF-8 encoded (see the screenshot).

Here's the output of Sys.getlocale():

>Rscript -e "Sys.getlocale()"
[1] "LC_COLLATE=Chinese (Simplified)_China.936;LC_CTYPE=Chinese (Simplified)_China.936;LC_MONETARY=Chinese (Simplified)_China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_China.936"

However, the following command handles Chinese CORRECTLY.

Rscript -e "styler::style_file('my_r_file.R')"

I'm guessing the problem is the encoding of the communication channel (I can't name it) between VS Code and languageserver.
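
The `.936` suffix in the locale output above is Windows code page 936 (GBK), so that R session is not running in a UTF-8 locale. The sketch below gathers the settings that decide how a session treats non-ASCII text, so that two sessions can be compared. `report_encoding` is a hypothetical helper name, not part of languageserver or styler.

```r
# Hypothetical helper: summarise the encoding-related state of this R session.
report_encoding <- function() {
  info <- l10n_info()
  list(
    ctype        = Sys.getlocale("LC_CTYPE"),  # a ".936" suffix means code page 936
    is_utf8      = isTRUE(info[["UTF-8"]]),    # TRUE only in a UTF-8 session
    is_multibyte = isTRUE(info[["MBCS"]])      # TRUE for multibyte locales
  )
}

str(report_encoding())
```

Running this both from a terminal and from within the session that does the formatting would show whether the two disagree.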

xinstein commented 3 years ago

I already have the following lines in {Documents}/WindowsPowershell/profile.ps1, {Documents}/WindowsPowershell/Microsoft.PowerShell_profile.ps1, and {Documents}/WindowsPowershell/Microsoft.VSCode_profile.ps1:

$PSDefaultParameterValues['*:Encoding'] = 'utf8'

$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = New-Object System.Text.UTF8Encoding

These take effect in the Windows Terminal app as well as the built-in PowerShell client, and even in the integrated terminal in VS Code.

But I can't tell whether they have any effect on VS Code itself, nor do I know if this is what's causing my problem at all. I don't know which terminal, if any, languageserver is running in.

renkun-ken commented 3 years ago

It looks like the languageserver session and your typical R session do not have the same locale, since the language server is not started from a terminal with specific encoding settings.
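
One way to check this is to have each R session record its own locale at startup. The sketch below assumes the language server's R process reads the user's .Rprofile, which may not hold for every setup. `log_session_locale` and the log path are hypothetical names introduced for illustration.

```r
# Hypothetical helper: append this session's command line and locale to a
# log file, so that sessions started in different ways can be compared.
log_session_locale <- function(path) {
  entry <- sprintf(
    "%s | %s | %s",
    format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    paste(commandArgs(), collapse = " "),
    Sys.getlocale()
  )
  cat(entry, "\n", sep = "", file = path, append = TRUE)
  invisible(path)
}

# Demo with a temporary file; for a real comparison, call this from
# .Rprofile with a fixed path that every session can write to.
log_file <- log_session_locale(tempfile(fileext = ".log"))
print(readLines(log_file))
```

If the entry written by the language server's session shows a different locale from the one written by a terminal session, that would support the explanation above.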