dotnet / fsharp

The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
https://dotnet.microsoft.com/languages/fsharp
MIT License

Typing / sending unicode characters in FSI is broken #7733

Closed SchlenkR closed 2 years ago

SchlenkR commented 4 years ago

Since F# 4.7, typing or sending Unicode characters in FSI produces an unexpected character instead of the one typed.

Example:

Typing let über = 42;; in F# 4.5 works; in F# 4.7, it doesn't.

image

The behavior is also present when using VSCode + Ionide (sending text via Alt + Enter).

SchlenkR commented 4 years ago

Any news on this one? This breaks quite a lot of my working scripts.

forki commented 4 years ago

@KevinRansom any chance you can look at this? This is really annoying in languages other than English.

cartermp commented 4 years ago

Another case here: https://developercommunity.visualstudio.com/content/problem/911485/bad-unicode-encoding-from-literal.html

Likely the same underlying issue. We'll probably treat this as in scope for .NET 5.

cartermp commented 4 years ago

Note that FSI in VS doesn't have this issue.

image

SchlenkR commented 4 years ago

In Rider, it also works:

image

See also: https://twitter.com/auduchinok/status/1207260258483286021?s=20

Eugene Auduchinok @auduchinok

  1. Dec. 2019 They seem to use a different fsi console integration mode, while we're using the same one as in VS.

larjo commented 3 years ago

With F# 5.0 and a freshly installed .NET 5.0 I still get the error, though it is not exactly the same. The keys I type seem to get mistranslated in a different way: image

However, if I change the code pages when calling fsi, it works: image

Edit: Code page 28591 works better than 1252. It also works in .NET 5.0: image
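
To illustrate how the two input code pages differ, the sketch below decodes the UTF-8 bytes of "→" with code pages 28591 and 1252 and prints the resulting code points. It assumes .NET Core 3.0 or later, where CodePagesEncodingProvider has to be registered before code page 1252 can be used. The value names are illustrative, and whether this difference is what made 28591 work better in the report above is an assumption; the thread does not confirm it.

```fsharp
open System.Text

// Assumption: running on .NET Core 3.0+ / .NET 5+, where code page 1252
// is only available after registering the code pages provider.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)

let latin1 = Encoding.GetEncoding(28591)   // ISO-8859-1
let cp1252 = Encoding.GetEncoding(1252)    // Windows-1252

// "→" (U+2192) is three bytes in UTF-8: E2 86 92.
let arrowBytes = Encoding.UTF8.GetBytes("→")

// Format the UTF-16 code units of a string, e.g. "U+00E2 U+0086".
let codePoints (s: string) =
    s |> Seq.map (fun c -> sprintf "U+%04X" (int c)) |> String.concat " "

// Decode the same bytes with each single-byte code page.
let viaLatin1 = latin1.GetString(arrowBytes)
let viaCp1252 = cp1252.GetString(arrowBytes)

printfn "28591: %s" (codePoints viaLatin1)
printfn "1252:  %s" (codePoints viaCp1252)

// 28591 maps every byte value to the code point with the same number,
// so re-encoding with 28591 recovers the original UTF-8 bytes unchanged.
let recovered = Encoding.UTF8.GetString(latin1.GetBytes(viaLatin1))
printfn "recovered via 28591: %s" recovered
```

With 28591 the bytes 0x86 and 0x92 stay U+0086 and U+0092, while 1252 turns them into U+2020 and U+2019, so a consumer that later reinterprets the characters as raw bytes sees different data depending on the code page.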

larjo commented 3 years ago

And with this setting it works in VSCode + Ionide as well:

  "FSharp.fsiExtraParameters": [
    "--fsi-server-input-codepage:28591",
    "--fsi-server-output-codepage:65001"
  ]

wildart commented 3 years ago

The problem persists in F# 5.0. If you paste a string longer than 51 symbols, every Unicode symbol from the 52nd position onward is replaced with several instances of '\uFFFD'.

Microsoft (R) F# Interactive version 11.0.0.0 for F# 5.0
Copyright (c) Microsoft Corporation. All Rights Reserved.

For help type #help;;

> "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→";;
val it : string = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→"

// string that was pasted: "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→"
> "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→���";;
val it : string = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→���"

> let ustr = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→";;
val ustr : string = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→"

// string that was pasted: let ustr = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→";;
> let ustr = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→���";;
val ustr : string = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→���"

> →→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→;;

  →→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→;;
  ^

/stdin(5,1): error FS0010: Unexpected character '→' in interaction

> →→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→��� 

// string that was pasted: let ustr = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→ü";;
> let ustr = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→��";;
val ustr : string = "→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→��"
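
In the transcript above, a pasted "→" shows up as three replacement characters and a pasted "ü" as two, which matches the number of UTF-8 bytes in each symbol. One way to reproduce that pattern, offered as a hypothesis rather than a confirmed diagnosis, is to decode each byte on its own, as a reader that treats one byte as one key press would. The helper names below are illustrative and do not come from the F# code base.

```fsharp
open System.Text

// Decode every UTF-8 byte of the input separately instead of as a sequence.
let decodeByteByByte (s: string) =
    Encoding.UTF8.GetBytes(s)
    |> Array.map (fun b -> Encoding.UTF8.GetString([| b |]))
    |> String.concat ""

// Count U+FFFD (the Unicode replacement character) in a string.
let countReplacement (s: string) =
    s |> Seq.filter (fun c -> c = '\uFFFD') |> Seq.length

let arrowCount = countReplacement (decodeByteByByte "→")   // UTF-8: E2 86 92
let umlautCount = countReplacement (decodeByteByByte "ü")  // UTF-8: C3 BC
let asciiCount = countReplacement (decodeByteByByte "abc") // single-byte characters

printfn "→   -> %d replacement characters" arrowCount
printfn "ü   -> %d replacement characters" umlautCount
printfn "abc -> %d replacement characters" asciiCount
```

ASCII input survives this treatment because each character is a single byte, which would also explain why the problem only shows up with non-ASCII text.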
wildart commented 3 years ago

It looks like the bug is related to terminal emulation. I got this problem in the VS Code internal terminal and in the Jupyter web terminal.

I ran the ReadLineConsole.ReadLine function from the current console.fs file (slightly modified to show the bad output after the Console.ReadKey call) and got the error in the VS Code internal terminal.

vs-code

But when I run the same function in the Alacritty terminal, everything is OK.

alacritty

bonjune commented 3 years ago

I think it has something to do with the method used to send text from the editor to the interactive shell. I suspect the buffer holding the characters.

I tested with Korean Unicode characters, since I found this issue while dealing with Korean-named data.

I sent a string of Korean characters to F# Interactive using Alt + Enter (the send to F# Interactive shortcut).

Screen Shot 2021-08-08 at 7 32 23 PM Screen Shot 2021-08-08 at 7 32 47 PM

And it failed to send them when the string was longer than a certain length.

Screen Shot 2021-08-08 at 7 34 50 PM Screen Shot 2021-08-08 at 7 35 01 PM

The bug here is that, after a certain position, all the characters turn into broken Unicode characters.

But it works completely fine when I use dotnet fsi in the macOS built-in terminal! I think it is a bug related to the VSCode terminal.

cartermp commented 3 years ago

@bonjune this issue still persists for umlauts. As you noticed, hangul doesn't have an issue in dotnet fsi proper. But if you try to do something like let ümlaut = 12 it'll have an encoding issue.

image

bonjune commented 3 years ago

I tried sending the same code to F# Interactive in Rider on macOS. It worked well, without producing any broken hangul characters. I don't know why.

When I copy and paste my code into zsh in the VSCode terminal, it works fine. The problem occurs only when I copy/paste into the F# Interactive terminal.

So I suppose this issue has something to do with I/O from VSCode to F# Interactive.

I am reading the source code for the console input : https://github.com/dotnet/fsharp/blob/main/src/fsharp/fsi/console.fs

Any idea? 😄

cartermp commented 3 years ago

VSCode has some quirks I think. @Krzysztof-Cieslak may be able to give a hint as to some of them.

dsyme commented 2 years ago

@KevinRansom It looks to me like this was a regression from VS2015 --> VS2017 and is still a problem. I don't know why.

dsyme commented 2 years ago

I finally looked into this long standing regression.

This change looks suspicious: fixNonUnicodeSystemConsoleReadKey should be considered false, and it appears someone mistakenly thought !fixNonUnicodeSystemConsoleReadKey meant not fixNonUnicodeSystemConsoleReadKey.
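
For readers less familiar with F#: on a reference cell, the prefix ! operator reads the stored value and does not negate it, so replacing !flag with not flag flips the condition. The sketch below shows the difference with a hypothetical flag; it is not the actual console.fs code, and the names are made up for illustration.

```fsharp
// Hypothetical stand-in for the flag discussed above (not the real console.fs code).
let fixFlag = ref false

// Prefix ! dereferences the ref cell; newer F# versions suggest .Value instead.
let dereferenced = !fixFlag      // reads the stored value
let negated = not (!fixFlag)     // boolean negation of the stored value

// Describe which branch an `if` on the given condition would take.
let branch (cond: bool) =
    if cond then "workaround runs" else "workaround skipped"

printfn "if !flag    -> %s" (branch dereferenced)
printfn "if not flag -> %s" (branch negated)
```

With the flag set to false, the dereferencing form skips the guarded branch while the negated form takes it, which is the kind of inversion described above.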

dsyme commented 2 years ago

Fixed by https://github.com/dotnet/fsharp/pull/13054

SchlenkR commented 2 years ago

Fixing things by just removing code must have been a pleasure :) Thank you. Shall I retest the fix (e.g. using nightly builds) or is that unnecessary?