scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org

REPL is not accepting numeric keyboard input (including alt key combinations) #4711

Closed scabug closed 6 years ago

scabug commented 13 years ago

When I want to enter a Unicode code (or an ASCII code) with the numpad, the REPL does not accept the input. On my keyboard I have to press Fn + Alt and the numeric keypad plus key, and then the hexadecimal code (the digits may come from the numeric keypad or the normal digit keys).

The REPL does accept paste from the clipboard, so there is a workaround: type the character in an editor and copy/paste it into the REPL.

scabug commented 13 years ago

Imported From: https://issues.scala-lang.org/browse/SI-4711?orig=1
Reporter: DaveScala (davescala)
Affected Versions: 2.9.0
Attachments:

scabug commented 13 years ago

huynhjl said: I'm interested to understand more about this issue, but need some help.

I'm trying to reproduce this behavior and I can't even get it to work in cmd.exe. Did you have to set the registry value HKEY_CURRENT_USER\Control Panel\Input Method\EnableHexNumpad to 1? I was looking at the "In Microsoft Windows" section of http://en.wikipedia.org/wiki/Unicode_input and it would appear that this input method requires it.

Can you give specific examples of Unicode characters (with their hex digits) so that we have useful use cases? Can you confirm they work in a plain cmd.exe window? If you had to change fonts, code page or anything else, please provide this info.

For instance, I can't get '£' to paste with your clipboard workaround (it gets translated to 'ú' instead). In a plain cmd.exe I can paste it with a right mouse click, or I can enter it with the ALT-0-1-6-3 combination. Right mouse click and ALT-0-1-6-3 don't work in the Scala REPL.

scabug commented 13 years ago

DaveScala (davescala) said: No, it wasn't necessary under Windows 7. Hex numpad input for real Unicode (the so-called Alt-Plus method) is already enabled. I think the registry change was only necessary for Vista (followed by a reboot).

This issue is only about keying in. If you manage to key a glyph into the REPL with the numpad and it shows up as a square or a question mark, I am happy too, though I think the REPL source code should be modified to handle numpad key events, including Alt combinations.

This issue is not about displaying the glyph (that is another chapter). If you want a free, open source Unicode TTF font for the console, you can install the DejaVu fonts and use Sans Mono (sans serif, monospaced). So far this one has the best Unicode support as far as I know. Maybe there is a Unicode font in Microsoft Office, but that is not free. If you know a better one, I'd like to know.

I use Deja Vu Sans Mono. This is no console font but you can use it as if it were a console font. Console fonts have special requirements: see: http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q247815

The fonts must meet the following criteria to be available in a command session window:
- The font must be a fixed-pitch font.
- The font cannot be an italic font.
- The font cannot have a negative A or C space.
- If it is a TrueType font, it must be FF_MODERN.
- If it is not a TrueType font, it must be OEM_CHARSET.

Additional criteria for Asian installations:
- If it is not a TrueType font, the face name must be "Terminal."
- If it is an Asian TrueType font, it must also be an Asian character set.

That is why you can't use every GUI font that works in Notepad in the console. The current version of DejaVu covers Unicode up to version 4.0, while the latest Unicode version is 6.0, so it is not up to date with the latest version of Unicode. Just download it from http://dejavu-fonts.org/wiki/Main_Page , unzip it, go to the ttf directory, select all ttf files, then right mouse click and install.

To make the font available in the console, add a value under the registry key HKLM\Software\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont. You already have 0 and 00 name/value pairs for the default fonts (Consolas and Lucida Console). Create a name/value pair of type REG_SZ with name 000 and value DejaVu Sans Mono. If you already have a 000, you must add an extra 0 to the name.

In the console properties you then have an extra font to select. For Japanese characters I get squares, by the way, so that doesn't work.

Things that do work in my console:
- ☆ U+2606 (mathematical kleisli star)
- ∃ U+2203 (mathematical any symbol)
- € U+20AC (euro sign currency symbol)

The kleisli star looks a bit too small in this font in comparison with the any symbol, but it works.

But you can use the unihan Unicode database which has a nice look up interface you can search for code and also paste in an unknown character from which you want to know the unicode: http://www.unicode.org/charts/unihan.html

Beware that the ALT-0-1-6-3 combination is not really a Unicode combination but a Windows code page 1252 one. Code page 1252 is a single-byte encoding whose 256 codes (0-255) are equal to the Unicode code points, except for 128-159, so in this case it works. See http://msdn.microsoft.com/nl-nl/goglobal/cc305145 (the list also contains use cases that probably work with the default console fonts).
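The 128-159 exception can be checked with a few lines of code. The following is only an illustrative sketch in Python (which is not part of the Scala tooling in this issue); it assumes Python's built-in `cp1252` codec, and the function name `cp1252_mismatches` is made up for this example.

```python
# Compare every Windows-1252 byte with the Unicode code point of the same number.
def cp1252_mismatches():
    """Return the byte values whose cp1252 character differs from chr(byte)."""
    mismatches = []
    for b in range(256):
        # errors="replace" maps the few unassigned cp1252 bytes to U+FFFD
        ch = bytes([b]).decode("cp1252", errors="replace")
        if ch != chr(b):
            mismatches.append(b)
    return mismatches

# ALT-0-1-6-3 selects byte 163, which is outside the exception range.
pound = bytes([163]).decode("cp1252")
# Byte 128 is inside the exception range: it is the euro sign, not U+0080.
euro = bytes([128]).decode("cp1252")

if __name__ == "__main__":
    # Print code points rather than glyphs so the console encoding does not matter.
    print(cp1252_mismatches())
    print(hex(ord(pound)), hex(ord(euro)))
```

In this sketch the pound sign keeps its number (163 is U+00A3), while the euro sign does not (byte 128 is U+20AC).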

Since I have a laptop, I have to press (Fn) as well to key in on the numpad. If you have a separate numpad you don't have to do that. Keying in Unicode with the Alt-Plus method:
- ☆ U+2606 (mathematical kleisli star): (Fn) ALT + 2 6 0 6
- ∃ U+2203 (mathematical any symbol): (Fn) ALT + 2 2 0 3
- € U+20AC (euro sign currency symbol): (Fn) ALT + 2 0 a c

The example below is probably better if you don't want to install a new font, since the kleisli star and the any symbol won't work with the default console fonts. Note also the differences between the key-ins for the same glyph:
- Unicode (Alt-Plus method): ê U+00EA (e circumflex): (Fn) ALT + 0 0 e a
- ANSI cp 850 on my PC (the old school way): ê (e circumflex): (Fn) ALT 1 3 6
- Windows cp 1252: ê (e circumflex): (Fn) ALT 0 2 3 4
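As a cross-check of the three key-ins for ê, here is a small Python sketch (Python is used only for illustration, and the variable names are made up). It assumes that each Alt code is just a number looked up in a different table: a hexadecimal Unicode code point, a decimal byte in code page 850, or a decimal byte in Windows-1252.

```python
# The character all three key-ins are expected to produce.
E_CIRCUMFLEX = "\u00ea"  # ê

# ALT + 0 0 e a: hexadecimal Unicode code point
via_alt_plus = chr(0x00EA)
# ALT 1 3 6: decimal byte value interpreted in code page 850
via_cp850 = bytes([136]).decode("cp850")
# ALT 0 2 3 4: decimal byte value interpreted in Windows-1252
via_cp1252 = bytes([234]).decode("cp1252")

all_same = via_alt_plus == via_cp850 == via_cp1252 == E_CIRCUMFLEX

if __name__ == "__main__":
    print(all_same)
```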

Beware that + is on the numpad, not the other + key above = (on my keyboard)

"For instance I can't get '£' to paste with your clipboard workaround (it get translated to 'ú' instead)." Then it's still in some ANSI encoding. Convert to utf-8 without bom in notepad++ or another editor capable of converting encodings.

You can check your console code page with chcp. Mine is 850 ( http://en.wikipedia.org/wiki/Code_page_850 ). For someone else it may be 437. So far I didn't have to change the code page in the console. It is possible with chcp 65001 (for UTF-8), but it is not necessary for this issue. I don't know why, but maybe Windows recognizes UTF-8 encoded bytes in the clipboard.

To make the Scala REPL work in UTF-8, modify this line in scala.bat and add -Dfile.encoding="UTF-8" so that it looks like this:
set _PROPS=-Dscala.home="%_SCALA_HOME%" -Denv.emacs="%EMACS%" -Dfile.encoding="UTF-8"
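One plausible way to get 'ú' instead of '£' is a byte written in one code page and read back in another. The Python sketch below (illustration only; `misread` is a hypothetical helper, not part of any tool mentioned here) assumes the character was stored as a Windows-1252 byte and then interpreted in code page 850. Whether this is what actually happened on huynhjl's machine is an assumption.

```python
def misread(text, written_as, read_as):
    """Encode text in one code page, then decode the same bytes in another."""
    return text.encode(written_as).decode(read_as)

# The pound sign is byte 0xA3 in Windows-1252; code page 850 shows 0xA3 as u-acute.
garbled = misread("\u00a3", "cp1252", "cp850")

# When both sides agree on the encoding, the character survives.
intact = misread("\u00a3", "utf-8", "utf-8")

if __name__ == "__main__":
    # Print code points rather than glyphs so the console encoding does not matter.
    print(hex(ord(garbled)), hex(ord(intact)))
```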

scabug commented 13 years ago

huynhjl said: Thank you for this detailed information. I have Win 7 and I had to add the registry key to enable ALT + input method, so not all Win 7 have it configured by default.

Also after setting DejaVu Sans Mono and file.encoding=UTF-8, I tried the CTRL-V on the following

val x = "☆"

but it appears funky on the console:

val x = "☆"�"

Oddly, x.length == 1 so the extra garbage seems to only be a printing artifact. You don't get the same behavior?

What I found out is that jline/jansi currently relies on _getch, which does not detect the ALT Unicode combinations (the input is just ignored). I verified with a little C program that ReadConsoleW reads those ALT combinations correctly. Pasting with a right mouse click does not work either.

We would have to modify jansi to use ReadConsoleW, rebuild and repackage the dll. It probably is going to take a while as well as some help from the jansi maintainer and there may be challenges along the way but I can start down that path...

scabug commented 13 years ago

DaveScala (davescala) said: Today it indeed looks different. Probably something has changed on my Windows since then: I have Windows Update active, so whenever there are updates my Windows changes. When copying and pasting with ctrl + v, the kleisli star is followed by two black question marks on a white background (the squares), like this one: http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=%EF%BF%BD

scala> ☆��

This also happens if I paste one such question mark with a white background:

scala> ���

I also have to change the console to UTF-8 with chcp 65001 for the REPL to display the above. If I keep code page 850 and start the REPL, I get:

scala> Ôÿå

I wrote in my first posting: "Sofar I didn't have to change the codepage in the console. It is possible with chcp 65001 (for utf-8) but it is not necessary for this issue."

But that is not true anymore. Now chcp 65001 is necessary for the REPL to display the UTF-8 character, but the character is followed by two strange characters.
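The three characters shown under code page 850 are consistent with the UTF-8 bytes of the star being displayed one byte at a time. A short Python sketch of that reading (illustration only; the variable names are made up, and that the REPL emits UTF-8 here is an assumption based on the file.encoding setting discussed earlier):

```python
star = "\u2606"  # the kleisli star, U+2606

# UTF-8 encodes this code point as three bytes.
utf8_bytes = star.encode("utf-8")

# A console in code page 850 shows each byte as a separate character.
shown_in_cp850 = utf8_bytes.decode("cp850")

if __name__ == "__main__":
    # Print byte values and code points so the console encoding does not matter.
    print(utf8_bytes.hex(), [hex(ord(c)) for c in shown_in_cp850])
```

The three resulting code points are U+00D4, U+00FF and U+00E5, which are the characters Ô, ÿ and å.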

For the windows console it doesn't matter which codepage 850 or 65001: somehow it figures out itself.

I guess Microsoft is working on the Unicode libraries.

scabug commented 13 years ago

DaveScala (davescala) said: I think the two characters are actually the ctrl + v keystroke, which cannot be decoded to Unicode and is not stripped from the input (see http://en.wikipedia.org/wiki/U%2BFFFD ). In the Windows console, pasting by key combination is not possible and is shown as ^V. Instead, it is possible to right mouse click and paste the Unicode character.

It is not possible to right mouse click and paste in the REPL.

The problem would probably not occur when keying in directly via the Alt-Plus method (if the REPL accepted this).

scabug commented 13 years ago

huynhjl said: After some investigation, I found a solution based on adding jni support to jansi for the following calls: ReadConsoleInputW, WriteConsoleW, GetConsoleOutputCP, SetConsoleOutputCP. I have an open pull request to the jansi maintainer.

The changes to jline are mostly in AnsiWindowsTerminal but changes to scala.tools.nsc.interpreter.ILoop would be required as well to overwrite Console.out.

If you want to test it, I can make available a modified version of jline.jar and you would have to recompile scala from my github fork.

scabug commented 13 years ago

DaveScala (davescala) said: Okay, I would like to test it. Let me know when jline.jar is ready.

scabug commented 13 years ago

huynhjl said (edited on Jul 7, 2011 3:07:59 PM UTC): To test

scabug commented 13 years ago

DaveScala (davescala) said (edited on Jul 8, 2011 11:36:33 AM UTC): I tested it in the recompiled scala 2.10 version and everything seems to work. Alt-Plus, Alt-Zero and normal Alt keycodes worked and also right mouse pasting and ctrl + v pasting worked without garbage characters.

Just be sure that DejaVu Sans Mono is selected, because Windows sometimes defaults to the Lucida Console font when you create a new console in another directory.

Recompiling Scala was difficult: if -Xmx was set too high the JVM could not be created, and if it was set too low I got an out-of-heap-space error while scaladoc was being generated. Fortunately the binaries had already been built by then, so I could test.

chcp 65001 in the console is not really necessary. It also worked with chcp 850.

jline.jar also works in 2.9.0.1, so recompiling Scala was not necessary. Or is there a special reason to use the latest version of Scala? jline.jar looks backward compatible.

scabug commented 13 years ago

DaveScala (davescala) said: I found it: in 2.10 error messages with unicode characters look cleaner i.e. no extra garbage characters. So in case of an error jline.jar in 2.10 works better.

2.9.0.1
=======

scala> ☆.
asInstanceOf isInstanceOf toString

scala> ☆
<console>:14: error: missing arguments for method ☆ in trait Kleislis;
follow this method with `_' if you want to treat it as a partially applied funct
ion n
       ☆ �
       ^

scala> *
<console>:14: error: not found: value *
       *
       ^

scala> <=<
<console>:14: error: not found: value <=<
       <=<
       ^

scala> ∃
<console>:14: error: not found: value ∃ �
       ∃ �
       ^

scala>

2.10
====

scala> import scalaz._
import scalaz._

scala> import scalaz.Scalaz._
import scalaz.Scalaz._

scala> ☆.
asInstanceOf isInstanceOf toString

scala> ☆
<console>:14: error: missing arguments for method ☆ in trait Kleislis;
follow this method with `_' if you want to treat it as a partially applied funct
ion
       ☆
       ^

scala> *
<console>:14: error: not found: value *
       *
       ^

scala> <=<
<console>:14: error: not found: value <=<
       <=<
       ^

scala> ∃
<console>:14: error: not found: value ∃
       ∃
       ^

scala> >=>
<console>:14: error: not found: value >=>
       >=>
       ^

scala>

scabug commented 13 years ago

DaveScala (davescala) said: Thanks for fixing this. It works. I am done testing, so if you are done too, this issue can be closed as fixed.

scabug commented 13 years ago

huynhjl said: Leave this one open. The changes have not been merged in trunk by somebody with commit privileges. The bug is currently assigned to the "community".

scabug commented 13 years ago

DaveScala (davescala) said: Are there plans to make this feature available for Linux/Mac as well? Or is it already working there?

scabug commented 11 years ago

@adriaanm said: We're moving to vanilla jline 2.11 in 2.11 -- see #7604. Please give it a try!

SethTisue commented 6 years ago

closing for staleness. we can reopen if someone 1) verifies that it is still an issue, and 2) verifies that it is actually actionable in scala/scala (as opposed to JLine itself)