Open xnoreq opened 5 years ago
About the terminal type
That's because webssh creates a pseudo tty with hardcoded terminal type xterm for every ssh connection.
About the encoding problem
Probably your browser doesn't use UTF-8 as the decoding type.
You can check the browser console to see what encoding it uses.
xterm-256color is less commonly supported than xterm.
Why not make it configurable? xterm.js supports xterm-256color.
In my web browser I can see this in the log:
The deault encoding of your server is ANSI_X3.4-1968
This makes no sense since on the server the default locale is configured correctly in /etc/locale.conf:
LANG=en_US.UTF-8
which is loaded by /etc/profile.d/locale.sh.
With every other client this works correctly and after login I get:
$ locale
LANG=en_US.UTF-8
So it looks like the default encoding detection does not work or does something non-standard that is not compatible.
What kind of server are you using?
What is the output of the command locale charmap?
GNU/Linux, kernel version 5.2
$ locale charmap
UTF-8
That is weird.
webssh uses the command locale charmap to detect the default encoding of the server being connected.
If the output of this command is UTF-8, then the log in your browser console should look like
The deault encoding of your server is UTF-8
Terminal type is configurable now. You can pass a terminal type via url.
http://localhost:8888/?term=xterm-256color
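A minimal sketch of how such a URL parameter could be read and validated. The helper name `term_from_url` and the allow-list are assumptions made for illustration; they are not taken from webssh's code.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical allow-list; which terminal types a real deployment accepts is an assumption.
ALLOWED_TERMS = {"xterm", "xterm-256color", "vt100"}
DEFAULT_TERM = "xterm"


def term_from_url(url: str) -> str:
    """Return the terminal type requested via ?term=..., or the default."""
    query = parse_qs(urlparse(url).query)
    term = query.get("term", [DEFAULT_TERM])[0]
    # Fall back to the default for anything not on the allow-list.
    return term if term in ALLOWED_TERMS else DEFAULT_TERM


print(term_from_url("http://localhost:8888/?term=xterm-256color"))
print(term_from_url("http://localhost:8888/"))
```

Falling back to xterm keeps the earlier behaviour for URLs that carry no parameter or an unknown value.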
Here is my locale:
$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
I guess probably your locale is not configured correctly.
No, it's the same on my system, but above I just pasted the first line.
The full output:
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
$ locale -a
C
en_US.utf8
POSIX
$ locale -m
ANSI_X3.110-1983
ANSI_X3.4-1968
ARMSCII-8
ASMO_449
BIG5
BIG5-HKSCS
BRF
BS_4730
BS_VIEWDATA
CP10007
CP1125
<snip>
T.61-8BIT
TCVN5712-1
TIS-620
TSCII
UTF-8
VIDEOTEX-SUPPL
VISCII
WIN-SAMI-2
WINDOWS-31J
$ locale -c charmap
LC_CTYPE
UTF-8
$ locale charmap
UTF-8
If the log in your browser console is The deault encoding of your server is ANSI_X3.4-1968, then the output of the command locale charmap on your server should be ANSI_X3.4-1968.
But it is UTF-8. I have even started webssh as the same user as in all the commands above.
Can you show me the whole log in your browser console when you connect to this server?
I've added some debug messages to handler.py and it looks like the environment is not loaded correctly.
At the time locale charmap is executed, env returns very few environment variables and LANG is missing.
The problem is that locale charmap is sent as a direct command through SSH, and on my system this means the executing shell is non-interactive and doesn't read /etc/profile.
Even if LANG was set, this way of detecting the encoding is wrong anyway, as it requires knowing the charset to decode the answer... the answer that you need to know for decoding in the first place.
This is why terminals let the user configure the encoding, which is the correct way to do it, with the default being UTF-8 on pretty much any modern terminal.
Have you tested it on other systems?
Here is a related issue, https://github.com/huashengdun/webssh/issues/21.
Also, can you run the command python -c "import sys; print(sys.stdout.encoding)" on your special server?
Tested on ubuntu 19 with latest kernel 5.3, the default encoding detection works.
There's no reason to test on other systems, as I've pointed out what's going on.
I dug a bit deeper though: on Debian, bash is not only patched to detect that it runs non-interactively under ssh and therefore executes bashrc (which doesn't happen in a "normally" compiled bash and doesn't necessarily set LANG anyway), in Debian the system's LANG is also "injected" into ssh shells through PAM regardless if they're (non-)interactive or (non-)login shells.
Neither is necessarily true on non-Debian or related systems.
Also, as I've explained, the way you try to detect the encoding is wrong anyway. You get an encoded response that contains the encoding needed to decode it in the first place. Since you don't know the encoding you just fall back to UTF-8 anyway, which is behavior that will break on non-UTF-8 systems.
--
Why don't any of these problems happen with normal ssh? Because ssh and sshd, if configured that way (and they are again by default on Debian), will send the client's environment variables (like LANG) to the server, which accepts them. See SendEnv/AcceptEnv in ssh(d)_config.
But that again is not a given on all systems, and not always desired anyway.
In this case, the client has to set the LANG for the command itself like so: LANG=en_US.UTF-8 command.
This is also how you can properly query for available locales: LANG=C locale -a, because now you'll get an answer that is encoded in a known encoding: ASCII in this case. ANSI_X3.4-1968 to be precise.
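The idea of pinning the locale per command can be sketched as a small helper that builds the command string before it is sent over SSH. The function name `with_locale` is hypothetical.

```python
import shlex


def with_locale(command: str, lang: str = "C") -> str:
    """Prefix a shell command with an explicit LANG assignment."""
    # shlex.quote guards against shell metacharacters in the locale name.
    return f"LANG={shlex.quote(lang)} {command}"


print(with_locale("locale -a"))
print(with_locale("locale charmap", "en_US.UTF-8"))
```

Note that a server-side LC_ALL, if set, takes precedence over LANG, so this sketch only covers the case discussed here where neither is set.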
You get an encoded response that contains the encoding needed to decode it in the first place. Since you don't know the encoding you just fall back to UTF-8 anyway, which is behavior that will break on non-UTF-8 systems.
This is because I know the output of locale charmap only contains ascii characters.
For ascii characters, encoding with different encodings will get the same bytes.
And decoding the resulting bytes with different encodings will get the same string.
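A short check of this claim, as a sketch: it holds for ASCII-compatible encodings (the usual case for shell locales), while an encoding such as UTF-16 would be an exception. The helper name `decoded_variants` is made up for this example.

```python
def decoded_variants(raw: bytes, codecs: tuple) -> set:
    """Decode the same bytes with several codecs and collect the distinct results."""
    return {raw.decode(name) for name in codecs}


# A charmap name consists of ASCII characters only.
raw = b"ANSI_X3.4-1968\n"

# ASCII-compatible encodings all yield the same single string.
print(decoded_variants(raw, ("ascii", "utf-8", "latin-1", "gbk")))

# UTF-16 is not ASCII-compatible: the bytes differ, so a naive decode does not match.
wide = "UTF-8\n".encode("utf-16-le")
print(wide.decode("latin-1") == "UTF-8\n")
```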
Also, can you tell me what kind of system (which flavour and edition) you use?
Just tested on centos 7, the default encoding detection also works. Until now I have tested two kinds of Linux flavour (Debian and Redhat) and they all work.
Well, I'm researching the same problem (SSH Shell Encoding) which brought me here (Well, actually, Google brought me here, but anyway).
Based on the information that I grabbed from this issue, I think maybe you can try to run locale charmap within the xtermjs console rather than directly on the server(?). At least the xtermjs console is interactive, so you should be able to get the correct result there.
But as @xnoreq has suggested, that's NOT how it should be done. Maybe you need to provide a method to allow user to configure the encoding by themselves. I know I will be doing that after reading all comments here, so yeah, I recommend it :)
Also ....
Here is a related issue, #21.
I don't think these two are related. Issue #21 is caused by an unsupported encoding label.
The TextDecoder only supports encodings from this list, and en_IN is not on the list.
You cannot simply feed the output from a SSH command directly to a JavaScript function and expect everything will work just right. Maybe do a mapping?
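One way to sketch such a validation step on the Python side is to check the reported name against the codecs Python knows and fall back otherwise. This is an analogous check, not the browser's TextDecoder label list, and the helper name `usable_encoding` is hypothetical.

```python
import codecs


def usable_encoding(reported: str, fallback: str = "utf-8") -> str:
    """Return a canonical codec name for `reported`, or `fallback` if it is unknown."""
    try:
        # codecs.lookup resolves aliases and raises LookupError for unknown names.
        return codecs.lookup(reported.strip()).name
    except LookupError:
        return fallback


print(usable_encoding("UTF-8\n"))
print(usable_encoding("gbk"))
print(usable_encoding("en_IN"))
```

A locale name like en_IN is not an encoding at all, so it ends up on the fallback path instead of reaching the decoder.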
Hope it helps :)
Maybe you need to provide a method to allow user to configure the encoding by themselves.
Already provided, you can configure an encoding in your url.
http://localhost:8888/#encoding=gbk
Well, I'm researching the same problem (SSH Shell Encoding) which brought me here
I just searched Google with "SSH Shell Encoding", I don't see any result related.
Can you show me some links which are related to this issue?
Also can you tell me what kind of server(flavour and edition) you run on which you met the same problem?
Oh, the keyword was 'ssh encoding "locale charmap"'.
I was trying to figure out whether or not it's a good idea to send locale programmatically to the server in order to detect its encoding, and found out it isn't. Just here to share my findings, sorry if I bothered you.
OK, so which discussion tells you that running the command locale charmap is not a good way to detect the encoding?
Actually I never expected that the command locale charmap would work on all platforms.
At least I have tested on Linux systems of Debian and Redhat flavour and they all work.
Can you tell us what server you run on which you meet this problem?
So that everyone can test it.
The TextDecoder only supports encoding from this list, and en_IN is not on the list. You cannot simply feed the output from a SSH command directly to a JavaScript function and expect everything will work just right. Maybe do a mapping?
new TextDecoder('en_IN')
This line of code will blow up if the encoding is not a valid one. It seems you didn't even read my JavaScript code, so how could you comment like this?
Oh, sorry just deleted your comment by accident. Here is your comment copied from my email.
First, let me clarify this: I'm not a user of your software. I'm researching this topic, not your software. I came here because of that Google search, and I've confirmed what I expected, so I thought maybe I should share some of my findings as well.
The thing is this: based on the small portion of the SSH specs I have read, as far as I can tell, unlike Telnet, it does not provide any method for the two parties to negotiate charset encoding. To me, it implies that the user has to set up that encoding themselves before the connection is made.
Hope this could resolve some confusion created by me :)
Oh, sorry just deleted your comment by accident.
No problem :)
Like I said before, I never expected that the command locale charmap would work on all platforms.
At least I have tested on Linux systems (Debian and Redhat) and they all work.
But thanks for your suggestion.
Also, please provide me with the links to your findings and to the small portion of the SSH specs you have read.
Created a simple Python script to get the default encoding of your ssh server, for anyone who meets this problem in the future: https://gist.github.com/huashengdun/0af95bdafdce46a6ecbfc628dcd07c29
Compare its result with the output of locale charmap. If these two results are different, please report the information (flavour and edition) of your server here.
I was a user but since the author apparently doesn't read I have moved on to a better solution. I've explained everything relevant in my comment https://github.com/huashengdun/webssh/issues/84#issuecomment-533901135. Thanks and bye.
I did read your comment. I think your explanation is reasonable. But it is just a theory and it may be outdated. I have already tested on two kinds of Linux systems (Debian and Redhat) and the current encoding detection works well.
And you still have not provided me with the detailed information of your server.
I only know your server is some kind of GNU/Linux, kernel version 5.2.
In that way I cannot reproduce the error as you described.
This is my last response, because I've spent enough time on this.
But it is just a theory and it may be outdated.
You gotta be kidding. Everything is factual and current information.
I have already tested on two kinds of Linux systems (Debian and Redhat)
In other words, you either didn't read or understand my comment.
In that way I cannot reproduce the error as you described.
Actually, I have given all the information (PAM env setting LANG, bash with SSH_SOURCE_BASHRC) that is needed to reproduce the error. On top of this, I have explained multiple times why how you're detecting server encoding is simply logically wrong/contradictory.
And that's why I'm unsubbing, sorry.
Thanks for your discussion and your time.
You are a very funny guy. You just give me an explanation of why my current encoding detection method is wrong. But you don't tell me anything about the actual server on which this problem occurred. Are you working for the CIA, running a system whose details cannot be disclosed?
Actually I have already tested on two kinds of Linux systems (Debian and Redhat) and the current encoding detection works well. Those two results contradict your theory.
On top of this, I have explained multiple times why how you're detecting server encoding is simply logically wrong/contradictory.
For pure ascii characters, there is no difference between bytes.decode('utf-8') and bytes.decode('ascii').
Did you see my explanation?
Actually, I have given all the information (PAM env setting LANG, bash with SSH_SOURCE_BASHRC) that is needed to reproduce the error.
My app works with real OSes, not a purely theoretical environment built from the information you gave me. How could I know that a real OS works just the way you describe? In fact, the Debian and Redhat Linux systems I've already tested (limited editions) don't work that way.
Tested it on FreeBSD, current encoding detection still works.
After changing the charset to GBK, it also works.
But it failed on MacOS. It seems it doesn't send env LANG back.
But I don't care, because almost nobody uses it as a server.
Until now I've tested several different systems including Debian, Ubuntu, Centos, FreeBSD, MacOS. The encoding problem brought in the issue only happens on MacOS (tested on macOS v10.13.6). The encoding detection method currently being used works on all the other systems listed above.
I am going to close this issue now.
If you meet this encoding problem that the app can't detect the default encoding of your server, you can simply pass an encoding via the url like this:
http://localhost:8888/?encoding=utf-8
OK, I notice that the current encoding detection method detects the system-wide character encoding, not the encoding of user level configuration that the user prefers to use.
Seems correct as the Features section says "Auto detect the ssh server's default encoding".
The code has been updated. Now I am using two commands to try to grab the encoding set by the user.
ssh -t <user>@<host> '$SHELL -ilc "locale charmap"'
This command seems to work on Debian, Ubuntu, CentOS, and MacOS.
ssh -t <user>@<host> '$SHELL -ic "locale charmap"'
This command is for FreeBSD. The default shell used by FreeBSD has no login option.
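The two-step probing described above can be sketched as a loop over candidate commands. This is not webssh's actual code: `run` stands in for whatever executes a command on the remote host and returns its output (or None on failure), and taking the last non-empty line is an assumption to skip any banner an interactive shell might print.

```python
from typing import Callable, Optional

# The two probes from the comment above; trying the login variant first is an assumption.
PROBES = (
    '$SHELL -ilc "locale charmap"',  # interactive login shell
    '$SHELL -ic "locale charmap"',   # interactive shell without the login option
)


def detect_charmap(run: Callable[[str], Optional[str]]) -> Optional[str]:
    """Try each probe with `run` and return the last non-empty output line."""
    for probe in PROBES:
        output = run(probe)
        if output:
            lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
            if lines:
                return lines[-1]
    return None


def fake_run(cmd: str) -> Optional[str]:
    """Stand-in runner: the login probe fails, the second one prints a banner first."""
    return None if "-ilc" in cmd else "Welcome\nUTF-8\n"


print(detect_charmap(fake_run))
```

Keeping the runner pluggable makes the fallback order testable without a real SSH connection.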
Hi xnoreq,
Sorry for my carelessness. I should have read your comments more carefully.
Please subscribe to this issue. Hope you can see this comment and test your server with my updated solution.
On the server (connected directly through ssh using xterm as terminal):
Running webssh-1.4.5 like so:
Now after connecting through the browser to the same server:
Why not xterm-256color and why is the encoding broken?