Support for configuring character encoding

yanne commented 10 years ago

Currently, trying to use non-ASCII characters for input fails, but at least the error message pretty clearly states that non-ASCII input is not supported. If the output contains non-ASCII characters, things are even worse, because reading the output just fails.

Errors in output can be demonstrated e.g. by running ls on a directory that contains files/dirs with non-ASCII characters in their names. If ls is run with Execute Command, running the keyword succeeds but the output is not correct. If Write/Read Until Prompt is used, then the latter fails with UnicodeDecodeError.

We should be able to handle both input and output better if we allowed users to set the encoding to use. Then we could encode Unicode strings from test data into bytes before passing them to the connection, and similarly decode bytes when reading output. If we used UTF-8 by default, things would often work fine out-of-the-box.
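A minimal sketch of that idea in Python, assuming a configurable encoding that defaults to UTF-8. The helper names `encode_input` and `decode_output` are hypothetical and are not part of SSHLibrary:

```python
# Hypothetical helpers illustrating the proposal; not SSHLibrary's API.
DEFAULT_ENCODING = "utf-8"


def encode_input(text, encoding=DEFAULT_ENCODING):
    """Turn a Unicode string from test data into bytes for the connection."""
    return text.encode(encoding)


def decode_output(data, encoding=DEFAULT_ENCODING):
    """Turn bytes read from the connection back into a Unicode string."""
    return data.decode(encoding)


name = "test\u00e4.txt"  # "testä.txt"
utf8_bytes = encode_input(name)
latin2_bytes = encode_input(name, "iso-8859-2")
round_trip = decode_output(latin2_bytes, "iso-8859-2")
```

The same string yields different bytes under different encodings, which is why both sides have to agree on the one in use.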

This issue was originally opened at Google Code on Oct 17, 2012.

yanne commented 10 years ago

Original comment by agam...@gmail.com on Nov 15, 2012.

Can you kindly let us know when we can expect to have this issue fixed?

yanne commented 10 years ago

Original comment by piliszekm on Feb 26, 2013.

Hello,

are there any estimates on when this will be released?

yanne commented 10 years ago

Original comment by pekka.klarck on Mar 5, 2013.

We don't have any plans for SSHLibrary development at the moment. There might be a release after RF 2.8 in June but I cannot make any promises.

The development is driven mainly by the needs of NSN, which sponsors this library. If you happen to work for them, let the development team know via the internal mailing lists. Otherwise your best option is providing patches or sponsoring the development yourself.

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Jul 24, 2013.

This is done by https://code.google.com/p/robotframework-sshlibrary/source/detail?r=93b566e19ae59c526773ad06cf81fa82b6724461

The default encoding for all upcoming connections can be set when importing the library or by using the Set Default Configuration keyword. For the currently active, already open connection, Set Client Configuration can be used. Encoding can also be configured per connection by passing it as an argument (named encoding) to the Open Connection keyword.
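The interplay of these levels can be pictured with a toy model. This is only a sketch: `ToyLibrary` and its methods are hypothetical names that mirror the keywords, not the library's actual implementation:

```python
# Toy model of the configuration levels described above. The class and
# method names are hypothetical and only mirror the keyword names.
class ToyLibrary:
    def __init__(self, encoding="utf-8"):
        # Encoding given when the library is imported; UTF-8 if omitted.
        self.default_encoding = encoding
        self.connections = []
        self.current = None

    def set_default_configuration(self, encoding):
        # Affects only connections opened after this call.
        self.default_encoding = encoding

    def open_connection(self, host, encoding=None):
        # A per-connection argument wins over the library default.
        conn = {"host": host, "encoding": encoding or self.default_encoding}
        self.connections.append(conn)
        self.current = conn
        return conn

    def set_client_configuration(self, encoding):
        # Changes the currently active, already open connection.
        self.current["encoding"] = encoding


lib = ToyLibrary()
first = lib.open_connection("host-a")                      # library default
lib.set_default_configuration("iso-8859-2")
second = lib.open_connection("host-b")                     # new default
third = lib.open_connection("host-c", encoding="latin-1")  # per connection
lib.set_client_configuration("utf-16")                     # changes only third
```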

If no encoding is explicitly configured, UTF-8 is assumed.

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Sep 2, 2013.

There was an issue with character encoding and using Read keyword on Jython. This is now fixed as well by https://code.google.com/p/robotframework-sshlibrary/source/detail?r=28cb7598c0e946fe74b594c3f9c8039b99e3f5ff

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Oct 17, 2013.

This issue was updated by revision 35814e0ff0bc .

Did some minor refactoring related to expiring timeouts and reading from the server output.

yanne commented 10 years ago

Original comment by piliszekm on Nov 5, 2013.

I think that Get File is still wrong. I took the 1.2-devel version and set the encoding to the proper one in Open Connection and then also in Set Client Configuration. Everything works fine with Read and Write, but Get File gives me the following error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 13: ordinal not in range(128).

Looks like Get File still sets encoding to ascii.
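For illustration, this class of error can be reproduced outside the library in plain Python. The path below is an assumed example, not the file from this report: byte 0xe4 is "ä" in iso-8859-2, and the ASCII codec rejects any byte above 0x7f.

```python
# Bytes as a Latin-2 (iso-8859-2) server might send them; the path is
# an illustrative assumption.
raw = "/tmp/test\u00e4.txt".encode("iso-8859-2")

try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    # Raised because 0xe4 is outside the ASCII range (0-127).
    ascii_ok = False

# Decoding with the encoding the server actually uses succeeds.
decoded = raw.decode("iso-8859-2")
```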

Not sure about Put File, but it probably has the same issue.

yanne commented 10 years ago

Original comment by pekka.klarck on Nov 5, 2013.

Good catch @piliszekm! Get File and Get Directory probably should get separate encoding arguments, but default to the connection specific encoding. Should Put File/Directory get them too? Anyway, this stuff probably should get its own issue.

Anssi, what do you think?

yanne commented 10 years ago

Original comment by piliszekm on Nov 6, 2013.

One more thing. When an integer is sent with one of the Write keywords, an error is thrown:

AttributeError: 'int' object has no attribute 'encode'
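This is the error Python raises when `.encode` is called on a value that is not a string. One way to guard against it, sketched with a hypothetical helper `to_bytes` (not the library's actual code), is to convert non-string values to text before encoding:

```python
def to_bytes(value, encoding="utf-8"):
    """Encode any value for writing; hypothetical helper for illustration."""
    if isinstance(value, bytes):
        return value  # already bytes, pass through unchanged
    if not isinstance(value, str):
        value = str(value)  # e.g. the integer 42 becomes the text "42"
    return value.encode(encoding)


try:
    (42).encode("utf-8")  # what happens without the guard
    unguarded_failed = False
except AttributeError:
    unguarded_failed = True

as_bytes = to_bytes(42)
```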

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Nov 7, 2013.

Thanks for the heads-up! These seem like valid bugs and I'm investigating these.

yanne commented 10 years ago

Original comment by piliszekm on Nov 7, 2013.

Actually, I was wrong: sending characters that are neither ASCII nor UTF-8 with the Write keyword is also not possible :( It fails with an encoding error.

yanne commented 10 years ago

Original comment by piliszekm on Nov 7, 2013.

In my tests I'm using iso-8859-2 encoding.

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Nov 8, 2013.

This issue was updated by revision 31c3258ab87e .

The bug in writing of non-strings is now fixed.

Also renamed the acceptance tests to have more descriptive names.

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Nov 13, 2013.

This issue was updated by revision 45cd62c88534 .

List keywords now work with UTF-8. Get File should also work now; the problem was that the name of the test file was in NFD form (the default on OS X).
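To see why the normalization form matters, here is a small Python sketch using the standard `unicodedata` module. The file name is an illustrative assumption. The same visible name has different code points, and therefore different bytes, in NFC and NFD, so a byte-wise file name comparison fails:

```python
import unicodedata

name = "test\u00e4.txt"  # "testä.txt" with a precomposed "ä"

nfc = unicodedata.normalize("NFC", name)  # "ä" as one code point (U+00E4)
nfd = unicodedata.normalize("NFD", name)  # "a" followed by U+0308

same_text = nfc == nfd
same_bytes = nfc.encode("utf-8") == nfd.encode("utf-8")

# Normalizing both sides to one form makes the names comparable again.
matches_after_normalizing = unicodedata.normalize("NFC", nfd) == nfc
```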

Also reorganized tests for easier maintainability.

yanne commented 10 years ago

Original comment by pekka.klarck on Nov 13, 2013.

Anssi, is encoding support restricted to UTF-8 or is that the only encoding you have explicitly tested?

yanne commented 10 years ago

Original comment by piliszekm on Nov 14, 2013.

Get File looks better now but is still not OK :(. When requesting a file I receive the error: "FAIL : There were no source files matching ...."

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Nov 18, 2013.

Pekka: The implementation works for any encoding, so it's not restricted. The tests use UTF-8 as an example, but writing a couple of tests that use a different encoding probably would not hurt.

piliszekm: How are you using the keyword (what's the source file name and the encoding configured in SSHLibrary)? On which OS does this happen?

yanne commented 10 years ago

Original comment by piliszekm on Nov 19, 2013.

I'm connecting to an AIX box which uses iso-8859-2 encoding. Then I obtain the file name from that server using the Write and Read Until Prompt keywords. Something like:

    Write    grep .......
    ${x}    Read Until Prompt
    ${file}    Set Variable    ${x.splitlines()[:-1]}

${file} contains the file name with the whole path, and I'm sure this file exists.

Then I just use the Get File keyword:

    ssh.Get File    ${file}

I've just checked that it is the same if I hardcode the file name in the Get File keyword.

The failing files contain national characters like "ä".

Tests are executed from Win7 and from RedHat.

yanne commented 10 years ago

Original comment by jussi.ao...@gmail.com on Nov 20, 2013.

Which version of sshlibrary are you using?

Can you try with the current development version?

yanne commented 10 years ago

Original comment by piliszekm on Nov 20, 2013.

The same is happening on the newest devel version.

Test:

    * Settings *
    Library    SSHLibrary    WITH NAME    ssh

    * Test Cases *
    test
        ${IP}    Set Variable    ....
        ${USER}    Set Variable    ....
        ${PASS}    Set Variable    ...
        ${PROMPT}    Set Variable    ....
        ${ENCODING}    Set Variable    iso-8859-2
        Open Connection    ${IP}    prompt=${PROMPT}    encoding=${ENCODING}    timeout=20
        Login    ${USER}    ${PASS}
        Write    export PS1="testprompt$"
        Set Client Configuration    prompt=testprompt$    encoding=${ENCODING}
        Read Until Prompt
        Sleep    1s
        Read
        Write    ls /tmp/test*
        Read Until Prompt
        ssh.Get File    /tmp/test.txt
        ssh.Get File    /tmp/testä.txt

Console output: ....

    20131120 11:46:17.698 : INFO : 1: export PS1="testprompt$
    20131120 11:46:18.699 : INFO : Slept 1 second
    20131120 11:46:18.701 : INFO : " testprompt$
    20131120 11:46:18.818 : INFO : ls /tmp/test*
    20131120 11:46:18.828 : INFO : /tmp/test.txt /tmp/testä.txt testprompt$
    20131120 11:46:19.336 : INFO : [chan 2] Opened sftp connection (server version 3)
    20131120 11:46:21.702 : INFO : '/tmp/test.txt' -> 'D:\Dokumenty\Praca\Automation\Finland\test.txt'
    20131120 11:46:23.004 : FAIL : There were no source files matching '/tmp/testä.txt'.
    Ending test: Test.test

yanne commented 10 years ago

Original comment by piliszekm on Nov 20, 2013.

By newest I mean 2414e8551934

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Nov 20, 2013.

piliszekm: Yes, succeeded in reproducing this. Fixing.

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Nov 21, 2013.

This issue was updated by revision b890cb86655d .

All inputs are now encoded. File names should now also be handled correctly, but this still needs heavy testing and more tests.

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Nov 25, 2013.

This issue was updated by revision d2e42d5ff95c .

Creating file and directory paths on the remote should now handle the encoding correctly.

yanne commented 10 years ago

Original comment by anssi.sy...@eficode.com on Nov 25, 2013.

piliszekm: I ran the tests in a latin2 environment (with the SSHLibrary encoding defined as 'latin2') and they seem to pass now. Do you get any errors?

yanne commented 10 years ago

Original comment by piliszekm on Nov 25, 2013.

nope, I think it is fine now :).

yanne / api-test

Support for configuring character encoding #60