greenbone / gvm-tools

Remote control your Greenbone Community Edition or Greenbone Enterprise Appliance
https://greenbone.github.io/gvm-tools/
GNU General Public License v3.0

UnicodeDecodeError: 'utf-8' codec can't decode #78

Closed: asmaack closed this issue 6 years ago

asmaack commented 6 years ago

When downloading an OpenVAS report in raw XML format over a gvm-cli socket connection with the get_reports command, I get a decoding error. As the error message indicates, this looks like an encoding mismatch. See:

$ gvm-cli socket -c --xml "<get_reports report_id=\"795ecf96-7957-4553-a203-30affa1e34e0\" format_id=\"a994b278-1f62-11e1-96ac-406186ea4fc5\"/>"
Traceback (most recent call last):
  File "/usr/local/bin/gvm-cli", line 11, in <module>
    load_entry_point('gvm-tools==1.4.1', 'console_scripts', 'gvm-cli')()
  File "/usr/local/lib/python3.6/dist-packages/gmp/clients/gvm_cli.py", line 213, in main
    result = gvm.read()
  File "/usr/local/lib/python3.6/dist-packages/gmp/gvm_connection.py", line 103, in read
    response = self.readAll()
  File "/usr/local/lib/python3.6/dist-packages/gmp/gvm_connection.py", line 973, in readAll
    response += data.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc2 in position 1023: unexpected end of data

I looked at gvm_connection.py, line 973, which reads: response += data.decode()

I did some hacking around and then changed line 973 to: response += data.decode('latin-1')

Then everything worked and the download finished successfully. It looks to me like there is an encoding/decoding mismatch here. My environment is all standard UTF-8 (LANG=en_US.UTF-8). There seem to be no encoding settings available for openvassd, in gvm-cli, or in ~/.config/gvm-tools.conf. Could it be that data/reports from openvas are returned in mixed encodings, some in utf-8, some in latin-1?
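One possible reading of the traceback, not confirmed anywhere in this thread: byte 0xc2 is the lead byte of a two-byte UTF-8 sequence, and position 1023 is the last byte of a 1024-byte chunk, so the failure may come from a multi-byte character being split across two reads rather than from latin-1 data. That would also explain why latin-1 appears to work: it maps every byte to some character and never raises, at the cost of garbling non-ASCII text. The sketch below assumes a 1024-byte read size; `chunks`, `read_all_naive` and `read_all_incremental` are hypothetical helpers for illustration, not the gvm-tools code.

```python
import codecs

CHUNK_SIZE = 1024  # assumed read size, inferred from "position 1023" in the traceback


def chunks(payload: bytes, size: int = CHUNK_SIZE):
    """Yield the payload in fixed-size pieces, as a socket read loop might."""
    for start in range(0, len(payload), size):
        yield payload[start:start + size]


def read_all_naive(payload: bytes) -> str:
    """Decode each chunk on its own; fails if a character straddles a boundary."""
    response = ""
    for data in chunks(payload):
        response += data.decode()
    return response


def read_all_incremental(payload: bytes) -> str:
    """Keep incomplete byte sequences buffered between chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    response = ""
    for data in chunks(payload):
        response += decoder.decode(data)
    # final=True raises if the stream ends in the middle of a character
    response += decoder.decode(b"", final=True)
    return response


# 1023 ASCII bytes, then U+00A9 (bytes 0xc2 0xa9) split across the chunk boundary
payload = ("a" * 1023 + "\u00a9" + "tail").encode("utf-8")

try:
    read_all_naive(payload)
except UnicodeDecodeError as exc:
    # 'utf-8' codec can't decode byte 0xc2 in position 1023: unexpected end of data
    print(exc)

print(read_all_incremental(payload) == payload.decode("utf-8"))  # True
print(payload.decode("latin-1")[1023:1025])  # two garbled characters instead of one
```

If this is indeed the cause, collecting the raw bytes and decoding once at the end would be another way to avoid the split; the incremental decoder is shown only because it keeps the chunk-by-chunk structure of the original loop.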

I don't get this problem when viewing the same report/results in the gsad GUI.

gsa: (gsad --version) Greenbone Security Assistant 8.0+beta2 GIT revision d1a83ab88-master

gvm: (gvmd --version) Greenbone Vulnerability Manager 8.0+beta1 GIT revision 3691c0ad-master

openvas-scanner: (openvassd --version) OpenVAS Scanner 6.0+beta2 GIT revision a13b0f7-master

gvm-libs: ~/gvm-libs$ git log commit 58248fdd4752e6073ada8497996a29572b41b10b (HEAD -> master, origin/master, origin/HEAD) ...

gvm-tools: (gvm-cli --version) gvm-cli 1.4.1

Operating system: $ uname -a Linux 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Installation method / source: (packages, source installation) From source following INSTALL

Logfiles

/usr/local/var/log/gvm/gvmd.log:

md   main:WARNING:2018-09-14 10h14.06 UTC:11191: read_from_client_unix: failed to read from client: Connection reset by peer

asmaack commented 6 years ago

Any news on this? Is this a bug, and if not, is there a workaround?

wiegandm commented 6 years ago

I am not too familiar with the code, but I (and probably gvm-tools as well) would expect gvmd to send UTF-8 encoded XML. Can you pinpoint which part of the report contains the latin-1 encoded text? This could help identify how the text ended up there and where it should have been encoded.
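One way to pinpoint such text, assuming the raw response bytes can be captured before they are decoded (for example by saving them to a file), is to scan for byte sequences that are not valid UTF-8. The helper name `find_invalid_utf8` and the sample data are hypothetical; this is a sketch, not part of gvm-tools.

```python
def find_invalid_utf8(data: bytes, context: int = 20):
    """Return (offset, surrounding bytes) for each sequence that is not valid UTF-8."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice, so add pos
            bad = pos + exc.start
            problems.append((bad, data[max(0, bad - context):bad + context]))
            pos += exc.end  # resume after the offending bytes
    return problems


# Hypothetical sample: a latin-1 encoded e-acute (0xe9) inside otherwise ASCII XML
sample = b"<name>caf\xe9</name>"
for offset, snippet in find_invalid_utf8(sample):
    print(offset, snippet)
```

The surrounding bytes usually show which XML element the offending text belongs to, which is the information asked for above.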

timopollmeier commented 6 years ago

@asmaack, to give you some ideas to check what @wiegandm said:

asmaack commented 6 years ago

No, I'm not using any non-ascii chars in task or target names.

I'll try to reproduce the problem.

schoekek commented 6 years ago

I have the same problem in pyshell. Here is the fix, in file gvm_pyshell.py:

316c316
<         file = open(path, 'r', newline='').read()
---
>         file = open(path, 'r', newline='', encoding="utf-8").read()

Can someone please confirm this?
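For context on the patch above: in text mode, Python's `open()` normally falls back to the locale's preferred encoding when no `encoding` argument is given, so loading a UTF-8 script can fail under a non-UTF-8 locale. A minimal sketch, with a temporary file standing in for the script path and made-up file content; forcing `ascii` is only a way to simulate such a locale.

```python
import locale
import os
import tempfile

# Encoding that open() normally uses when no encoding argument is passed
print(locale.getpreferredencoding(False))

# A temporary file stands in for a UTF-8 encoded script (hypothetical content)
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write('name = "Pr\u00fcfung"\n'.encode("utf-8"))
    path = tmp.name

try:
    # Explicit encoding: the result no longer depends on LANG / LC_ALL
    with open(path, "r", newline="", encoding="utf-8") as handle:
        script = handle.read()

    # Simulate a non-UTF-8 locale by forcing ASCII
    try:
        with open(path, "r", newline="", encoding="ascii") as handle:
            handle.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True
finally:
    os.remove(path)

print(ascii_failed)  # True
```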

asmaack commented 6 years ago

I seem to have deleted the report in question :-( but I'll keep my eyes open in all new reports.

bjoernricks commented 6 years ago

Responses are expected to be UTF-8 encoded, and scripts are expected to be written in UTF-8 as well. Therefore I am closing this now. Feel free to reopen if you still have issues with encoding.