HeinleinSupport / olefy

olefy - oletools verify over TCP socket
Apache License 2.0

UnicodeDecodeError: 'ascii' codec can't decode #9

toto4ds closed 4 years ago

toto4ds commented 4 years ago

Hi,

I think the problem is with Russian characters, because "olevba3 -a -j -l error" works fine:

{ "vba_filename": "Лист3.cls", "subfilename": "0L7RgdGC0YPQv9C7Li54bHM=X=", "ole_stream": "_VBA_PROJECT_CUR/VBA/Лист3", "code": null }

In olefy we get:

rspamd python3[1123]: olefy ERROR default_exception_handler Fatal error: protocol.eof_received() call failed.
rspamd python3[1123]: protocol: <main.AIO object at 0x7f0468d61cf8>
rspamd python3[1123]: transport: <_SelectorSocketTransport fd=7 read=polling write=<idle, bufsize=0>>
rspamd python3[1123]: Traceback (most recent call last):
rspamd python3[1123]: File "/usr/lib/python3.7/asyncio/selector_events.py", line 823, in _read_ready__on_eof
rspamd python3[1123]: keep_open = self._protocol.eof_received()
rspamd python3[1123]: File "/usr/local/bin/olefy.py", line 172, in eof_received
rspamd python3[1123]: out = oletools(self.extra, tmp_file_name, lid)
rspamd python3[1123]: File "/usr/local/bin/olefy.py", line 104, in oletools
rspamd python3[1123]: out = bytes(out.decode("ascii").replace(' ', ' ').replace('\t', '').replace('\n', ''), encoding="ascii")
rspamd python3[1123]: UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 1499: ordinal not in range(128)
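The byte 0xd0 in the error message is consistent with Cyrillic text encoded as UTF-8, where letters such as "Л" start with that byte. A minimal sketch that reproduces the same kind of failure outside olefy is shown here; the `sample` bytes and the `try_decode` helper are made up for illustration and are not part of olefy or oletools.

```python
# Hypothetical stand-in for the JSON bytes produced by olevba3.
sample = '{"vba_filename": "Лист3.cls"}'.encode("utf-8")


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte that could not be decoded.
        bad_byte = hex(data[exc.start])
        return f"{type(exc).__name__}: byte {bad_byte} at position {exc.start}"


ascii_result = try_decode(sample, "ascii")   # fails on the first Cyrillic byte
utf8_result = try_decode(sample, "utf-8")    # decodes cleanly

print(ascii_result)
print(utf8_result)
```

The ASCII codec only accepts bytes below 0x80, so any non-ASCII file or stream name in the olevba output is enough to trigger the exception.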

thx

mmgit commented 4 years ago

seems to be this line:

out = bytes(out.decode("ascii").replace(' ', ' ').replace('\t', '').replace('\n', ''), encoding="ascii")

Is it safe to use "utf-8" here?

I tested it with an .XLS file that uses non-ASCII (UTF-8) characters in macro names. That worked.
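As a rough illustration of the proposed change, the sketch below performs the same kind of cleanup with UTF-8 in place of ASCII. The function name `normalize_output` is hypothetical, and only the tab and newline removal from the quoted line is reproduced; this is not the actual olefy code.

```python
def normalize_output(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Strip tabs and newlines from tool output, decoding and re-encoding as UTF-8.

    Hypothetical helper for illustration; it assumes the tool emits valid UTF-8.
    """
    text = raw.decode(encoding)
    text = text.replace("\t", "").replace("\n", "")
    return text.encode(encoding)


# Made-up input resembling pretty-printed JSON with a Cyrillic file name.
pretty = '{\n\t"vba_filename": "Лист3.cls"\n}'.encode("utf-8")
compact = normalize_output(pretty)
print(compact.decode("utf-8"))
```

UTF-8 is a superset of ASCII, so output that decoded correctly before should be unaffected by the switch. Input that is not valid UTF-8 would still raise UnicodeDecodeError, so a caller may want to handle that case separately.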

c-rosenberg commented 4 years ago

Thanks for your investigation. I've changed all decodes to utf-8 in our test system.