decalage2 / oletools

oletools - python tools to analyze MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, for malware analysis, forensics and debugging.
http://www.decalage.info/python/oletools

ooxml.py - TypeError: reading file objects must return bytes objects - python3.7 #504

Open xambroz opened 5 years ago

xambroz commented 5 years ago

Affected tool: ooxml.py

Describe the bug: When running the tests under Python 3.7, ooxml.py fails to read oletools-0.54.2b/tests/test-data/msodde/harmless-clean-2003.xml.

File/Malware sample to reproduce the bug

How To Reproduce the bug

Expected behavior: The tests should pass without errors.

Console output / Screenshots

test_valid_xml (tests.msodde.test_basic.TestReturnCode)
check that xml leads to 0 exit status ... /home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/oletools/crypto.py:244: ResourceWarning: unclosed file <_io.BufferedReader name='/home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/tests/test-data/msodde/harmless-clean-2003.xml'>
  'encrypted.'.format(some_file, exc))
ResourceWarning: Enable tracemalloc to get the object allocation traceback
FAIL
test_file (tests.msodde.test_csv.TestCSV)
test simple small example file ... ok
test_regex (tests.msodde.test_csv.TestCSV)
check that regex captures other ways to include dde commands ... ok
test_texts (tests.msodde.test_csv.TestCSV)
write some sample texts to file, run those ... ok
test_matches (tests.msodde.test_blacklist.TestBlacklist)
check a long list of examples that should match the blacklist ... ok
test_nomatches (tests.msodde.test_blacklist.TestBlacklist)
check a long list of examples that should match the blacklist ... ok

======================================================================
ERROR: test_xml (tests.msodde.test_basic.TestDdeLinks)
check that dde in xml from word / excel is found
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/tests/msodde/test_basic.py", line 164, in test_xml
    field_filter_mode=msodde.FIELD_FILTER_BLACKLIST)
  File "/home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/oletools/msodde.py", line 974, in process_maybe_encrypted
    result = process_file(filepath, **kwargs)
  File "/home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/oletools/msodde.py", line 941, in process_file
    return process_excel_xml(filepath)
  File "/home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/oletools/msodde.py", line 892, in process_excel_xml
    for _, elem, _ in parser.iter_xml():
  File "/home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/oletools/ooxml.py", line 502, in iter_xml
    for event, elem in ET.iterparse(handle, events):
  File "src/lxml/iterparse.pxi", line 209, in lxml.etree.iterparse.__next__
  File "src/lxml/iterparse.pxi", line 194, in lxml.etree.iterparse.__next__
  File "src/lxml/iterparse.pxi", line 222, in lxml.etree.iterparse._read_more_events
TypeError: reading file objects must return bytes objects

======================================================================
FAIL: test_valid_xml (tests.msodde.test_basic.TestReturnCode)
check that xml leads to 0 exit status
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/tests/msodde/test_basic.py", line 48, in test_valid_xml
    self.do_test_validity(join(BASE_DIR, 'msodde', filename))
  File "/home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/tests/msodde/test_basic.py", line 104, in do_test_validity
    .format(found_error, filename))
AssertionError: Unexpected error reading file objects must return bytes objects from msodde for /home/mambroz/rpmbuild/BUILD/oletools-0.54.2b/tests/test-data/msodde/harmless-clean-2003.xml

----------------------------------------------------------------------
Ran 61 tests in 49.901s

FAILED (failures=1, errors=1)
Test failed: <unittest.runner.TextTestResult run=61 errors=1 failures=1>
error: Test failed: <unittest.runner.TextTestResult run=61 errors=1 failures=1>
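
For context on the TypeError in the traceback above: lxml's iterparse raises "reading file objects must return bytes objects" when the handle's read() returns something other than bytes, which is what a file opened in text mode does on Python 3. Whether that is what happens inside ooxml.py here is an assumption and is not confirmed in this report. The sketch below uses only the standard library and a hypothetical helper, `reads_bytes`, which is not part of oletools or lxml, to show one way to check a handle before passing it to a parser.

```python
import io


def reads_bytes(handle):
    """Return True if handle.read() yields bytes (binary mode), else False.

    Hypothetical helper for illustration; not part of oletools or lxml.
    """
    # A zero-length read consumes no data but still reveals the return type:
    # b'' for binary handles, '' for text handles.
    return isinstance(handle.read(0), bytes)


# In-memory stand-ins for a file opened with mode 'rb' and mode 'r'.
binary_handle = io.BytesIO(b"<root><child/></root>")
text_handle = io.StringIO("<root><child/></root>")

print(reads_bytes(binary_handle))  # True
print(reads_bytes(text_handle))    # False
```

With a real file, a handle from `open(path, 'rb')` passes this check, while one from `open(path, 'r')` does not.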

Version information:

xambroz commented 5 years ago

This could be related to https://github.com/decalage2/oletools/pull/483

decalage2 commented 5 years ago

Hi @xambroz, I merged #483, so could you please check whether you get the same error with the latest dev version 0.55? I cannot reproduce the bug so far on my Windows machine; those tests run fine there.