DissectMalware / XLMMacroDeobfuscator

Extract and Deobfuscate XLM macros (a.k.a Excel 4.0 Macros)
Apache License 2.0

XLRD2 has issues with malformed ColRelU Entries #15

Closed michaelweber closed 4 years ago

michaelweber commented 4 years ago

Legacy XLS (BIFF8) files can have at most 256 columns: the last column is IV, and if you try to reference IX or beyond in an XLS2003 document, e.g. "=$IX$255", Excel throws an error. The BIFF8 format allows up to 14 bits for a column value though, so if you save a column number like 0x101, Excel drops the bits above 0xFF and uses the low byte instead. So 0x101 is in fact column 1. I created a document which does this for its FORMULA destinations, and while the Excel COM parser handles it, the XLRD2 parser crashes.
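To make the wrap-around concrete, here is a minimal sketch of the masking described above. It assumes the behavior reported in this issue (Excel keeps only the low 8 bits of the 14-bit column field); the names `COL_FIELD_MASK` and `effective_column` are hypothetical and are not part of XLRD2 or XLMMacroDeobfuscator.

```python
# Sketch only: models the behavior described in this issue, not a library API.
COL_FIELD_MASK = 0x3FFF  # the column field is 14 bits wide in BIFF8
XLS_COL_MASK = 0xFF      # XLS sheets only have 256 columns (indices 0..255)


def effective_column(raw_col: int) -> int:
    """Return the zero-based column Excel is reported to use for a stored value."""
    stored = raw_col & COL_FIELD_MASK  # keep only the 14-bit field
    return stored & XLS_COL_MASK       # drop the bits above 0xFF


if __name__ == "__main__":
    for raw in (0x001, 0x101, 0x201):
        print(hex(raw), "->", effective_column(raw))
```

A parser that skips this normalization would treat 0x001, 0x101 and 0x201 as three different columns, even though Excel resolves all of them to the same one.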

Ex:

Path\To\Python37\Scripts>xlmdeobfuscator.exe -f colBitStuffingDeob.xls -n -2
[Loading Cells]
auto_open: auto_open->Sheet2!$A$1
Traceback (most recent call last):
  File "Path\To\python37\lib\site-packages\XLMMacroDeobfuscator\xls_wrapper_2.py", line 58, in load_cells
    for xls_cell in xls_sheet.get_used_cells():
AttributeError: 'Sheet' object has no attribute 'get_used_cells'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Path\To\Python37\Scripts\xlmdeobfuscator-script.py", line 11, in <module>
    load_entry_point('XLMMacroDeobfuscator==0.1.2', 'console_scripts', 'xlmdeobfuscator')()
  File "Path\To\python37\lib\site-packages\XLMMacroDeobfuscator\deobfuscator.py", line 858, in main
    process_file(**vars(args))
  File "Path\To\python37\lib\site-packages\XLMMacroDeobfuscator\deobfuscator.py", line 821, in process_file
    for step in interpreter.deobfuscate_macro(not kwargs.get("noninteractive")):
  File "Path\To\python37\lib\site-packages\XLMMacroDeobfuscator\deobfuscator.py", line 637, in deobfuscate_macro
    macros = self.xlm_wrapper.get_macrosheets()
  File "Path\To\python37\lib\site-packages\XLMMacroDeobfuscator\xls_wrapper_2.py", line 79, in get_macrosheets
    self.load_cells(macrosheet, sheet)
  File "Path\To\python37\lib\site-packages\XLMMacroDeobfuscator\xls_wrapper_2.py", line 70, in load_cells
    print('CELL(Formula): ' + str(error.args[2]))
IndexError: tuple index out of range
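The second traceback comes from the error handler itself: it indexes `error.args[2]`, but the `AttributeError` raised first carries only one argument. A minimal sketch of a more defensive way to report such an error follows; `describe_error` is a hypothetical helper, not the project's actual code.

```python
# Sketch only: a defensive replacement for indexing error.args directly.
def describe_error(error: BaseException, index: int = 2) -> str:
    """Return args[index] as text if present, else fall back to repr(error)."""
    args = getattr(error, "args", ())
    if len(args) > index:
        return str(args[index])
    return repr(error)


if __name__ == "__main__":
    try:
        object().get_used_cells()  # raises AttributeError with a single argument
    except AttributeError as error:
        print("CELL(Formula): " + describe_error(error))
```

With a fallback like this, the original `AttributeError` would be reported instead of being masked by an `IndexError`.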

colBitStuffingDeob.xls.zip

doomedraven commented 4 years ago

I just tested the latest version from the repo and I don't get this error with your file. Maybe try upgrading?

DissectMalware commented 4 years ago

I can't reproduce the error either. However, the program should still run into a problem, since it treats them as separate columns. I will take a look.

michaelweber commented 4 years ago

GAH - I uploaded the wrong file. That would be why. I uploaded a version after I'd run my deobfuscator on it 🤦. Here's the actual sample.

colBitStuffingExample.zip

michaelweber commented 4 years ago

Though I'm getting the error for both versions of the file after I update using:

pip install -U https://github.com/DissectMalware/XLMMacroDeobfuscator/archive/master.zip

Probably have some sort of local environment issue on my end - not sure exactly what's going on - will need to dig into it a bit.

michaelweber commented 4 years ago

Yeah - I did a fresh install on a Linux machine and this seems to work as intended:

administrator@devmachine:/usr/local/lib/python3.5/dist-packages/XLMMacroDeobfuscator$ python3 deobfuscator.py -f colBitStuffing.xls -n -2 -x | grep FORMULA
CELL:A245      , =FORMULA($A$244,$B$5),
CELL:A99       , =FORMULA($A$98,$B$2),
CELL:A617      , =FORMULA($A$616,$B$14),
CELL:A679      , =FORMULA($A$678,$B$17),
CELL:A300      , =FORMULA($A$299,$B$7),
CELL:A1156     , =FORMULA($A$1155,$C$4),
CELL:A670      , =FORMULA($A$669,$B$16),
CELL:A556      , =FORMULA($A$555,$B$12),
CELL:A645      , =FORMULA($A$644,$B$15),
CELL:A1051     , =FORMULA($A$1050,$C$3),
CELL:A392      , =FORMULA($A$391,$B$9),
CELL:A841      , =FORMULA($A$840,$C$1),
CELL:A1209     , =FORMULA($A$1208,$C$5),
CELL:A273      , =FORMULA($A$272,$B$6),
CELL:A608      , =FORMULA($A$607,$B$13),
CELL:A43       , =FORMULA($A$42,$B$1),
CELL:A1214     , =FORMULA($A$1213,$C$6),
CELL:A457      , =FORMULA($A$456,$B$10),
CELL:A221      , =FORMULA($A$220,$B$4),
CELL:A499      , =FORMULA($A$498,$B$11),
CELL:A946      , =FORMULA($A$945,$C$2),
CELL:A352      , =FORMULA($A$351,$B$8),
CELL:A157      , =FORMULA($A$156,$B$3),

In all of these cases the second argument actually points at a much later column than B or C, but Excel treats it as a normal reference to B or C. It looks like this is fixed, or might never have been an issue in the first place. Sorry for the headaches!
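As a rough illustration of why the output shows $B and $C, the sketch below converts a zero-based column index to Excel letters, with and without the low-byte masking described in this issue. The raw values 0x101 and 0x102 are illustrative only (the thread does not show the values stored in the sample), and `column_letters` and `excel_view` are hypothetical helper names.

```python
# Sketch only: illustrative values, not taken from the attached sample.
def column_letters(index: int) -> str:
    """Convert a zero-based column index to Excel letters (0 -> A, 255 -> IV)."""
    letters = ""
    n = index + 1  # switch to one-based for the base-26 conversion
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def excel_view(raw_col: int) -> str:
    """Letters for the column after dropping the bits above 0xFF."""
    return column_letters(raw_col & 0xFF)


if __name__ == "__main__":
    for raw in (0x101, 0x102):
        print(hex(raw), "unmasked:", column_letters(raw), "masked:", excel_view(raw))
```

Without masking, 0x101 would be read as column IX, which does not exist in an XLS sheet; with masking it resolves to B.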