DissectMalware / XLMMacroDeobfuscator

Extract and Deobfuscate XLM macros (a.k.a Excel 4.0 Macros)
Apache License 2.0

Unexpected Token #108

Closed jhhcs closed 2 years ago

jhhcs commented 2 years ago

This might be related to #101, and could be a duplicate of #107 or #106. The following sample causes an unexpected token error with version 0.2.5 of XLMMacroDeobfuscator:

https://bazaar.abuse.ch/sample/218f8fb236a36cf6cfdd9b0f9544f98580ead944b1811744721eb22a7d1c9529/

baderj commented 2 years ago

I think this could be related to #107. In my opinion, the problem is here:

=FORMULA(С1!C15,С2!F3)
         ^
Expected one of: 
        * ROW
        * /\$?([a-qs-z][a-z]?)\$?\d+\b|\$?(r[a-bd-z]?)\$?\d+\b(?!C)/i
        * STRING
        * BOOLEAN
        * LIST_SEPARATOR
        * ERROR
        * EXCLAMATION
        * LBRACE
        * NAME
        * QUOTE
        * NUMBER
        * L_PRA
        * R_PRA

C1 is a valid R1C1 cell reference, so I think it is not valid to use it as a sheet name without quotes (it should be FORMULA('С1'!C15,'С2'!F3)). At least when entering such formulas, the quotes are added automatically.
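To make the quoting point concrete, here is a minimal Python sketch (not code from XLMMacroDeobfuscator) of a check that wraps a sheet name in single quotes when it could be read as a cell reference. The helper names `looks_like_cell_ref` and `quote_sheet_name` and both regular expressions are assumptions for illustration; the patterns are deliberately simplified and do not reproduce Excel's full naming rules.

```python
import re

# Simplified patterns (assumptions, not Excel's complete rules):
# A1 style: one to three column letters followed by a row number, e.g. C1, Fe2.
A1_REF = re.compile(r"^\$?[A-Za-z]{1,3}\$?\d+$")
# R1C1 style: optional row part and optional column part, e.g. R1C1, C1, R, C.
R1C1_REF = re.compile(r"^(R\d*)?(C\d*)?$", re.IGNORECASE)


def looks_like_cell_ref(name):
    """Return True if the sheet name could be mistaken for a cell reference."""
    if not name:
        return False
    return bool(A1_REF.match(name)) or bool(R1C1_REF.match(name))


def quote_sheet_name(name):
    """Wrap ambiguous sheet names in single quotes, doubling embedded quotes."""
    if looks_like_cell_ref(name):
        return "'" + name.replace("'", "''") + "'"
    return name


if __name__ == "__main__":
    for sheet in ("C1", "Fe1", "Rvfs1"):
        print(sheet, "->", quote_sheet_name(sheet) + "!C15")
```

With these simplified patterns, `C1` and `Fe1` get quoted while `Rvfs1` is left alone, because four letters cannot form a column name.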

But QBot also uses multiple FORMULA statements strung together:

=FORMULA()=FORMULA(Fe1!E14, Fe2!I4)=FORMULA()=FORMULA(Fe2!E17, Fe1!B4)=FORMULA(Vvfrb ....

So maybe Excel only uses the rightmost FORMULA, and it does not matter that the rest contain invalid sheet names.
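For looking at such a chain piece by piece, a rough Python sketch like the one below splits the text at every `=` that sits outside parentheses and string literals. The function name `split_chained_formulas` is hypothetical and this is not how XLMMacroDeobfuscator parses formulas. Note that a top-level `=` is also Excel's comparison operator, so the split is only an inspection aid and says nothing about which FORMULA call Excel actually evaluates.

```python
def split_chained_formulas(text):
    """Split formula text at '=' characters found at parenthesis depth 0
    and outside double-quoted string literals."""
    parts = []
    current = []
    depth = 0
    in_string = False
    for ch in text:
        if ch == '"':
            # A doubled quote ("") toggles twice, so the state stays correct.
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == "=" and depth == 0 and current:
                # Close the previous segment before starting a new one.
                parts.append("".join(current))
                current = []
        current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


if __name__ == "__main__":
    chain = "=FORMULA()=FORMULA(Fe1!E14, Fe2!I4)=FORMULA()"
    for segment in split_chained_formulas(chain):
        print(segment)
```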

kirk-sayre-work commented 2 years ago

Here's another sample that fails on the same line of code with an unexpected token error:

https://bazaar.abuse.ch/sample/ffa61432d63f42221525dfbe252af32ec4b697e41b23cfa7ff23c97589b0463d/

Both of these samples are recent Emotet samples.

randubin commented 2 years ago

Quakbot, similar problem I think: https://bazaar.abuse.ch/sample/fd2715285ac147b7dd78ba66a184d1016af1d54f1be7a789f231a69143298840/

CELL:D7 , FullEvaluation , "True"
CELL:D10 , FullEvaluation , CALL("Kernel32","CreateDirectoryA","JCJ","C:\Bduc",0)
Error [deobfuscator.py:2586 parse_tree = self.xlm_parser.parse(formula)]: Unexpected token Token('NAME', 'JJCCBB') at line 1, column 37.
Expected one of:

pyvain commented 2 years ago

Similar issue with this sample: https://www.joesandbox.com/analysis/608746/0/html

Execution trace

CELL:H10       , FullEvaluation      , False
CELL:H13       , FullEvaluation      , CALL("Kernel32","CreateDirectoryA","JCJ","C:\Uduw",0)
Error [deobfuscator.py:2587 parse_tree = self.xlm_parser.parse(formula)]: Unexpected token Token('NAME', 'JJCCBB') at line 1, column 37.
Expected one of: 
    * MULTIOP
    * ADDITIVEOP
    * CMPOP
    * CONCATOP
    * R_PRA
    * L_PRA
    * LIST_SEPARATOR
Previous tokens: [Token('STRING', '"URLDownloadToFileA,"')]

Suspected issue

One of the cell values (namely cell Vehsrg!I16), which is concatenated to create part of the formula, starts and ends with quotes, as shown in this screenshot after execution in Excel: [screenshot: capture2]

The value of this cell is itself the result of the execution of another formula.

But for some reason I can't figure out, the value seems to be unwrapped and the quotes are lost when building the final formula: [screenshot: capture]
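To illustrate the suspected effect, here is a small Python sketch that is not taken from the tool. It joins formula fragments once verbatim and once after stripping outer quotes from each fragment. The split into three fragments is hypothetical (the real cell contents are only visible in the screenshots), and `strip_outer_quotes` and `build_formula` are made-up names; the target string is the CALL line from the trace above.

```python
def strip_outer_quotes(value):
    """Remove one pair of surrounding double quotes, if present."""
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        return value[1:-1]
    return value


def build_formula(fragments, unwrap=False):
    """Concatenate fragments, optionally unwrapping quoted values first."""
    if unwrap:
        fragments = [strip_outer_quotes(f) for f in fragments]
    return "".join(fragments)


# Hypothetical fragments: the middle one starts and ends with quotes.
fragments = [
    '=CALL("Kernel32",',
    '"CreateDirectoryA"',
    r',"JCJ","C:\Uduw",0)',
]

if __name__ == "__main__":
    # Verbatim join keeps the quotes, so the argument stays a string literal.
    print(build_formula(fragments))
    # Unwrapping drops them, leaving a bare name where a string was expected.
    print(build_formula(fragments, unwrap=True))
```

If something like the second variant happens inside the deobfuscator, a bare token such as `JJCCBB` would reach the parser as a NAME instead of a STRING, which would be consistent with the error message.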

Thanks for your awesome tool :slightly_smiling_face:

baderj commented 2 years ago

The original issue with Unexpected Token is not fixed. Here is the output for the sample posted by jhhcs:

python3 deobfuscator.py --file 218f8fb236a36cf6cfdd9b0f9544f98580ead944b1811744721eb22a7d1c9529.xlsm
XLMMacroDeobfuscator: pywin32 is not installed (only is required if you want to use MS Excel)

          _        _______
|\     /|( \      (       )
( \   / )| (      | () () |
 \ (_) / | |      | || || |
  ) _ (  | |      | |(_)| |
 / ( ) \ | |      | |   | |
( /   \ )| (____/\| )   ( |
|/     \|(_______/|/     \|
   ______   _______  _______  ______   _______           _______  _______  _______ _________ _______  _______
  (  __  \ (  ____ \(  ___  )(  ___ \ (  ____ \|\     /|(  ____ \(  ____ \(  ___  )\__   __/(  ___  )(  ____ )
  | (  \  )| (    \/| (   ) || (   ) )| (    \/| )   ( || (    \/| (    \/| (   ) |   ) (   | (   ) || (    )|
  | |   ) || (__    | |   | || (__/ / | (__    | |   | || (_____ | |      | (___) |   | |   | |   | || (____)|
  | |   | ||  __)   | |   | ||  __ (  |  __)   | |   | |(_____  )| |      |  ___  |   | |   | |   | ||     __)
  | |   ) || (      | |   | || (  \ \ | (      | |   | |      ) || |      | (   ) |   | |   | |   | || (\ (
  | (__/  )| (____/\| (___) || )___) )| )      | (___) |/\____) || (____/\| )   ( |   | |   | (___) || ) \ \__
  (______/ (_______/(_______)|/ \___/ |/       (_______)\_______)(_______/|/     \|   )_(   (_______)|/   \__/

XLMMacroDeobfuscator(v0.2.5) - https://github.com/DissectMalware/XLMMacroDeobfuscator

File: 218f8fb236a36cf6cfdd9b0f9544f98580ead944b1811744721eb22a7d1c9529.xlsm

Unencrypted document or unsupported file format
Unencrypted xlsm file

[Loading Cells]
auto_open: auto_open->PFEV!$B$1
[Starting Deobfuscation]
Error [deobfuscator.py:2586 parse_tree = self.xlm_parser.parse(formula)]: Unexpected token Token('__ANON_0', 'С1!C15,С2!F3)=FORMULA(Rvfs1!P22&Rvfs1!H9&Rvfs1!L2&Rvfs1!B15&Rvfs1!B15&Rvfs2!C7&Rvfs2!D11&Rvfs2!E3&С2!F3&Rvfs1!L2&Rvfs2!G5&Rvfs2!I9&Rvfs2!F19&Rvfs3!N14&Rvfs3!E16,B7)=FORMULA(Rvfs1!P22&Rvfs1!J11&Rvfs1!B18&Rvfs1!P11&"UDYQ1"&Rvfs3!N4&Rvfs1!H9&Rvfs1!L2&Rvfs1!B15&Rvfs1!B15&Rvfs2!C7&Rvfs2!D11&Rvfs2!E3&С2!F3&Rvfs1!L2&Rvfs2!G5&Rvfs2!I9&Rvfs2!G21&Rvfs3!N14&Rvfs3!E16&Rvfs1!P13,B9)=FORMULA(Rvfs1!P22&Rvfs1!J11&Rvfs1!B18&Rvfs1!P11&"UDYQ2"&Rvfs3!N4&Rvfs1!H9&Rvfs1!L2&Rvfs1!B15&Rvfs1!B15&Rvfs2!C7&Rvfs2!D11&Rvfs2!E3&С2!F3&Rvfs1!L2&Rvfs2!G5&Rvfs2!I9&Rvfs2!H19&Rvfs3!N14&Rvfs3!E16&Rvfs1!P13,B11)=FORMULA(Rvfs1!P22&Rvfs1!J11&Rvfs1!B18&Rvfs1!P11&"UDYQ3"&Rvfs3!N4&Rvfs1!H9&Rvfs1!L2&Rvfs1!B15&Rvfs1!B15&Rvfs2!C7&Rvfs2!D11&Rvfs2!E3&С2!F3&Rvfs1!L2&Rvfs2!G5&Rvfs2!I9&Rvfs2!I21&Rvfs3!N14&Rvfs3!E16&Rvfs1!P13,B13)=FORMULA(Rvfs1!P22&Rvfs1!J11&Rvfs1!B18&Rvfs1!P11&"UDYQ4"&Rvfs3!N4&Rvfs1!H9&Rvfs1!L2&Rvfs1!B15&Rvfs1!B15&Rvfs2!C7&Rvfs2!D11&Rvfs2!E3&С2!F3&Rvfs1!L2&Rvfs2!G5&Rvfs2!I9&Rvfs2!J19&Rvfs3!N14&Rvfs3!E16&Rvfs1!P13,B15)=FORMULA(Rvfs1!P22&Rvfs1!J11&Rvfs1!B18&Rvfs1!P11&"UDYQ5"&Rvfs3!N4&Rvfs1!H9&Rvfs1!L2&Rvfs1!B15&Rvfs1!B15&Rvfs2!C7&Rvfs2!D11&Rvfs2!E3&С2!F3&Rvfs1!L2&Rvfs2!G5&Rvfs2!I9&Rvfs2!K21&Rvfs3!N14&Rvfs3!E16&Rvfs1!P13,B17)=FORMULA(Rvfs1!P22&Rvfs1!J11&Rvfs1!B18&Rvfs1!P11&"UDYQ6"&Rvfs3!N4&Rvfs1!H9&Rvfs1!L2&Rvfs1!B15&Rvfs1!B15&Rvfs2!C7&Rvfs2!D11&Rvfs2!E3&С2!F3&Rvfs1!L2&Rvfs2!G5&Rvfs2!I9&Rvfs2!L19&Rvfs3!N14&Rvfs3!E16&Rvfs1!P13,B19)=FORMULA(Rvfs1!P22&Rvfs1!J11&Rvfs1!B18&Rvfs1!P11&"UDYQ7"&Rvfs3!N4&Rvfs1!H9&Rvfs1!B15&Rvfs1!I17&Rvfs1!I3&Rvfs1!H13&Rvfs1!P11&Rvfs1!K9&Rvfs1!P13&Rvfs1!P7&Rvfs1!P13,B21)=FORMULA(Rvfs1!P22&Rvfs1!H13&Rvfs1!N4&Rvfs1!H13&Rvfs1!H9&Rvfs1!P11&Rvfs1!P15&Rvfs1!H9&Rvfs1!P20&Rvfs3!D3&Rvfs3!J6&Rvfs3!F11&Rvfs3!P8&Rvfs3!B5&Rvfs1!P15&Rvfs1!P13,B23)=FORMULA(Rvfs1!P22&Rvfs1!F4&Rvfs1!H13&Rvfs1!E6&Rvfs1!E11&Rvfs1!F4&Rvfs1!K23&Rvfs1!P11&Rvfs1!P13,B31)') at line 1, column 10.
Expected one of: 
    * L_PRA
    * STRING
    * R_PRA
    * BOOLEAN
    * LBRACE
    * NUMBER
    * NAME
    * ROW
    * LIST_SEPARATOR
    * ERROR
    * /\$?([a-qs-z][a-z]?)\$?\d+\b|\$?(r[a-bd-z]?)\$?\d+\b(?!C)/i
    * EXCLAMATION
    * QUOTE
Previous tokens: [Token('L_PRA', '(')]

Files:

[END of Deobfuscation]

DissectMalware commented 2 years ago

I can confirm that the issue of using a column name as a sheet name still remains for the xlsm format.

This issue is fixed for the xlsb format (https://github.com/DissectMalware/XLMMacroDeobfuscator/issues/107)

DissectMalware commented 2 years ago

Now it is fixed for the xlsm format as well (https://github.com/DissectMalware/XLMMacroDeobfuscator/commit/90a58f4a88676ee75db1581394b9503cd4f65e75)


Please update xlmdeobfuscator from the repo.

baderj commented 2 years ago

Can confirm that commit 90a58f4a88676ee75db1581394b9503cd4f65e75 works!