decalage2 / oletools

oletools - python tools to analyze MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, for malware analysis, forensics and debugging.
http://www.decalage.info/python/oletools

Macros exist according to olevba but extract_all_macros returns an empty list (0.53.1) #501

Open clawfrank opened 4 years ago

clawfrank commented 4 years ago

This is probably just a usage question.

I inherited a spreadsheet with a bunch of code in it. I am trying to extract it so I can put it in version control. If I open the macro list for ThisWorkbook, I see a list of macros, and if I open Visual Basic from the Developer tab there are multiple entries under Microsoft Excel Objects, Forms, and Modules. I set up python 2.7 and oletools 0.53.1 (these are the only versions I have access to right now), and ran this script, which essentially amounts to:

import os
import shutil
from oletools.olevba3 import VBA_Parser

vba_parser = VBA_Parser(workbook_path)
has_macros = vba_parser.detect_vba_macros()  # returns True
all_macros = vba_parser.extract_all_macros()  # returns a list of length 0

Am I correct that if detect_vba_macros returns true, then extract_all_macros should return something? Am I doing something wrong here?

decalage2 commented 4 years ago

Well, I think this can happen because detect_vba_macros() checks whether there is a VBA project present in the file. However, that VBA project may have no VBA source code modules, and in that case extract_all_macros() returns nothing. But that is weird, given you see entries under Modules.
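The distinction described above can be sketched as a small helper. This is only an illustration: `macro_status` and `check_file` are hypothetical names, not part of oletools. The only oletools calls assumed are `VBA_Parser(path)`, `detect_vba_macros()`, `extract_all_macros()` and `close()`.

```python
def macro_status(parser):
    """Tell apart 'no VBA project' from 'project found, but no source extracted'.

    `parser` is any object exposing detect_vba_macros() and
    extract_all_macros(), such as an oletools VBA_Parser instance.
    """
    if not parser.detect_vba_macros():
        return "no VBA project"
    modules = parser.extract_all_macros()
    if not modules:
        return "VBA project present, but no source modules extracted"
    return "%d module(s) extracted" % len(modules)


def check_file(path):
    """Hypothetical wrapper: open a file with olevba and report its status."""
    # Imported here so macro_status() can be tried without oletools installed.
    from oletools.olevba import VBA_Parser
    parser = VBA_Parser(path)
    try:
        return macro_status(parser)
    finally:
        parser.close()
```

If the second message comes back for a workbook that visibly has modules in the VBA editor, that points to an extraction problem rather than an empty project.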

Is it possible to share the file, or at least the output of olevba on the file, with debug logging? For that, you may run olevba file.xls -l debug >output.txt

clawfrank commented 4 years ago

I ran it in debug mode and that helped me find a way to get it working. I replaced 'olevba3' with 'olevba' (and removed an extra argument from open()). I am not very familiar with python but I think I recall that 2.x and 3.x are significantly different, which could explain why there are 2 namespaces/implementations? Does this sound about right?
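Once extraction works, the original goal of putting the code under version control could look roughly like the sketch below. It assumes each entry returned by `extract_all_macros()` is a `(filename, stream_path, vba_filename, vba_code)` tuple. `export_modules` is a hypothetical helper, not an oletools function.

```python
import os


def export_modules(modules, out_dir):
    """Write each extracted VBA module to its own file under out_dir.

    `modules` is assumed to be a list of
    (filename, stream_path, vba_filename, vba_code) tuples.
    Returns the list of file paths written.
    """
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    written = []
    for _filename, _stream_path, vba_filename, vba_code in modules:
        # Keep the module name flat so it cannot escape out_dir.
        safe_name = vba_filename.replace("/", "_").replace("\\", "_")
        target = os.path.join(out_dir, safe_name)
        # The code may arrive as bytes or as text depending on the Python version.
        data = vba_code if isinstance(vba_code, bytes) else vba_code.encode("utf-8")
        with open(target, "wb") as handle:
            handle.write(data)
        written.append(target)
    return written
```

Writing one file per module keeps diffs readable when a single macro changes.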

decalage2 commented 4 years ago

Well, in recent versions of olevba (0.54) I managed to merge the python 2 and 3 versions in a single script, still called olevba. olevba3 is now just a redirection to olevba, for backward compatibility. But since you are using an older version (0.53?), it was still two different versions of the code, with slight differences. In any case, please upgrade to the latest version if possible, and tell me if you still experience the same behaviour. I'd like to know if it's a bug or not. Can you also send me the debug output so that I have a look? Thanks.
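For scripts that must run against both old and new installs, the choice of module could be made explicit, as in this rough sketch. It assumes 0.54 is the first merged release, as the comment above suggests, and that before it `olevba` was the Python 2 code and `olevba3` the Python 3 code. Both function names are hypothetical, and the version string is supplied by the caller rather than read from the package.

```python
import importlib
import re


def olevba_module_name(oletools_version, python_major):
    """Pick the module to import for a given oletools version and Python major."""
    # Compare only the first two numeric components, e.g. "0.53.1" -> (0, 53).
    parts = tuple(int(p) for p in re.findall(r"\d+", oletools_version)[:2])
    merged = parts >= (0, 54)  # assumption: merged code base starts at 0.54
    if merged or python_major == 2:
        return "oletools.olevba"
    return "oletools.olevba3"


def load_vba_parser(oletools_version, python_major):
    """Import the chosen module and return its VBA_Parser class."""
    module = importlib.import_module(olevba_module_name(oletools_version, python_major))
    return module.VBA_Parser
```

On the setup described in this thread (0.53.1 on Python 2.7), this selects `oletools.olevba`, which matches what ended up working.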

clawfrank commented 4 years ago

I'm working on a closed network so it's pretty hard to get things from one place to the other, in either direction. It's not feasible for me to upgrade the software. Wish I could, I'd be using a newer version of Python as well.

It is probably feasible to get you some of the output from 0.53.1, but given that the APIs are not merged in this version, and I'm on python 2, I doubt it will tell you anything surprising. output.txt doesn't have much content in it. It's about 6 lines long, lists the filename twice and ends with Type: OpenXML. The console output is much longer and ends with an attempt to open a stream from a closed OLE file.
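A short output.txt next to a long console log would be consistent with the debug messages going to stderr, which `>output.txt` does not capture. That is an assumption about this setup, not something confirmed in the thread. Under that assumption, both streams could be captured in one file with a sketch like this, where `run_with_combined_output` is a hypothetical helper:

```python
import subprocess


def run_with_combined_output(command, out_path):
    """Run a command and write both stdout and stderr to out_path.

    Returns the exit code of the command.
    """
    with open(out_path, "wb") as handle:
        # stderr=subprocess.STDOUT sends error/log output to the same file.
        return subprocess.call(command, stdout=handle, stderr=subprocess.STDOUT)


# Example call, using the command line suggested earlier in the thread:
# run_with_combined_output(["olevba", "file.xls", "-l", "debug"], "output.txt")
```

From a shell, appending `2>&1` after the redirection has the same effect.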

Do you want it anyway? If so, it'll have to wait for another day. I'm not sure where the tool is that does it and the person that would know is not here today.