decalage2 / oletools

oletools - python tools to analyze MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, for malware analysis, forensics and debugging.
http://www.decalage.info/python/oletools

Make VBA Tools->References Information available #838

Open AndrewJLockhart opened 5 months ago

AndrewJLockhart commented 5 months ago

I am writing a command-line tool that uses olevba.py, and I need to dump the information shown under Tools->References. The current code reads this information but doesn't make it accessible.

decalage2 commented 5 months ago

Indeed olevba parses references (in VBA_Project.__init__) but does not store and expose the data through the API. For now the only way to get it is to use the option -l debug on the command line, and to filter lines mentioning "reference". For example:

$ olevba order_details_68671777.doc -l debug|grep -i "reference"
DEBUG    reference type = 0016
DEBUG    REFERENCE name: stdole
DEBUG    reference type = 000D
DEBUG    REFERENCE registered lib id: *\G{00020430-0000-0000-C000-000000000046}#2.0#0#C:\Windows\system32\stdole2.tlb#OLE Automation
DEBUG    reference type = 0016
DEBUG    REFERENCE name: Normal
DEBUG    reference type = 000E
DEBUG    REFERENCE project lib id absolute: *\CNormal
DEBUG    REFERENCE project lib id relative: *\CNormal
DEBUG    reference type = 0016
DEBUG    REFERENCE name: Office
DEBUG    reference type = 000D
DEBUG    REFERENCE registered lib id: *\G{2DF8D04C-5BFA-101B-BDE5-00AA0044DE52}#2.0#0#C:\Program Files\Common Files\Microsoft Shared\OFFICE11\MSO.DLL#Microsoft Office 11.0 Object Library
DEBUG    reference type = 0016
DEBUG    REFERENCE name: MSForms
DEBUG    reference type = 0033
DEBUG    REFERENCE original lib id: *\G{0D452EE1-E08F-101A-852E-02608C4D0BB4}#2.0#0#C:\WINDOWS\system32\FM20.DLL#Microsoft Forms 2.0 Object Library
DEBUG    reference type = 002F
DEBUG    REFERENCE control twiddled lib id: *\G{00000000-0000-0000-0000-000000000000}#0.0#0##
DEBUG    reference type = 000F
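
Until the data is exposed through the API, the debug lines above could be post-processed into structured records. The following is only a minimal sketch: it assumes the debug log keeps exactly the `DEBUG    REFERENCE <kind>: <value>` layout shown above (a log format, not a stable interface), and the function name `parse_reference_lines` is hypothetical, not part of oletools.

```python
import re

# Matches lines such as "DEBUG    REFERENCE name: stdole".
# The "reference type = ..." lines have no "REFERENCE <kind>:" part and are skipped.
REF_LINE = re.compile(r"^DEBUG\s+REFERENCE\s+(?P<kind>[^:]+):\s*(?P<value>.*)$")


def parse_reference_lines(lines):
    """Group olevba debug lines into one dict per reference.

    Hypothetical helper: a "REFERENCE name" line starts a new record, and
    every following "REFERENCE <kind>" line is attached to that record.
    """
    refs = []
    for line in lines:
        match = REF_LINE.match(line.rstrip("\r\n"))
        if not match:
            continue
        kind = match.group("kind").strip()
        value = match.group("value").strip()
        if kind == "name":
            refs.append({"name": value})
        elif refs:
            # Attach lib ids and similar fields to the most recent name
            refs[-1][kind] = value
    return refs
```

Fed the output shown above, for example through `parse_reference_lines(sys.stdin)` at the end of the same pipe, this would give one dict per reference, keyed by the labels olevba prints (`name`, `registered lib id`, `project lib id absolute`, and so on).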

It would be possible to make that data available to the API, but that requires some work.

Just out of curiosity, what is your tool going to do with those references?

AndrewJLockhart commented 5 months ago

Hi,

I have written a minor change to expose this information through VBA Parser, via VB Project. At the moment it only covers the most common reference type. I'll try to upload a PR this weekend to get your view on whether the approach is likely to be approved. If you think it's a good approach then I can potentially add some test cases and cover some of the other reference types.

I’m writing a tool that uses VBA Parser to extract information from workbooks so we can determine how similar workbooks are to each other. The references a workbook uses are a useful data point for that. They also allow us to track where certain COM libraries are used so we can manage them, and they are useful for source code control of existing xla files.
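
As a rough illustration of how references could feed a similarity measure, here is a minimal sketch that compares two workbooks by the overlap of their reference names. The Jaccard index is an assumption chosen for the example, not necessarily what the tool does, and `reference_similarity` is a hypothetical name.

```python
def reference_similarity(refs_a, refs_b):
    """Jaccard overlap of two collections of reference names (0.0 to 1.0).

    Hypothetical helper for illustration only; the inputs are plain
    iterables of strings such as "stdole" or "MSForms".
    """
    a, b = set(refs_a), set(refs_b)
    if not a and not b:
        # Two workbooks with no references at all are treated as identical
        return 1.0
    return len(a & b) / len(a | b)


# Example: two of the three distinct names are shared
score = reference_similarity(["stdole", "Office", "MSForms"], ["stdole", "Office"])
```

Comparing names alone ignores library versions and paths. A stricter variant could compare the full lib id strings instead.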

Regards


(Sent by email in reply to Philippe Lagadec's message of Thursday, February 8, 2024, 11:39 AM.)