decalage2 / oletools

oletools - python tools to analyze MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, for malware analysis, forensics and debugging.
http://www.decalage.info/python/oletools

olevba - full header not extracted with VBA_Parser.extract_macros from classes or forms #89

Open ntextreme3 opened 8 years ago

ntextreme3 commented 8 years ago

Some header information is not extracted for classes and forms (shown below). At the same time, the extracted source contains more attributes than the file exported from Excel.

Any way to access the information currently missing from extraction?

CLASSES From Exporting in Excel:

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "Sheet1"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True

From VBA_Parser.extract_macros():

Attribute VB_Name = "Sheet1"
Attribute VB_Base = "0{00020820-0000-0000-C000-000000000046}"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Attribute VB_TemplateDerived = False
Attribute VB_Customizable = True

FORMS From Exporting in Excel:

VERSION 5.00
Begin {C62A69F0-16DC-11CE-9E98-00AA00574A4F} frmConvert 
   Caption         =   "Conversion Form"
   ClientHeight    =   1440
   ClientLeft      =   45
   ClientTop       =   375
   ClientWidth     =   4710
   OleObjectBlob   =   "frmConvert.frx":0000
   StartUpPosition =   1  'CenterOwner
End
Attribute VB_Name = "frmConvert"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Option Explicit
...

From VBA_Parser.extract_macros():

Attribute VB_Name = "frmConvert"
Attribute VB_Base = "0{C2E93836-344F-40D2-B2C7-70019BC91FC6}{321A03D5-D354-4085-9A5E-89A10A061EEC}"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Attribute VB_TemplateDerived = False
Attribute VB_Customizable = False
Option Explicit
...

decalage2 commented 8 years ago

olevba just extracts the content of the macro source code and decompresses it as-is. It looks like Excel exports other attributes on top of it. I do not know how to get these. Is it useful?
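For anyone who wants to compare the two outputs programmatically, here is a minimal sketch. It walks the modules with VBA_Parser.extract_macros() and collects the module-level Attribute lines into a dict. The helper names module_attributes and iter_vba_modules are made up for illustration and are not part of oletools.

```python
import re

# Matches module-level lines such as: Attribute VB_Name = "Sheet1"
# (member-level lines like "Attribute Item.VB_UserMemId = 0" are not matched).
ATTRIBUTE_RE = re.compile(r'^Attribute\s+(VB_\w+)\s*=\s*(.+?)\s*$', re.MULTILINE)


def module_attributes(vba_code):
    """Return the module-level Attribute lines of a VBA source as a dict."""
    return {name: value for name, value in ATTRIBUTE_RE.findall(vba_code)}


def iter_vba_modules(path):
    """Yield (stream_path, vba_filename, vba_code) for each module in a file."""
    # Imported here so module_attributes() works without oletools installed.
    from oletools.olevba import VBA_Parser

    parser = VBA_Parser(path)
    try:
        if parser.detect_vba_macros():
            for _, stream_path, vba_filename, vba_code in parser.extract_macros():
                yield stream_path, vba_filename, vba_code
    finally:
        parser.close()


if __name__ == "__main__":
    import sys

    for stream_path, vba_filename, vba_code in iter_vba_modules(sys.argv[1]):
        print(vba_filename, module_attributes(vba_code))
```

Diffing the resulting dict against the attributes in an Excel export shows which ones exist only on one side (for the samples above: VB_Base, VB_TemplateDerived and VB_Customizable).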

ntextreme3 commented 8 years ago

The classes/forms can't be imported properly without that. So for class modules it looks like I can just add

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END

to the beginning of each one, and they work fine. I'm not sure what "VERSION 1.0 CLASS" etc. actually means here. Forms, however, also contain information about the caption, height, etc.
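A rough sketch of that workaround in Python, applied to the (vba_filename, vba_code) pairs that extract_macros() yields. It assumes class modules can be recognised by a ".cls" extension on vba_filename, which should be checked against your olevba version. The helper name with_class_header is hypothetical, and forms are left untouched because their header carries per-form layout data that cannot be guessed.

```python
# Header that Excel writes at the top of an exported class module.
CLASS_HEADER = "VERSION 1.0 CLASS\r\nBEGIN\r\n  MultiUse = -1  'True\r\nEND\r\n"


def with_class_header(vba_filename, vba_code):
    """Prepend the class header to class-module source; return other code unchanged."""
    # Assumption: class modules are identified by a ".cls" file extension.
    is_class = vba_filename.lower().endswith(".cls")
    # Avoid adding the header twice if the source already starts with one.
    already_has_header = vba_code.lstrip().upper().startswith("VERSION")
    if is_class and not already_has_header:
        return CLASS_HEADER + vba_code
    return vba_code
```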

Background on the use case: I'm using a system where we "deploy" a VBA tool and add a "commit"-style message, and the system makes a copy of the tool before moving it to "production". I recently created a working add-in that exports all of the modules, classes, forms, sheets, etc. to a GitHub folder. I can then rebuild the tool from GitHub using the exported files.

I was thinking about looping through the historical versions, exporting the code, and using the existing version notes as git commit messages, to load the history of each tool into GitHub. I could still do this loop in VBA, but that requires opening every file. It would be nice if there was a way to access this data from the streams.

Some tools don't use classes or forms, so I can probably still try this approach for those. Thanks!

sancarn commented 5 years ago

@decalage2 I know this is many years later, but I wanted to mention this in case anyone else finds this thread. These attributes can literally break code if they are not present. For example, this is some VBA code I wrote recently:

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "StringBuilder"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Public Str As String
Public JoinStr As String
Public Function Append(Text As String) As Variant
Attribute Append.VB_UserMemId = -5
  Str = Str & JoinStr & Text
End Function
Public Function Create() As StringBuilder
  Set Create = New StringBuilder
End Function
Private Sub Class_Initialize()
  Str = ""
  JoinStr = vbCrLf
End Sub

You can use the string builder as follows:

  Dim sb As Object
  Set sb = New StringBuilder
  sb.[This is a really cool multi-line ]
  sb.[string which can even include    ]
  sb.[symbols like " ' # ! / \ without ]
  sb.[causing compiler errors!!        ]

Debug.print sb.str

Another common attribute is VB_UserMemId = 0, which marks a member as the default member of the class:

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "Coolection"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Private pCol As Collection
Public Property Get Item(index As Variant) As Variant
Attribute Item.VB_UserMemId = 0
  Debug.Print "Getting item " & index
  Item = pCol(index)
End Property
Public Sub Add(a, b)
  pCol.Add a, b
End Sub
Public Function Create() As Coolection
  Set Create = New Coolection
End Function
Private Sub Class_Initialize()
  Set pCol = New Collection
End Sub

You can use the code as follows:

Dim col As Coolection
Set col = New Coolection
col.Add "Cool Value", "Cool Key"
Debug.Print col.Item("Cool Key") '==> "Cool Value" (Nothing Special Here)
Debug.Print col("Cool Key") '==> "Cool Value"   (Black Magic!)

So as you can see, these attributes add a lot of syntactic sugar, which would be lost if the values were absent.
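To check whether a given extracted module carries member-level attributes like these, something along the following lines could be run over the vba_code string returned by extract_macros(). This is only a sketch: the function name member_attributes is made up, and whether the attribute lines are present in the source olevba extracts from a particular file should be verified rather than assumed.

```python
import re

# Matches member-level lines such as: Attribute Item.VB_UserMemId = 0
# Module-level lines (Attribute VB_Name = "...") have no "Member." prefix
# and are therefore skipped.
MEMBER_ATTR_RE = re.compile(
    r'^\s*Attribute\s+(\w+)\.(VB_\w+)\s*=\s*(.+?)\s*$', re.MULTILINE
)


def member_attributes(vba_code):
    """Return a list of (member, attribute, value) tuples found in VBA source."""
    return MEMBER_ATTR_RE.findall(vba_code)
```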