dod-cyber-crime-center / DC3-MWCP

DC3 Malware Configuration Parser (DC3-MWCP) is a framework for parsing configuration information from malware. The information extracted from malware includes items such as addresses, passwords, filenames, and mutex names.

Feature Request: allow formatted dictionary/list output in CLI #33

Closed jonbees-ibm closed 1 year ago

jonbees-ibm commented 2 years ago

This issue cropped up when comparing the output of an alternative Cobalt Strike beacon parsing tool to that of an MWCP parser implementation. The alternative tool outputs dictionaries like this:

HttpGet_Metadata                 - ConstHeaders
                                        Host: example.com
                                        Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
                                        Accept-Language: en-US,en;q=0.5
                                 ...

When an Other metadata object with the same data is added to a report, the MWCP CLI output looks like this:

Tags    Key                Value
------  -----------------  --------------------------------------------------------------------------------
        httpget_metadata   {'ConstHeaders': ['Host: example.com', 'Accept: text/html,application/xhtml+
                             xml,application/xml;q=0.9,*/*;q=0.8', 'Accept-Language: en-US,en;q=0.5'] ... }

Condensing all of the keys and values into a single string and printing it like any other string hurts human readability. The output could be improved in one of two ways: print list/dict items on separate lines with visual nesting, as the alternative tool excerpted above does, or simply run pprint on those objects before writing them to the CLI output. Either would significantly improve usability for any parser that needs to output objects whose keys are not known beforehand.
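To illustrate the pprint option, here is a minimal sketch using only the standard library. The `httpget_metadata` dictionary is sample data reconstructed from the excerpt above (the elided entries are left out), and the `width=120` setting is an arbitrary choice, not something MWCP uses.

```python
from pprint import pformat

# Sample data mirroring the excerpt above; elided entries are omitted.
httpget_metadata = {
    "ConstHeaders": [
        "Host: example.com",
        "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language: en-US,en;q=0.5",
    ],
}

# str() produces one long line, which the CLI table then wraps arbitrarily.
flat = str(httpget_metadata)

# pformat() breaks the structure across lines once it no longer fits in `width`.
pretty = pformat(httpget_metadata, width=120)

print(pretty)
```

The pretty-printed form is still a valid Python literal, so nothing is lost compared to the str() form; only the line breaks and indentation change.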

dc3-tsd commented 2 years ago

The metadata.Other element only accepts strings, bytes, integers, or booleans as values. Anything else is cast to one of those types (usually a string). That is why your results are formatted this way: the dictionary has been wrapped in str().
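Given that restriction, one possible workaround is to flatten a nested structure into scalar key/value pairs before reporting, so that every value is already one of the accepted types. The sketch below is plain Python and does not touch MWCP; the `flatten` helper and its dotted/indexed key naming are hypothetical, and how each pair is then added to a report is left out.

```python
from typing import Any, Iterator, Tuple

# Value types accepted as-is, per the comment above.
SCALAR_TYPES = (str, bytes, int, bool)


def flatten(key: str, value: Any) -> Iterator[Tuple[str, Any]]:
    """Yield (key, scalar) pairs for an arbitrarily nested dict/list (hypothetical helper)."""
    if isinstance(value, dict):
        for sub_key, sub_value in value.items():
            yield from flatten(f"{key}.{sub_key}", sub_value)
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from flatten(f"{key}[{index}]", item)
    elif isinstance(value, SCALAR_TYPES):
        yield key, value
    else:
        # Anything else falls back to str(), mirroring the cast described above.
        yield key, str(value)


data = {"ConstHeaders": ["Host: example.com", "Accept-Language: en-US,en;q=0.5"]}
pairs = list(flatten("httpget_metadata", data))
for pair_key, pair_value in pairs:
    print(f"{pair_key}: {pair_value}")
```

This keeps each value on its own row, at the cost of encoding the nesting in the key names.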

We strongly suggest creating a new metadata class, HTTPHeader, that requires at least a name and a value. This would allow you to add any number of header entries. If grouping headers is an important use case, you may want to add an additional field or wrap the headers in an HTTPHeaderSet class.

For example:

import attr
import mwcp
from mwcp import metadata


# Custom metadata element holding a single HTTP header.
@attr.s(**metadata.config)
class HTTPHeader(metadata.Metadata):
    name: str
    value: str


report = mwcp.Report()
with report:
    # Each header becomes its own report entry.
    report.add(HTTPHeader(name="accept", value="text/html,application/xhtml"))
    report.add(HTTPHeader(name="accept-language", value="en-US,en;q=0.5"))
print(report.as_text())

Output:

---- HTTP Header ----
  Name                  Value        
--------              -----------  
accept                text/html,application/xhtml
accept-language       en-US,en;q=0.5

If you end up going this route, we would welcome the contribution.
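To feed a class like HTTPHeader from the data in the original report, the "Name: value" strings first have to be split into separate fields. A minimal sketch of that step, in plain Python and independent of MWCP, is below; the `split_header` helper is a hypothetical name, and it assumes every entry contains at least one colon.

```python
from typing import List, Tuple


def split_header(line: str) -> Tuple[str, str]:
    """Split a 'Name: value' string at the first colon (hypothetical helper)."""
    name, sep, value = line.partition(":")
    if not sep:
        raise ValueError(f"not a header line: {line!r}")
    return name.strip(), value.strip()


# Sample entries taken from the ConstHeaders excerpt earlier in this thread.
const_headers: List[str] = [
    "Host: example.com",
    "Accept-Language: en-US,en;q=0.5",
]

headers = [split_header(line) for line in const_headers]
for name, value in headers:
    print(f"{name} -> {value}")
```

Each resulting (name, value) pair could then be passed to the custom class as keyword arguments, one report entry per header.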

dc3-tsd commented 1 year ago

Closing this issue due to age and inactivity.