mandiant / capa-rules

Standard collection of rules for capa: the tool for enumerating the capabilities of programs
https://github.com/mandiant/capa/
Apache License 2.0

FP: decompress-using-aplib using IDA backend #402

Open stevemk14ebr opened 3 years ago

stevemk14ebr commented 3 years ago

The aplib rule currently misses at least one variation of the aPLib code: https://github.com/secretsquirrel/the-backdoor-factory/blob/master/aPLib/src/64bit/depacks.asm. I cannot explain this failure: the first two checks hit with the comparison to 32000, and the compares against 127 and 128 match as well. Perhaps these constraints are invalid? https://github.com/fireeye/capa-rules/blob/a6d09ec94bfdd0d3e6561fa7eabfa8f79943bb95/data-manipulation/compression/decompress-data-using-aplib.yml#L29-L31

Sample hash is: aba89668c6e9681671a95b3d7a08aae2a067deed2d835ba6f6fd18556c88a5f2

williballenthin commented 3 years ago

@stevemk14ebr is this via the IDA plugin and/or with the cli tool?

Seems to work for me using capa.exe but not with the IDA plugin:

[images]

stevemk14ebr commented 3 years ago

IDA plugin

williballenthin commented 3 years ago

IDA does not detect the loop feature as expected:

[image]

@mike-hunhoff

should be here:

[image]

williballenthin commented 3 years ago

ah, there's a tail call to sub_180001208 which IDA considers a distinct function:

[images]

williballenthin commented 3 years ago

i wonder if we can add aplib to our open source FLIRT sigs to handle this more robustly

@mr-tz

edit: unfortunately, vcpkg doesn't have an aplib port.

mr-tz commented 3 years ago

https://ibsensoftware.com/download.html provides lib files we can add pretty easily (though manually)

stevemk14ebr commented 3 years ago

please excuse my ignorance, but would it be possible to improve capa's loop matching expression to support these tail call cases (at least the most common)?

williballenthin commented 3 years ago

IDA considers the head function and the tail function separate functions, while vivisect (the default analysis backend on cli) considers them one function. the loop feature is found in the tail function, while the other required features are found in the head function. the rule author expected that all features would be found in a single function (like how viv does it) but IDA reports the functions differently.
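to make the mismatch concrete, here is a toy model in Python. it is not capa's matching engine: the feature strings, the `matching_functions` helper, and the head function's address are all assumptions made up for illustration; only `sub_180001208` and the 32000/127/128/loop features come from this thread.

```python
# Toy model only: this is NOT how capa represents features internally.
# Feature strings and the head address are illustrative assumptions.
REQUIRED = frozenset({
    "number: 32000",
    "number: 127",
    "number: 128",
    "characteristic: loop",
})


def matching_functions(features_by_function, required=REQUIRED):
    """Return addresses of functions whose own features include every required one."""
    return sorted(
        va for va, feats in features_by_function.items() if required <= feats
    )


# One function holding everything (the vivisect view described above).
merged_view = {
    0x180001000: {"number: 32000", "number: 127", "number: 128", "characteristic: loop"},
}

# Head and tail reported as separate functions (the IDA view described above).
split_view = {
    0x180001000: {"number: 32000", "number: 127", "number: 128"},  # head (hypothetical address)
    0x180001208: {"characteristic: loop"},  # tail, sub_180001208
}

print(matching_functions(merged_view))  # the single function satisfies the rule
print(matching_functions(split_view))   # neither half satisfies it alone
```

in the split view neither function carries the full set, so a function-scoped rule cannot match either of them.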

in general, we've avoided special case logic to smooth over differences among analysis backends - there are different tradeoffs to each and making one act like another is tedious, bug-prone, and difficult to maintain. i suppose it's possible to shim in something here to merge tail-called functions, but it would pull in a lot of additional complexity (we could no longer rely on IDA's APIs to find functions, we'd need to decide how to display results of merged functions in the graph view, etc.).
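for illustration only, a merging shim could look roughly like the sketch below. nothing here exists in capa or in IDA's API: `merge_tail_calls`, the `tail_calls` mapping, the feature strings, and the head address are hypothetical, and the sketch assumes each tail function has exactly one caller and that there are no chains or cycles.

```python
def merge_tail_calls(features_by_function, tail_calls):
    """Fold each tail-called function's features into its caller.

    Hypothetical helper. `tail_calls` maps caller address -> callee address.
    Assumes one caller per callee and no chains or cycles of tail calls.
    """
    # Copy the sets so the input mapping is left untouched.
    merged = {va: set(feats) for va, feats in features_by_function.items()}
    for caller, callee in tail_calls.items():
        # The callee disappears as a separate function after merging.
        merged[caller] |= merged.pop(callee, set())
    return merged


# Head/tail split as described above; the head address is made up.
split_view = {
    0x180001000: {"number: 32000", "number: 127", "number: 128"},
    0x180001208: {"characteristic: loop"},  # sub_180001208
}

merged_view = merge_tail_calls(split_view, {0x180001000: 0x180001208})
print(sorted(merged_view[0x180001000]))
```

even this small version has to drop `sub_180001208` as a function of its own and breaks down as soon as a tail function is shared by several callers, which is the kind of complexity described above.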

a better solution would be to tune the rule so that it works across all backends. some ideas come to mind: