Add note on potential capa FP detection to documentation

mr-tz commented 7 months ago

Actually, before the issue is closed, is there a false positive reporting process for Windows Defender and the other three engines that can be pursued to prevent this causing trouble for new users, or is this expected enough that it can be put perhaps into an FAQ for the program on this github project?

Originally posted by @RionEV in https://github.com/mandiant/capa/issues/2025#issuecomment-1988802299

RionEV commented 7 months ago

Seeing as I was the one to raise the concern it's only fair that I at least try to contribute to the issue's suggestion;

Insofar as ensuring end-user peace of mind I think it might be worth explaining how a false positive might come about in laypersons' terms, how to verify personally that it is not harmful using an external validation service (such as VirusTotal) and - arguably most important, though admittedly outside of this project's scope - how to actually access VirusTotal's analysis correctly as opposed to just looking at its first page.

With the help of your community I've convinced myself beyond reasonable concern that capa is a safe program at this time, based on the information that VT and this issue tracker provides, but a brief boilerplate in the description that summarizes this can sidestep this issue being raised a fourth time.

I admit that I don't have much experience actually using GitHub, I'm ultimately a hobbyist, but my attempt, for what it's worth, would read as follows:

----------[start]----------

Why does capa trigger my antivirus? Is it safe?

capa's purpose is to analyse the capabilities of a potentially malicious application or file. In order to do that, though, it needs to contain parts of the data it's designed to detect as a basis of comparison. The release version of capa is packaged with embedded rules using PyInstaller, and these rules contain information with similar characteristics to existing wild malware. As it stands, we know that capa has triggered false positives before, but listing exactly what those positives might be would essentially involve reading the contents of the packaged rules.

How can I be certain that capa is behaving correctly?

The first and most obvious way of ensuring that it behaves correctly is to only download it from this repository's release branch, or indeed to build it yourself. Mandiant's only goal is to protect individuals and enterprises from cybersecurity threats, and to that end we have ensured that this project is open source so that professionals can be certain of the project's transparency. For users who are not as versed in cybersecurity, however, who may be individuals simply using capa for independent offline analysis of suspicious files, the prospect of reading through the entire repository can be daunting. Instead, we would recommend submitting the distribution for analysis by a service such as VirusTotal, which will test the program in a virtual machine completely separate from your own computer for added peace of mind.
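VirusTotal can also be searched by file hash, which avoids uploading the file if someone has already submitted the same release. A minimal sketch, assuming Python 3 is installed, that computes the SHA-256 of a downloaded file so the result can be pasted into VirusTotal's search box. The function name `sha256_of_file` is illustrative and not part of capa:

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large executables do not need
    to fit in memory all at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file` with the path to the downloaded executable returns a 64-character hex string. If VirusTotal already has a report for that exact hash, the report applies to the same bytes you downloaded.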

Understanding the VirusTotal output

VirusTotal does not just test a sample against a large number of antivirus engines and report whether they detected it; it also performs some online capability analysis. In that respect it is similar to capa, but limited by submission file-size restrictions and, of course, the need for an internet connection. To read the report that VirusTotal produces as part of its analysis, and to verify that the behaviours it identifies are as expected:

1) Navigate to the Behaviour tab on the sample page after submitting it.

2) Navigate to the Full Reports section. We would recommend reading the Zenbox report as it provides a more readable breakdown.

3) Examine the indicators given by the report.

This information can be used to understand what VirusTotal believes capa is doing "behind the scenes," so to speak.

As of release v7, the most common detections by antivirus engines are Trojan or Stealer malware, and many engines report capa as being able to perform C&C (Command and Control) actions. As an example, the report produced from the executable released as v7.0.1 (which you can read here) shows that capa does not perform any networking - it contains URLs in its memory, but does not "phone home" or call a controller like a Stealer or C&C malware would.

The basics of what to look for when analysing and verifying capa as an end-user

While these techniques are not necessarily applicable to the reports that capa itself produces, they are a suitable introduction to them.

1) Look for suspicious network activity or access. In capa's case, there is none, even though its pre-packaged rulesets contain a large array of URLs that could be detected by VT.

2) Look for suspicious DLL or EXE "dropped files". Dropped files are the files placed on the local machine that capa was run on. capa does not drop anything other than yml files containing its rules, which go to the Temp directory of the host computer's user account in AppData.
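The dropped-files check described above can be done by eye, but for a long list a few lines of Python help. This is a minimal sketch, assuming you have copied the dropped-file paths out of a report into a list of strings. The helper name `flag_dropped_files` and the set of extensions are illustrative choices, not something VirusTotal or capa defines:

```python
from pathlib import PureWindowsPath

# Extensions assumed here to deserve a closer look; adjust as needed.
SUSPICIOUS_EXTENSIONS = {".exe", ".dll", ".sys", ".scr"}


def flag_dropped_files(paths):
    """Return the paths whose extension suggests executable code.

    `paths` is a list of Windows-style path strings copied by hand
    from a sandbox report's dropped-files section.
    """
    flagged = []
    for raw in paths:
        # PureWindowsPath parses backslash paths on any operating system.
        suffix = PureWindowsPath(raw).suffix.lower()
        if suffix in SUSPICIOUS_EXTENSIONS:
            flagged.append(raw)
    return flagged
```

An empty result only means that none of the listed files has one of the chosen extensions. It is a quick filter, not proof that a sample is harmless.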

As ever, no warranty is expressed or implied, but while it can indeed be troubling to turn to a security program for additional protection and find that a system would mark it as untrustworthy, it is our hope that through these steps you can decide with certainty whether you will use capa as part of your suite of tools. As with all potentially suspicious applications, the last line of defence is simply not using them - and if all other avenues fail, we can lastly give our end-users the assurance that capa is a portable application. If you don't run it, it does not do anything at all. If you have downloaded the application, inspected it using VT and remain dissatisfied, it can be safely deleted without any ill effect.

----------[end]----------

I would say that my main goals here are to address the same concerns that led up to me submitting Issue #2025, but I of course welcome any changes that would make this more appropriate for your documentation. Rather than submit it directly to the repository I want to put it up for discussion here first; you all don't need me ruining this much hard work with clumsy additions.

It may be valuable to continue the second-to-last section (what to look for) by explaining how each part of the false detection happens, particularly if capa does anything peculiar that gets picked up beyond the false detection of its rulesets, like process virtualisation or somesuch. VT is not exactly kind to capa when analysing it, giving some on-the-face concerning MITRE ATT&CK characteristics like defense evasion and so on. I know that the purpose of this "(somewhat) FAQ (singular)" page is to give a brief crash course on why it's safe to an inexperienced user such as myself, but actually reading how capa works might be a little dense for some. Halfway down the page it's made abundantly clear to a user with cursory experience that it disassembles and analyses control flow, but to someone just getting into the field this might be a novel concept.

I mean, for the purposes of this discussion you should probably be inclined to assume I showed up on earth yesterday like some barely-tech-savvy Mr. Bean, but anything has to be better than nothing. Even if the final form of this documentation looks nothing like what's been offered above, I'll at least be comfortable knowing I tried to contribute to a program that has very literally protected my own home workstation more than once in ways that other providers have not.

mr-tz commented 7 months ago

Wow, great! Thanks for the extensive proposal and consideration!
