mandiant / capa

The FLARE team's open-source tool to identify capabilities in executable files.
https://mandiant.github.io/capa/
Apache License 2.0

Use more fields to address dynamic address processes #2361

Open yelhamer opened 5 days ago

yelhamer commented 5 days ago

While working on the DRAKVUF sandbox, we noticed that sometimes processes would have the same PID and PPID and would therefore be fused together in the final generated JSON sandbox report. It would be nice to have some way to distinguish between such processes.

DRAKVUF (the monitor) gets around this by specifying more fields while reporting each API call that was made or file that was accessed. These fields include ts_from (the time when the process was created), ts_to (the time when the process ended), and the process name. The DRAKVUF sandbox devs have now added a new "SEQID" field, an alphanumeric value generated from ts_from and ts_to, so it might be nice to use one of these two ideas to distinguish between processes with the same PID and PPID.
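
To make the collision concrete, here is a minimal sketch of how events get fused when they are grouped by PPID and PID only, and how adding a start timestamp separates them. The record layout and values are hypothetical and simplified; they borrow the ts_from name from above but are not the actual DRAKVUF log schema.

```python
from collections import defaultdict

# Hypothetical, simplified event records (not the real DRAKVUF schema).
events = [
    {"ppid": 100, "pid": 200, "ts_from": 1.0, "api": "CreateFileW"},
    {"ppid": 100, "pid": 200, "ts_from": 1.0, "api": "WriteFile"},
    # A later, different process that happens to reuse PID 200.
    {"ppid": 100, "pid": 200, "ts_from": 9.5, "api": "RegSetValueExW"},
]


def group_events(events, key_fields):
    """Group API names by the tuple of the given fields."""
    groups = defaultdict(list)
    for event in events:
        key = tuple(event[field] for field in key_fields)
        groups[key].append(event["api"])
    return dict(groups)


# Keyed by (ppid, pid): both processes collapse into a single entry.
by_pid = group_events(events, ("ppid", "pid"))

# Keyed by (ppid, pid, ts_from): the two processes stay separate.
by_pid_and_start = group_events(events, ("ppid", "pid", "ts_from"))
```

With these sample records, `by_pid` has a single key holding all three API calls, while `by_pid_and_start` has two keys, one per process.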

I think this issue has come up in the past, and maybe we could add an extra field (maybe call it inner?) to the capa.features.address.ProcessAddress class. In the case of the DRAKVUF sandbox, we could put the newly added SEQID there and use it to tell which process is which. Alternatively, we could record ts_from and ts_to in ProcessAddress and use them to tell processes apart for all sandboxes?
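
A rough sketch of what such an optional field could look like is below. `ProcessKey` is a hypothetical stand-in written for illustration only; it is not capa's actual ProcessAddress implementation, and the "seq-1"/"seq-2" values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessKey:
    """Hypothetical stand-in for a process address; not capa's real class."""

    ppid: int
    pid: int
    # Optional sandbox-specific discriminator, e.g. DRAKVUF's SEQID or a
    # process start timestamp. None keeps the current PPID:PID behavior.
    inner: Optional[str] = None


# Two processes that share PPID and PID but carry different discriminators.
first = ProcessKey(ppid=100, pid=200, inner="seq-1")
second = ProcessKey(ppid=100, pid=200, inner="seq-2")

# Extractors that do not set the field still compare by PPID and PID alone.
legacy_a = ProcessKey(ppid=100, pid=200)
legacy_b = ProcessKey(ppid=100, pid=200)
```

Because the field defaults to None, sandboxes that have no such identifier would keep comparing on PPID and PID, while those that do have one get distinct keys (`first != second`).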

I am not sure how different sandboxes tackle this issue, so maybe some research is needed to try and find a common ground between all of them that we factor out into the ProcessAddress class. I am using the issue to get a conversation started on it.

mr-tz commented 5 days ago

@doomedraven do you have some insight on how repeating PIDs are handled in CAPE? I remember we did some limited tests for it and did not encounter repetitions.

yelhamer commented 5 days ago

> @doomedraven do you have some insight on how repeating PIDs are handled in CAPE? I remember we did some limited tests for it and did not encounter repetitions.

Just mentioning this here as well: I think CAPE reports have a "first-seen" field for each process, and maybe that could serve this purpose (it would be similar to ts_from in DRAKVUF, I guess).

williballenthin commented 5 days ago

sysmon uses a guid associated with each process to deal with this scenario. but i don't think this is exposed in the windows api.

so it does seem like there should be a field that capa can provide sandboxes to differentiate colliding PIDs.

doomedraven commented 3 days ago

@kevoreilly i guess you will be the best person to respond to that, can you help? i can explain how cape handles pids in analyzer.py but not in capemon

kevoreilly commented 2 days ago

It has always been my understanding that it is not technically possible for a process to have the same pid as its parent. The best source I can find seems to be Raymond Chen: https://devblogs.microsoft.com/oldnewthing/20110107-00/?p=11803 but I have heard this many times over the years, and have certainly never seen anything like it myself.

I would be happy to learn something new and be proven wrong, but my first reaction to hearing "we saw a duplicate pid on drakvuf" is to assume there is something wrong with that setup rather than this being a universal behavior of Windows...

williballenthin commented 2 days ago

@kevoreilly I think the discussion here is about PID reuse, especially when the PIDs of a parent process's children collide after some time. (such as many short lived processes or something over a long period. not PIDs reused at the same moment in time)

kevoreilly commented 2 days ago

Apologies I misunderstood - I see what you mean. A single sandbox run containing two children at different times with the same pid (and parent)... Well I have never seen that either! I would have thought as you suggest the circumstances in which that might occur would be very unusual, creating a very high number of child processes for example. As such I don't really see this as a problem that needs to be solved in cape as I still can't believe it would happen in a run that one would otherwise expect the sandbox to handle. I stand to be corrected though! But I would need to see it to believe it.

yelhamer commented 2 days ago

Sorry, but there is a bit of a miscommunication on my part as well.

The issue at hand was with correctly identifying processes. Possible implications of not using adequate means to ID them would be thinking that 2 different processes are the same (in the case of PID reuse), or thinking that a single process is in fact 2 different processes (in the case of PPID spoofing and using PPID:PID to identify processes). For the first implication (PID reuse) I think what @kevoreilly said is sufficient, but I think that there should still be an issue with the second case (PPID spoofing/UAC elevation).

DRAKVUF sandbox previously used PID and ts_from (process first spotted) and ts_to (last seen process) to identify processes, but it has now moved to using a sequential ID (PR: https://github.com/CERT-Polska/drakvuf-sandbox/pull/958). capa identifies different processes using a combination of PPID and PID, which should be an issue in the case of PPID spoofing/UAC elevation since the same process would be split into two (before UAC elevation and after).
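
As an illustration of the split described above, the sketch below uses two hypothetical records for one process whose reported parent PID changes partway through the run. The field names and the "a1" identifier are assumptions for the example, not the real DRAKVUF sandbox or capa formats.

```python
# Hypothetical records: a single process (seqid "a1") reported with two
# different parent PIDs, as might happen with PPID spoofing or UAC elevation.
records = [
    {"seqid": "a1", "ppid": 100, "pid": 200, "api": "OpenProcess"},
    {"seqid": "a1", "ppid": 300, "pid": 200, "api": "WriteProcessMemory"},
]

# Identifying by (ppid, pid) yields two distinct identities for one process.
by_ppid_pid = {(r["ppid"], r["pid"]) for r in records}

# Identifying by the sandbox-assigned ID yields a single identity.
by_seqid = {r["seqid"] for r in records}
```

Here `by_ppid_pid` contains two entries while `by_seqid` contains one, which is the mismatch that a sandbox-provided identifier would avoid.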

It would be nice to see how CAPE handles this issue (identifying processes), so that if it's something similar to what DRAKVUF sandbox does then we could factor that into capa and use it (instead of the current PPID:PID) to identify different processes. If not, then I guess we could add a different optional (inner) field to https://github.com/mandiant/capa/blob/25111f8a950e527783e0a30077ed3960be330b89/capa/features/address.py#L45 and make it so that each sandbox feature extractor could use it to tell different processes apart in its own way.

williballenthin commented 2 days ago

Lots of discussion about the design for a PID-replacement over here: https://github.com/elastic/ecs/issues/672