volatilityfoundation / volatility3

Volatility 3.0 development
http://volatilityfoundation.org/
Other
2.58k stars 443 forks source link

Unify Linux plugins representation of pids #981

Open eve-mem opened 1 year ago

eve-mem commented 1 year ago

Is your feature request related to a problem? Please describe. The "pid" as shown by different plugins is not consistent. Either displaying the pid or tgid value from the tast_struct.

For example the pslist plugin uses tgid: https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/plugins/linux/pslist.py#L83

While malfind will report pid: https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/plugins/linux/malfind.py#L83

And in fact if you wanted to use a pid filter for plugins, that is actually using pid rather than tgid: https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/plugins/linux/pslist.py#L63

Describe the solution you'd like Picking one and sticking with it across the plugins. I'm happy to make the changes in a PR when a choice is made.

I think the biggest question is what would actually be preferred?

tgid is the most technically correct "pid" to display, but using that value and calling it pid in plugin outputs might be confusing to some? It'll catch people out when writing plugins or working with volshell.

However if all changes to pid are swapped to tgid and now none of the plugins report a pid result. That will be confusing to people that doesn't yet know what tgid is. They might want to find a process by a pid they have from logs etc, but it's now not obvious how to do that in vol.

I'm not sure personally what the right answer is, although I think I lean to changing all plugins to use the tgid value but reporting it as pid. We're using "pid" to mean "unique id for a process to use between plugins and different sources of data" and tgid does exactly that. It's what ps displays after all. It's just slightly confusing as the task_struct has a field called pid that might seem more natural to use.

We could go to the extreme of adding a get_pid() function to the task_stuct in the Linux extensions to keep it all central?

This is probably quite a minor point, but I'd personally love to see consistency across the plugins.

I'd be really interested in your thoughts. Thank you.

ikelos commented 1 year ago

I don't know enough about the structures to know the difference between a tgid and a pid. I'm happy to defer to @atcuno decision on this front?

ikelos commented 1 year ago

There was a suggestion at one point of potentially abstracting the concept of a process from any operating system into a single class in some way, but I dunno if anyone has the energy or desire to dig into all the details of doing that (or whether it would be a very useful thing to do)?

eve-mem commented 1 year ago

I've been thinking about a generic way of handling processes (maybe worth discussion in a different "issue"?)

It's interesting. I think could work, but you'd need to be able to get to the actual process objects for the different OSes quickly or it could get annoying.

Really it could be a nice set of convenience features and it could be expanded over time, something to abstract away the way different OSes do things. I would think at a minimum you'd want something like:

That list could then get expanded over time. Arguments and environment variables also feel like they might be easy enough to add.

Things like file handles etc might be difficult to add, but we could stick with the current per OS way of dealing with that for now?

So would a minimal process object like that be useful enough to be worth doing? It would make things like yara, strings, and any other generic analysis plugins easier to make. It doesn't help with OS specific plugins, but I guess that's how they are already, so probably not making things worse?

I think at the moment I lean towards thinking it might be useful to do, but what do others think?

ikelos commented 1 year ago

I believe we already have a semi-consistent way of adding process layers. It might be we just define an asbtract interface, and make sure the _EPROCESS and whatever the linux equivalent is, both support it. Then we've have named methods (get_pid, get_ppid, get_starttime, etc) all of which would respond in a uniform manner.

I'm a little worried about chaining/listing processes, I'd like to be able to extend that later, but I'm not sure it should be part of the first iteration.

I agree I don't think it'll make things worse. Whether it makes things better or just adds an extra layer, I'm not sure. It should be transparent to existing plugins, because they'd continue to use the native objects, it's just that those native objects would also begin inheritting from the abstract process class. I figure we mock it up so people can see what it helps with and what it doesn't and then figure out if it's worthwhile (so invest a bit of time, but don't go overboard, it might still get thrown away). How does that sound?

gcmoreira commented 1 year ago

I don't know enough about the structures to know the difference between a tgid and a pid. I'm happy to defer to @atcuno decision on this front?

Hey! sorry for getting into this conversation :) ... my two cents: I know it's confusing, I've been there. This is particularly true in Linux, where the terminology holds distinct interpretations based on whether it's in user-space or kernel-space. The term PID typically refers to its representation in user-space. Within the kernel it's referred to as TGID. And, to finish screwing it up, the user-space TID corresponds to the kernel name of PID. User-Space Kernel
PID task_struct.tgid
PPID task_struct.real_parent.tgid
TID task_struct.pid

Then, what from user-space is called a 'process' has TGID == PID. And a user 'thread' or 'lightweight process' TGID != PID.

All the above is explained in the getpid() syscall comment here.

Having said that, IMHO I wouldn't recommend changing the names in the plugins to their user-space ones. In my opinion, doing so could lead to more confusion among developers and potentially introduce more errors in the code. I would use the kernel names in our source code.

On the other hand, I like the idea of methods responding in a uniform manner for all the OSes. Maybe, it would be good to have a look at a cross-platform process libraries such as Python's psutils and see what they have done there. Although, I'm pretty sure they didn't use the kernel names :)

ikelos commented 1 year ago

So perhaps we deviate from the specific nomenclature for the OS, but just go with "what people would expect" for the uniform interface for all OSes?

eve-mem commented 1 year ago

@gcmoreira yeah thats exactly what I mean.

The linux pslist plugin returns thr vaule from tgid, but calls it PID for the user. I think that's probably a good thing.

The issue I mean is that other plugins don't do this.

For example this scenario:

as a user I run pslist, find some process I'm interested in and note the PID. I then try the lsof plugin with a pid filter for that pid, but I get no results.

That's not because there isn't any, but because the pid filter function uses pid rather than tgid.

So if they don't match it causes confusion. It seems like a good idea if all plugins return "pid" in the same way.

eve-mem commented 1 year ago

@ikelos I missed your comment somehow, but I think that is a good idea. What people would expect is exactly what should happen - with consistency across the plugins.

gcmoreira commented 9 months ago

Hey! We need to make a decision on this. It seems you agree with going for 'what people would expect', so let's see what @atcuno thinks about it and if it's ok proceed with that approach.

To prevent our code from becoming overly confusing, I would recommend, as a good practice, being more explicit when naming variables i.e. user_pid = task.tgid instead of just i.e. pid = task.tgid. And then yield from the individual variables like: yield user_pid, user_tid, user_ppid instead of yield task.tgid, task.pid, task.parent.tgid.

Regarding the PID filtering, @eve-mem your example above doesn't make sense to me. Filtering by task.pid actually would never miss any task. Even, if the user confused PID with TID. See the example below. Anyway I do understand what you mean there and agree we have the same issue as with the column names here. Currently, the PID in the --pid argument is meaning filtering by task.pid. And we don't have a way to only filter by task.tgid. However, having only one way/argument to filter by numeric ID, I still think that filtering by user_pid == task.pid is the best way to do it. Let me explain the two alternatives with an example:

USER_PID(task.tgid) USER_TID(task.pid) Name Comment
100 100 process_name1 Thread group leader 1
100 101 process_name1 1st thread of USER_PID 100
100 102 process_name1 2nd thread of USER_PID 100
103 103 process_name2 Thread group leader 2

1) Using user_pid == task.pid , the user wants to filter:

2) Using user_pid == task.tgid, the user wants to filter:

Alternatively, if we add an extra argument we can have:

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 200 days with no activity.

ikelos commented 3 months ago

It's been over six months since we said we need to make a decision. Perhaps we need to be clearer about who we mean by we? 5;P Looks like this is currently waiting on @atcuno. Any chance you could weigh in please Andrew?

eve-mem commented 1 week ago

Hello @ikelos @gcmoreira and @atcuno again,

In @gcmoreira latest PR https://github.com/volatilityfoundation/volatility3/pull/1263 - they've done some great work adding support for threads into lsof and sockstat (awesome stuff - love it!) - but it meant needing to show the difference between pid/tid.

It looks like @gcmoreira chose to label tgid as PID and pid as TID - which is what I personally think is the right way to do it. It matches the pslist plugin and I think makes the most sense personally.

My suggestion with this issue is to essentially do the same for all the plugins so they all work in the same way. Always use tgid as PID and pid as TID. That's it really.

I like @gcmoreira previous comment re the coding practice - that makes sense to me.

To prevent our code from becoming overly confusing, I would recommend, as a good practice, being more explicit when naming variables i.e. user_pid = task.tgid instead of just i.e. pid = task.tgid. And then yield from the individual variables like: yield user_pid, user_tid, user_ppid instead of yield task.tgid, task.pid, task.parent.tgid.

Thanks again :fox_face:

eve-mem commented 1 week ago

If it helps at all - this is what I mean: https://github.com/volatilityfoundation/volatility3/compare/develop...eve-mem:volatility3:linux_unify_pids?expand=1

ikelos commented 1 week ago

Thanks @eve-mem. Could you please submit that as an actual PR please? It all looks ok, but I've got a couple of concerns about what the version number should be from when we go from pid to tgid. The results between runs of the plugin may differ and if so then it should probably be a MINOR bump. (And why in the bash plugin have be bumped the required framework version but not in any of the others?) Lastly there was a change inside a lamba and that breaks the code so that one needs reverting or doing some other way...

eve-mem commented 1 week ago

I can do a PR, but i wasn't sure that people actually wanted to change it up like this yet to be honest.

Also thanks for spotting that lambda mistake <3

ikelos commented 1 week ago

Hehehe, no problem. I figure might as well use a PR, rather than a branch and a bug for discussion? They're pretty much the same thing, and as long as people are happy with PRs getting rejected then there's no real scope for difference? 5:P

eve-mem commented 6 days ago

That makes sense to me. I'll tidy it up a touch and make a PR.

No drama at all if it gets closed rather than merged. I'm not bothered at all.