koute / bytehound

A memory profiler for Linux.
Other
4.49k stars 193 forks source link

problem tracking child after `libc::fork` #80

Open s1341 opened 2 years ago

s1341 commented 2 years ago

It seems that there are issues following libc::forks, no output file is created for the child process.

koute commented 2 years ago

This is expected and is not a bug.

Commenting out this line will probably make it work.

s1341 commented 2 years ago

Ok. Thanks. Perhaps it should be noted somewhere?

koute commented 2 years ago

Well, ideally it'd probably be the best to just have an env var to configure this behavior.

You don't want it to follow fork by default since that would in a lot of cases generate a lot of garbage files when the application you're profiling temporarily calls into subcommand, but I suppose in some cases you'd want to enable it.

s1341 commented 2 years ago

I can make a pr to add an env var. What is the format of existing env vars?

koute commented 2 years ago

I can make a pr to add an env var. What is the format of existing env vars?

https://github.com/koute/bytehound/blob/master/preload/src/opt.rs#L121

s1341 commented 2 years ago

hrm. Disabling that line did not work.

koute commented 2 years ago

Okay, wait, by "child after libc::fork" did you actually mean "after fork + exec" (which what was I assumed), or do you really mean after fork only? It will indeed not work if you want to profile forked processes without any exec. That is currently unsupported and would be significantly more complicated to implement than just uncommenting a single line.

s1341 commented 2 years ago

yeah. I mean fork without exec :(

What would I need to do to get it working for that?

koute commented 2 years ago

The whole internal state would have to be reset and essentially reinitialized: the queues and caches would have to be cleared, the global locks all forcibly recreated to make sure they're unlocked, the processing thread would have to be spawned again, a new output file would have to be opened, the shadow stack for fast unwinding would have to be flushed, etc. In theory it should be possible, but it's probably going to be quite a bit of work.

s1341 commented 2 years ago

Can you think of a quick workaround?

If not, I guess I could look into doing this work...

s1341 commented 2 years ago

Can you give me a little guidance on exactly what I need to do in order to support this scenario? Happy to make a PR, but I don't have time to review all the code...

koute commented 2 years ago

There's no workaround.

Unfortunately I don't really have the time to give a more detailed guidance, but if you want to try your hands at this feel free to do so. This is, however, a very tricky thing to get right. Here's what I'd do:

  1. Start by writing a test for this (see the integration-tests directory).
  2. Add a new environment variable to control this behavior.
  3. Go to where we hook into fork and after the fork has been called when running as a child most of the global state has to be reset and reinitialized (with a few exceptions, e.g. the options don't have to be reinitialized, etc.). Search for every static variable that we have. Now this is tricky because essentially fork might have been called at any time, so at a time some of the global locks might have been held, some of the state might have been in the process of being modified, etc. So for every piece of global state you'd need to decide whether it has to be reinitialized, and if so make sure it is reinitialized correctly in every circumstance. And what makes this tricky is that this is essentially impossible to test for in an deterministic manner, so while an integration test would ensure that this works in the most basic case it wouldn't exhaustively guarantee that it's correct.
s1341 commented 2 years ago

Ok.

Regarding 3: what about hooking fork and ensuring that the state is all in a 'good state' (i.e. nothing is being modified etc.) while we do the actual fork? Also: might it not be an idea to refactor such that there is only a single static variable?

koute commented 2 years ago

what about hooking fork and ensuring that the state is all in a 'good state' (i.e. nothing is being modified etc.) while we do the actual fork?

That's not possible unless we'd have a single global lock, but that would slow down the profiling to a crawl so it's not a realistic solution.

might it not be an idea to refactor such that there is only a single static variable?

That's also not entirely possible. Nevertheless, it wouldn't really change much, because you'd still have to reinitialize everything in an appropriate matter, regardless of whether it's a single static variable or two.

s1341 commented 2 years ago

Ok. I'll take a look at this in the next couple of days.