Closed vvuk closed 1 month ago
Hrm. Being able to attach to newly created children seems very complicated, borderline not doable. In order to write to the environment, the magic seems to be the `_NSGetEnviron` symbol which returns a `char***` (pointer to the location where the array of `char*` that form the environment is stored). There are many problems with this path:
- `_NSGetEnviron` from `libSystem`. This is maybe possible, since we have the library info for the target task, and we can load debug info... we should be able to just forward-resolve the symbol ourselves.
- `realloc`, so the memory needs to come from `malloc`
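To make the `char***` indirection concrete, here is a minimal sketch that walks the environment of the *current* process only. It does not touch a target task; doing the same remotely would mean reading and writing the target's memory, which is not shown. The helper names `environ_location` and `read_environment` are made up for illustration, and the non-macOS branch (using the `environ` global) is only there so the sketch builds elsewhere.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

#[cfg(target_os = "macos")]
extern "C" {
    // Exported by libSystem: returns the address of the environment pointer.
    fn _NSGetEnviron() -> *mut *mut *mut c_char;
}

#[cfg(not(target_os = "macos"))]
extern "C" {
    // Other Unix systems export the `environ` global directly.
    static mut environ: *mut *mut c_char;
}

/// Hypothetical helper: address of the location holding the `char**` array.
#[cfg(target_os = "macos")]
fn environ_location() -> *mut *mut *mut c_char {
    unsafe { _NSGetEnviron() }
}

#[cfg(not(target_os = "macos"))]
fn environ_location() -> *mut *mut *mut c_char {
    unsafe { std::ptr::addr_of_mut!(environ) }
}

/// Hypothetical helper: copies every `KEY=value` entry out of the array.
fn read_environment() -> Vec<String> {
    let mut entries = Vec::new();
    unsafe {
        // First dereference: char*** -> char** (the array itself).
        let array = *environ_location();
        if array.is_null() {
            return entries;
        }
        let mut index = 0;
        loop {
            // Second dereference: char** -> char* (one entry); NULL ends the array.
            let entry = *array.add(index);
            if entry.is_null() {
                break;
            }
            entries.push(CStr::from_ptr(entry).to_string_lossy().into_owned());
            index += 1;
        }
    }
    entries
}

fn main() {
    println!("{} environment entries", read_environment().len());
}
```

Adding a variable means growing that array and storing the new array pointer back through the outer pointer, which is where the allocator concern in the list above comes from.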
Spawning a thread in the target and executing code seems like maybe more possible, especially if that code can be `dlopen()`/`dlsym()`/call. Reading through this stuff it pointed me to this threadexec library for which the source is pretty damn complex. But one interesting bit is that it takes the address of a function in the local process, and uses that same address in the target process... so maybe the dyld cache is not mapped at different locations in different processes? I can't believe that wouldn't be the case for security, maybe it wasn't 6 years ago.
Given all this, I'm inclined to just not support attaching to new children when profiling a target process. It should be possible to attach to existing children, though.
For profiling child processes, we could poll `proc_listchildpids`, if it doesn't have too much overhead. There's a Rust wrapper in `remoteprocess::Process::child_processes`. We'd miss the very beginning of new processes, but it's probably good enough for most use cases.
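A rough sketch of what the polling side could look like, assuming some function hands back the current list of child pids on each poll. The `ChildWatcher` type and its `observe` method are hypothetical names, and the snapshots in `main` are hard-coded stand-ins for real poll results; nothing here calls `proc_listchildpids` itself.

```rust
use std::collections::HashSet;

/// Hypothetical helper that remembers which child pids were already seen.
struct ChildWatcher {
    known: HashSet<i32>,
}

impl ChildWatcher {
    fn new() -> Self {
        ChildWatcher { known: HashSet::new() }
    }

    /// Takes one poll result and returns the pids not seen before, sorted.
    fn observe(&mut self, current: &[i32]) -> Vec<i32> {
        // `insert` returns true only for pids that were not known yet.
        let mut new_pids: Vec<i32> = current
            .iter()
            .copied()
            .filter(|pid| self.known.insert(*pid))
            .collect();
        new_pids.sort_unstable();

        // Forget children that have exited, so a reused pid is reported again.
        let alive: HashSet<i32> = current.iter().copied().collect();
        self.known.retain(|pid| alive.contains(pid));

        new_pids
    }
}

fn main() {
    // Stand-in for successive results of polling the real child list.
    let snapshots: [&[i32]; 3] = [&[101, 102], &[102, 103], &[103]];
    let mut watcher = ChildWatcher::new();
    for snapshot in snapshots.iter() {
        for pid in watcher.observe(snapshot) {
            println!("new child process: {}", pid);
        }
    }
}
```

Each pid returned by `observe` would be a candidate for attaching. Anything a child does between being spawned and the next poll is missed, which is the trade-off mentioned above.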
And for profiling system-wide, if it's even possible with acceptable overhead, we could list all processes using the `KERN_PROC_ALL` sysctl, like lldb does in `Host::FindProcessesImpl`.
Thanks for investigating this! I was imagining using a sudo subprocess for this, but self-signing samply is a good idea too. We could even have both. And we could have a `samply setup` command that does the self-signing for the user, with some kind of interactive wizard.
> Spawning a thread in the target and executing code seems like maybe more possible, especially if that code can be `dlopen()`/`dlsym()`/call. Reading through this stuff it pointed me to this threadexec library for which the source is pretty damn complex.
I wanted to point you at Listing 12-9 from the *OS Internals Book 1, but then I found this post which has further improved on it, and one of the comments on the gist with the full code links to this implementation which is arm64 compatible.
> Thanks for investigating this! I was imagining using a sudo subprocess for this, but self-signing samply is a good idea too. We could even have both. And we could have a `samply setup` command that does the self-signing for the user, with some kind of interactive wizard.
Yep, I was thinking `samply setup` too. Followup PR though. For root -- I think self-signing via setup and/or just calling via `sudo` directly should be sufficient, I don't think we need a separate subprocess. I swear I read somewhere that at some point Apple will disallow `task_for_pid` even for root processes if they don't have the entitlement, but I can't find it. It works right now; same code, just running `sudo samply` without code signing.
> I wanted to point you at Listing 12-9 from the *OS Internals Book 1, but then I found this post which has further improved on it, and one of the comments on the gist with the full code links to this implementation which is arm64 compatible.
Awesome, thanks! Good to know, though I probably won't go down this route any time soon.
> and/or just calling via `sudo` directly should be sufficient
I want to discourage `sudo samply record` because this would also run the webserver as root. I think it would also leave a `profile.json` file that's not writable without sudo.
Implement attaching to existing macOS processes. This needs samply to be signed with the debugger entitlement:
`ent.xml`:

cargo build
codesign --force --options runtime --sign - --entitlements ent.xml target/debug/samply
The basic functionality works, but there is still work to do:
For 2, this might be possible to sort out if we can set the dyld preload env var in the target process. I don't know if there's a simpler way to do this, but one idea is to load the preload shared library into the target (I think this is doable?) and then create a new thread in that process that calls an init function from the preload lib, which would set the process env.