Profiler changes - Githubissues

michaeleisel commented 1 month ago

Separate the C++ code largely into its own class to minimize Objective-C++ and its associated weirdness
Use more idiomatic, faster C++ (e.g., hold objects by value instead of allocating them with new, which also avoids memory leaks)
Make performance more consistent (use deque over vector so that inserting at the end never has to move existing elements, whereas vector insertion is only amortized O(1))
Suspend only one thread at a time
Fix current and potential issues: avoid the static initialization order fiasco and minimize the reentrant section
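
To illustrate the deque-over-vector and by-value points above, here is a minimal sketch. The Sample struct and both function names are hypothetical and are not taken from the ETTrace source. It counts how often a std::vector reallocates while samples are appended, and checks that a std::deque leaves its existing elements where they are.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical sample record; the profiler's real types are not shown in this thread.
struct Sample {
    double timestamp;
    std::vector<std::uintptr_t> frames;  // held by value, so it is freed automatically
};

// Counts how many times a std::vector changes capacity while `count` samples
// are appended. Each change means every existing element was moved.
std::size_t vectorReallocations(std::size_t count) {
    std::vector<Sample> samples;
    std::size_t reallocations = 0;
    std::size_t capacity = samples.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        samples.push_back(Sample{static_cast<double>(i), {}});
        if (samples.capacity() != capacity) {
            ++reallocations;
            capacity = samples.capacity();
        }
    }
    return reallocations;
}

// Returns true if the first element of a std::deque keeps its address while
// `count - 1` further samples are appended (push_back on a deque does not
// invalidate references to existing elements).
bool dequeFrontStaysPut(std::size_t count) {
    assert(count > 0);
    std::deque<Sample> samples;
    samples.push_back(Sample{0.0, {}});
    const Sample *first = &samples.front();
    for (std::size_t i = 1; i < count; ++i) {
        samples.push_back(Sample{static_cast<double>(i), {}});
    }
    return first == &samples.front();
}
```

The occasional reallocation in the vector case is what makes individual insertions unpredictable in cost, which matters when the insertion happens while another thread is suspended.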

As far as performance goes, though, profiling is largely bottlenecked both before and after this PR by FIRCLSReadMemory, which calls vm_read_overwrite. I imagine it does so because it can't trust that the address is safe to dereference directly without causing a segfault. In my test case, a single sample of the main thread currently takes about 0.1 ms, although I imagine that depends largely on stack depth. Replacing the call with a memcpy brings that down to about 0.04 ms, at which point thread_get_state is the next-biggest culprit.
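
The trade-off described above can be sketched roughly as follows. This is not the actual FIRCLSReadMemory implementation, which is not shown in this thread; readMemoryChecked and readMemoryUnchecked are hypothetical names. On Apple platforms the checked variant asks the kernel to do the copy via vm_read_overwrite, so an unmapped address yields an error code instead of a crash. The unchecked variant is a plain memcpy, which is faster but will fault on a bad address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#if defined(__APPLE__)
#include <mach/mach.h>
#endif

// Checked read (hypothetical helper): returns false instead of crashing when
// `src` cannot be read. The extra kernel round trip is what makes it slower.
bool readMemoryChecked(const void *src, void *dst, std::size_t size) {
    assert(dst != nullptr);
#if defined(__APPLE__)
    vm_size_t copied = 0;
    kern_return_t kr = vm_read_overwrite(mach_task_self(),
                                         reinterpret_cast<vm_address_t>(src),
                                         size,
                                         reinterpret_cast<vm_address_t>(dst),
                                         &copied);
    return kr == KERN_SUCCESS && copied == size;
#else
    // Assumption: off Apple platforms this sketch falls back to memcpy so that
    // it still compiles; the fallback offers no protection beyond a null check.
    if (src == nullptr) {
        return false;
    }
    std::memcpy(dst, src, size);
    return true;
#endif
}

// Unchecked read (hypothetical helper): a direct copy that trusts the address.
// It will segfault if `src` is not readable.
bool readMemoryUnchecked(const void *src, void *dst, std::size_t size) {
    assert(dst != nullptr);
    std::memcpy(dst, src, size);
    return true;
}
```

A stack walker that follows frame pointers read from a suspended thread cannot be sure those pointers are valid, which is presumably why the checked form is used despite its cost.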

michaeleisel commented 1 month ago

Testing looks good. I will say, profiling in general (with or without my change) seems off on real devices (I'm not seeing my own functions in the profiling result), but maybe I'm doing something wrong.

michaeleisel commented 1 month ago

@noahsmartin I can't merge, can you do it?

EmergeTools / ETTrace

Profiler changes #95