Issue itself

The performance hit is caused by syspolicyd running a networked assessment (one that hits an Apple server) of every executable, including scripts.
The first check has full TLS overhead, but it'll re-use the connection for additional assessments until it idles for 60s.
It caches results (it seems like maybe by inode?), but Nix tends to run quite a few "new" executables per build.
In my own testing (limited scope/scale) the performance hit I'm seeing has been averaging out to ~10-30% across different types of build.

Potential Fix

The only "promising" workaround I've found (i.e., works in manual testing, doesn't have terrible UX, and might be something Nix can transparently implement) is if nix-daemon can disable the syspolicyd assessments (spctl --master-disable), run the entire set of builds that are required to satisfy the user's invocation of nix, and then re-enable them (spctl --master-enable).

This approach works for me manually, but non-root users need sudo. (Here's an example where I'm using it in CI.) I'm not at all familiar with the internals of the nix commands/daemon, so I'm not sure how feasible it is.
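To make the disable/build/re-enable idea concrete, here is a minimal sketch of the wrapping logic. It is not how nix-daemon works or would implement this; `assessments_disabled` is a hypothetical helper, and the only real commands it uses are the two `spctl` invocations quoted above (run through sudo, since non-root users need it). The `run` parameter is an assumption added so the ordering can be exercised without touching system settings.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def assessments_disabled(run=subprocess.run):
    """Hypothetical helper: turn assessments off around a block of work.

    `run` defaults to subprocess.run; it is injectable so the ordering
    can be checked without changing system settings.
    """
    run(["sudo", "spctl", "--master-disable"], check=True)
    try:
        yield
    finally:
        # Always re-enable, even if a build fails part-way through.
        run(["sudo", "spctl", "--master-enable"], check=True)
```

A caller would put the whole set of builds for one invocation inside a single `with assessments_disabled():` block, so the toggle cost is paid once per invocation rather than once per build.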

Getting a better sense of the performance impact

It'll probably be easier to know if this is worth acting on with more data points.

I took a little time this weekend to find the active projects running macOS+Nix builds on GitHub Actions and write a script that can download and compile the stage run durations across different jobs.

I'm planning to send PRs to a few of these projects to see if they'll take on a temporary comparison job to generate a broader set of data points. If you have a CI job that fits the bill and can spare a few minutes to set it up, I've collected my settings and analysis script in a gist.

abathur commented 4 years ago

Update 1: I got these working, but pre-build-hook only runs when sandbox = true which is probably a non-starter for a general fix (whether adopted by default or recommended).

Update 2: I ran 30 timed runs of a short 7-build invocation using pre-/post-build hooks to disable assessments, which averaged 47.9229s/run. Non-hook comparisons: 50.037s for builds with assessments on, 44.196s with them off. This seems to suggest ~266ms total overhead per hook call.
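For anyone checking the ~266ms figure, the arithmetic appears to work out as follows. This is a reconstruction, assuming the baseline is the assessments-off time and that each of the 7 builds triggers two hook calls (one pre-build, one post-build); neither assumption is spelled out above.

```python
# Average seconds per run, taken from the figures in Update 2.
hooked = 47.9229          # hooks toggle assessments around each build
assessments_on = 50.037   # no hooks, assessments enabled
assessments_off = 44.196  # no hooks, assessments disabled

builds = 7
hook_calls = builds * 2   # assumption: one pre- and one post-build hook per build

# Time the hooks cost relative to simply leaving assessments off.
overhead_per_call_ms = (hooked - assessments_off) / hook_calls * 1000

# Time the hooks still save relative to leaving assessments on.
net_saving_s = assessments_on - hooked

print(f"{overhead_per_call_ms:.0f} ms per hook call")
print(f"{net_saving_s:.3f} s saved per run vs. assessments on")
```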

No sooner had I posted this than @cole-h asked on IRC if pre-build-hook and post-build-hook could handle this. I wasn't aware of these. I assume the answer is yes, and I'll give it a try soon.

It looks like enabling/disabling assessments takes about 15-30ms per invocation on my system, so my interest in just doing it once for the invocation is in avoiding that overhead.

nixos-discourse commented 4 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/macos-catalina-performance-issue/7359/12

abathur commented 4 years ago

Sigh. This is just an update.

I collected enough samples to be fairly sure this workaround doesn't significantly improve performance. (This is the only prospective workaround I'm currently aware of that Nix could implement--there's another that does work, but AFAIK there's no way to enable it without GUI access to the Security & Privacy preferences panel. It has also been causing some reboot problems for me on 10.15.4-5, though it might be fixed in 10.15.6).

I took a little time this weekend to tease out why...

At first I was optimistic that maybe Apple had mitigated the performance effect in 10.15.5, since I did my initial testing before it was released. So, I went back through to confirm the behavior is unchanged.
It looks like the approach I used (spctl --master-disable) is still doing some part (but not all) of the assessment, but ignoring the result and reporting a pass. It is nowhere near as effective as adding the user's Terminal app to the Developer Tools exemption in Security & Privacy.
For some sense of scale, I ran 10K iterations of a test procedure that writes a very simple script with a random filename and executes it (this is extreme by design--I don't mean to suggest normal Nix builds will see a difference anywhere near this large):
- in Terminal.app with a DeveloperTools exemption it ran in ~4 minutes
- in iTerm2 with no exemption it took nearly 31 minutes
- in iTerm2, using the spctl --master-disable method only cut the runtime down to 27.5 minutes
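The test procedure described above can be approximated with something like the following. This is a sketch and not the script that produced the numbers: the function name, the script body, and the choice to delete each script after running it are all assumptions.

```python
import os
import stat
import subprocess
import time
import uuid


def run_fresh_scripts(iterations, workdir):
    """Write and execute `iterations` never-before-seen scripts.

    Returns the elapsed wall-clock time in seconds. Each script gets a
    random filename so no previously cached result can apply to it.
    """
    start = time.perf_counter()
    for _ in range(iterations):
        path = os.path.join(workdir, f"probe-{uuid.uuid4().hex}.sh")
        with open(path, "w") as f:
            f.write("#!/bin/sh\nexit 0\n")
        os.chmod(path, stat.S_IRWXU)        # make it executable for the owner
        subprocess.run([path], check=True)  # execute the new file directly
        os.remove(path)
    return time.perf_counter() - start
```

Running `run_fresh_scripts(10_000, some_directory)` once from each terminal setup would give timings comparable in shape (though not necessarily in magnitude) to the three listed above.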
This unfortunately means the CI runs didn't help me get a very good impression of how it's affecting other projects.

Since I already put together a list of projects that have Nix expressions building on macOS in CI, I'll pick a few of them out and run enough tests this week to brute-force collect the data that way...

abathur commented 4 years ago

This has been a PITA to collect data on.

My goal was to take the projects I previously tracked CI runs for and run 30 timed builds with an untimed garbage collection between each.

I had to weed out a few of the projects I used for collecting samples in CI because they were system/dotfile repos that were harder to test locally. I also observed repeated hangs and crashes/reboots in a few of the remaining builds. I haven't had time to babysit/debug builds lately, so I had to cut a few more out of the results for this, leaving me with good numbers on only 5 of the 12 projects I tracked on CI.

Here's the impact of using the developer tools exemption to disable assessments on the median runtime of 30 builds (on a 2018 i5 MBA) of these 5 projects:

abathur/resholved: 8.6% speedup, from 201.57s to 184.219s
rycee/home-manager: 5.2% speedup, from 157.732s to 149.551s
NixOS/nix: 2.6% speedup, from 563.582s to 548.840s
ttuegel/kframework: No obvious difference (2231.403s vs 2232.856s)
NorfairKing/smos: 4.0% speedup, from 144.111s to 138.291s
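For reference, the percentages above follow from the median runtimes as (before - after) / before. A short sketch that recomputes them from the listed figures (the dictionary layout and function name are just for illustration):

```python
# Median runtimes in seconds: (assessments on, Developer Tools exemption).
medians = {
    "resholved": (201.57, 184.219),
    "home-manager": (157.732, 149.551),
    "nix": (563.582, 548.840),
    "kframework": (2231.403, 2232.856),
    "smos": (144.111, 138.291),
}


def speedup_percent(before, after):
    """Percent reduction in runtime going from `before` to `after`."""
    return (before - after) / before * 100


speedups = {name: speedup_percent(*pair) for name, pair in medians.items()}

for name, pct in speedups.items():
    print(f"{name}: {pct:.1f}%")
```

The kframework result comes out very slightly negative, which matches the "no obvious difference" reading.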

Other notes:

Has anyone noticed flaky builds of Nix itself on macOS? (My test script squelched output, but twice during the Nix build loop I noticed it get stuck doing no obvious work; I was able to nudge it along with ctrl+c.) A few days ago I also ran into flaky Nix installs (see https://github.com/NixOS/nix/issues/3605#issuecomment-674581842) on macOS 10.14+ on travis-ci, and I do wonder a little if they're the same issue. I'll set it to build Nix in a loop verbosely and see if anything shakes out.
I've previously noted that the Developer Tools exemption appears to cause the system to lock up on restart and eventually hard-reboot, and this is still true on 10.15.6. Despite the performance improvement, I don't think we can recommend it to users.

I'll follow this comment with a bit of a thread summary.

abathur commented 4 years ago

I feel like this has mostly run its course for now; assuming this may go fallow until Apple changes something with these systems again, I'll summarize where this stands so that it's easier to catch up later :)

Thread summary

Catalina appears to have introduced a networked executable assessment. It does generally cache results, but the particulars of its implementation and how Nix works means that Nix builds can generate a fair amount of assessment overhead.
I originally identified two potential ways around the issue: disabling the assessments with spctl --master-disable, and adding the user's "terminal" app to the Developer Tools exemption in the Privacy panel. (White lie: it sounds like a third option is using profiles to do this, but I gather it requires an MDM?) Unfortunately, I've come to the conclusion that neither of these (at least, in its current form) is worth recommending.
- After collecting hundreds of CI build samples and designing some more specific tests, it's obvious that spctl --master-disable only disables some fraction of the assessment process and has very little relative impact on the overhead they add. (Edit: confirmed this in latest Big Sur beta, too)
- The DevTools exemption provides a decent speedup, but it also: requires a GUI to enable, currently has no support for prompting users to add the exemption, requires that you're actually running the builds from the GUI terminal, doesn't appear to work for daemon/multi-user installs at all, and causes a long hang/crash on reboot on both of the devices I've used it on. (Edit: confirmed this in latest Big Sur beta, too)
The ~worst-case synthetic test I ran in https://github.com/NixOS/nix/issues/3789#issuecomment-668285880 demonstrated something like an 87.1% speedup¹ by using the Developer Tools exemption to disable the executable assessments.
The speedup (median over ~30 builds each) I observed in https://github.com/NixOS/nix/issues/3789#issuecomment-678592997 from disabling assessments on 5 real projects ranged from ~0-8.6% (avg ~4.08%). I guess, given the very small sample size, that most Nix builds would run somewhere from 0-15% faster with assessments disabled. This probably isn't terribly meaningful on average individual machines, but it seems like a Big Deal for some workloads.
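Both summary figures can be reproduced from numbers given earlier in the thread. This is a reconstruction under two assumptions: that the 87.1% compares the rounded ~31 minute and ~4 minute synthetic runs, and that the average uses the reported per-project percentages with the "no obvious difference" project counted as 0%.

```python
def percent_faster(before, after):
    """Percent reduction in runtime going from `before` to `after`."""
    return (before - after) / before * 100


# Synthetic worst case: ~31 minutes with assessments, ~4 minutes exempted.
synthetic = percent_faster(31.0, 4.0)

# Reported speedups for the 5 real projects, with the
# "no obvious difference" project counted as 0% (assumption).
reported = [8.6, 5.2, 2.6, 0.0, 4.0]
average = sum(reported) / len(reported)

print(f"synthetic: {synthetic:.1f}%")   # 87.1%
print(f"real-project average: {average:.2f}%")  # 4.08%
```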

@arianvp I've exhausted the options I'm aware of, to no avail. Curious if you think (just from the summary above) it's worth trying to pursue this with the dev-rel contact? I waffle between having really low expectations for this accomplishing anything, and thinking there are a lot of small things they could do that could all independently have a big impact here.

¹ A big part of the performance penalty is network overhead and I have no sense of how well geo-distributed the server(s) fielding these requests are. Some reports in the initial HN thread about this suggested fairly long international response times, so it may be worth noting that I'm in Houston and the benefit of circumventing these checks might vary by location, network, etc.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity. → More info

stale[bot] commented 2 years ago

I closed this issue due to inactivity. → More info

wojciech-kulik commented 1 day ago

I'm having a similar issue. syspolicyd completely killed my MacBook multiple times. I had to hold the power button because nothing was clickable anymore. I just saw that syspolicyd was consuming a lot of CPU. Another time, when it started using a lot of CPU, I was able to see what was going on. I checked the process and its open files in Activity Monitor, and it turned out that all the paths were /nix/store/.... I'm using macOS 15.

NixOS / nix

darwin: Performance impact of syspolicyd assessments on nix builds #3789
