Launching tests takes a very long time on MacOS (an unexplained pause between build and testing)

TheQuantumPhysicist commented 11 months ago

I saw the Gatekeeper issue described out there, but I'm not sure this is the same. I did add the terminal to developer tools. That didn't help.

So, here's a description of the problem and how to reproduce it on the repo where I work:

1. Make sure to remove the build directory with rm -rf target
2. Run all tests: cargo nextest run --all

Now here's the difference between MacOS and Linux:

Linux: Once the build finishes, the tests launch immediately.
macOS: Once the build finishes, the terminal pauses for a long time. During the pause, I can see the terminal tab title cycling through all the crate names. It seems that everything is being executed in sequence to collect test names? Not sure what's going on there. The pause lasts about 20-30 seconds.

On macOS, this problem happens only after a build. Running cargo nextest run --all again right after the first run doesn't cause the long pause again.

sunshowers commented 11 months ago

Thanks for the report! This seems like you have some kind of antivirus that is checking for a binary by its hash the first time it is run. Something like https://github.com/google/santa maybe?

There isn't much nextest can do here. Maybe we can report how long the list phase took.
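As a rough illustration of what reporting the list phase duration could look like, here is a minimal Rust sketch. The helper names `timed` and `list_phase_report` and the wording of the message are hypothetical; this is not nextest's actual code or output.

```rust
use std::time::{Duration, Instant};

/// Runs `f` and returns its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Builds a hypothetical report line for the list phase.
/// The wording is an assumption, not something nextest prints.
fn list_phase_report(binaries: usize, elapsed: Duration) -> String {
    format!(
        "listed tests in {} binaries in {:.3}s",
        binaries,
        elapsed.as_secs_f64()
    )
}
```

Wrapping the whole list phase in `timed` and printing the report line would make a pause like the one described above visible as a number rather than as silence.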

TheQuantumPhysicist commented 11 months ago

> Thanks for the report! This seems like you have some kind of antivirus that is checking for a binary by its hash the first time it is run. Something like https://github.com/google/santa maybe?
>
> There isn't much nextest can do here. Maybe we can report how long the list phase took.

I don't have any antivirus installed. I only have an outgoing firewall.

The same happens to all my colleagues with M1 and M2 Macs.

Not sure how to tackle this... I'll try to research this a bit.

TheQuantumPhysicist commented 11 months ago

Alrighty. I played with it a little, and I think I understand what's going on. Apparently, Gatekeeper does some kind of scanning of programs before allowing the command(s) target/debug/deps/program-deadbeef --list --format terse to run. If that's the case, there's a way to fix this, though it needs a small design upgrade.

The problem is that the current design assumes that the aforementioned command is instantaneous, and hence these commands are run sequentially for all crates. But for large-scale software on macOS, this becomes prohibitively slow, making the execution go virtually in serial. In my case, this step takes longer than the testing itself (the list phase takes about 30 seconds for me)!
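For context, the output of that list command is what has to be collected for every test binary before anything can run. A minimal sketch of parsing it, assuming each entry is a line of the form `name: test` or `name: benchmark` (this line shape is an assumption about the terse format, and `parse_terse` is a hypothetical helper, not nextest's parser):

```rust
/// Extracts test names from terse list output.
/// Assumes one `name: test` or `name: benchmark` entry per line;
/// lines that match neither suffix are skipped.
fn parse_terse(output: &str) -> Vec<&str> {
    output
        .lines()
        .filter_map(|line| {
            line.strip_suffix(": test")
                .or_else(|| line.strip_suffix(": benchmark"))
        })
        .collect()
}
```

The parsing itself is cheap; any delay comes from launching each binary, which is where a first-run scan would bite.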

There are a few improvements that can be done here:

1. The execution of the --list command can be done in parallel for all tests. It seems that it's done now in serial.
2. The start of tests shouldn't depend on all the --list commands to finish, but there should be a FIFO pipeline that starts the test for a group of tests once its --list command finishes. In other words, the --list command and then execution of tests should be put in one pipeline. The good news is, you already have a very nice, very efficient pipeline created for running the tests. All you have to do is add the running of the --list command to that.
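The two points above can be sketched roughly as follows, using only the standard library. This is an illustration of the proposed shape under stated assumptions, not how nextest is structured: `pipeline` is a hypothetical function, `list` stands in for spawning a binary with --list --format terse, and `run` stands in for handing the tests to the runner. If a listing fails, the loop stops and the error is returned.

```rust
use std::sync::mpsc;
use std::thread;

/// Lists each binary on its own thread and passes every result to `run`
/// as soon as it arrives, instead of waiting for all listings to finish.
/// `list` and `run` are hypothetical stand-ins supplied by the caller.
fn pipeline<L, R>(binaries: Vec<String>, list: L, mut run: R) -> Result<(), String>
where
    L: Fn(&str) -> Result<Vec<String>, String> + Send + Copy + 'static,
    R: FnMut(&str, &[String]),
{
    let (tx, rx) = mpsc::channel();
    for bin in binaries {
        let tx = tx.clone();
        thread::spawn(move || {
            let result = list(&bin);
            // The receiver may already have stopped after an error; ignore that.
            let _ = tx.send((bin, result));
        });
    }
    drop(tx); // the receive loop ends once every lister thread has reported

    for (bin, result) in rx {
        // A failed listing stops the pipeline with an error.
        let tests = result.map_err(|e| format!("listing {bin} failed: {e}"))?;
        run(&bin, &tests);
    }
    Ok(())
}
```

Results arrive in completion order, so a binary whose listing is slow does not hold up tests from binaries that were listed quickly.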

Does that make sense? Would appreciate your opinion there.

sunshowers commented 11 months ago

So adding the terminal to Developer Tools as documented here is supposed to prevent that kind of hash-based check from happening. I'm not sure why it's not working for you. I wonder if something broke in macOS recently.

> The execution of the --list command can be done in parallel for all tests. It seems that it's done now in serial.

This is actually done in parallel today: https://github.com/nextest-rs/nextest/blob/c1e61ea9d77c70425706705e64c35c0215536cf9/nextest-runner/src/list/test_list.rs#L251 (it used to be serial but it's been parallel for a while). However, something in macOS might cause this phase to be serialized (e.g. if the hash-based checking acquires a global lock).
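To illustrate how a global lock could make a parallel phase behave serially, here is a small Rust sketch. The mutex is a stand-in for a hypothetical system-wide check and does not model any real macOS API; `launch_all` is a made-up name.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Spawns `n` threads that each hold one shared lock for `check` before finishing,
/// and returns the total wall-clock time. The lock simulates a hypothetical
/// system-wide first-run check.
fn launch_all(n: usize, check: Duration) -> Duration {
    let global = Arc::new(Mutex::new(()));
    let start = Instant::now();
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let global = Arc::clone(&global);
            thread::spawn(move || {
                let _guard = global.lock().unwrap();
                thread::sleep(check); // simulated check, done while holding the lock
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    start.elapsed()
}
```

Even though all threads start at once, the total time is at least `n * check`, which matches the kind of serialized behavior described above.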

> The start of tests shouldn't depend on all the --list commands to finish, but there should be a FIFO pipeline that starts the test for a group of tests once its --list command finishes. In other words, the --list command and then execution of tests should be put in one pipeline. The good news is, you already have a very nice, very efficient pipeline created for running the tests. All you have to do is add the running of the --list command to that.

I thought a bit about this and the immediate issue that comes to mind is: what if the list step fails at some point?

TheQuantumPhysicist commented 11 months ago

> I wonder if something broke in macOS recently.

Perhaps... there's no way to be sure, since I enabled it today for the terminal (and my IDE).

> what if the list step fails at some point?

Well, then the execution stops... I'm even OK with enabling this behind a special command-line argument. After all, the testing I do on a regular basis in practice runs through a special command in my IDE (I don't type it by hand every time), not in the terminal like I'm doing now just to understand the problem. Though please keep in mind that currently the wait for all these checks takes longer than the test run itself. If you have an M1, go ahead and do a cargo nextest run --all on the repo I shared and see for yourself. It's as simple as cloning it and running the command.

sunshowers commented 11 months ago

I can't repro this on my remote M1 Mac Mini with Monterey 12.5.1. I ran on your repo:

cargo clean
cargo nextest run --no-run --all
cargo nextest list --all

And the second command finished in 1.094 seconds. This doesn't change whether I run it via Terminal.app in VNC or via sshing into the computer.

This is a pretty old version of macOS. I'll try updating it to the newest version.

TheQuantumPhysicist commented 11 months ago

I'm on the latest macOS, 14.2.

You're right though. Running these commands doesn't produce the issue. To reproduce the issue, for me, I had to do this:

cargo test --all --no-run

Then:

time cargo nextest list --all

And the result:

real    0m58.613s
user    0m1.615s
sys     0m1.500s

I don't understand the specifics of why, but I can guess. Perhaps the nextest build does trigger the scan anyway at the end. Keep in mind that when nextest is started right after cargo test --all --no-run, it reports that there's nothing to build.

TheQuantumPhysicist commented 11 months ago

OK I removed the terminal from developer tools, restarted, added it, restarted again, tried again, and the problem is gone!

I have no idea what's going on. But I guess my problem is solved at this point... unless it shows up again.

Apologies for wasting your time.

I'm OK with closing this issue. I can reopen it if it happens again.

sunshowers commented 11 months ago

Sadly, macOS is just really janky and buggy in my experience, and its closed-source nature makes it hard to tell what's going on. I've tried working with an Apple engineer in the past, but that sadly fell through, and it's hard to get a comprehensive accounting of all the different ways in which this can happen. This is compounded by the fact that I don't have a Mac personally, being a full-time Linux user.

I think nextest can still do a better job reporting what's going on, so I'll file an issue for that and then close this one.

nextest-rs / nextest, issue #1161