**corneliusroemer** opened 2 weeks ago
TODO: passing a single test file currently fails, because the wrapper only accepts directories:

```
$ ./parallel_cram.sh -j8 tests/functional/parse.t
Error: Directory tests/functional/parse.t does not exist
```
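One way the wrapper could handle this (a hypothetical sketch in Python — the real `parallel_cram.sh` is a shell script, and `collect_tests` is an invented name) is to accept either a single `.t` file or a directory of tests:

```python
from pathlib import Path

def collect_tests(arg):
    """Resolve a command-line argument to a list of cram test files.

    Sketch of the argument handling the TODO above asks for: a single
    .t file is returned as-is, a directory is searched recursively.
    """
    p = Path(arg)
    if p.is_file():
        return [p]
    if p.is_dir():
        return sorted(p.rglob("*.t"))
    raise FileNotFoundError(f"{arg} is neither a test file nor a directory")
```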
> One caveat: it seems that iqtree creates files in the input file directory. As multiple tests use the same input files, they might conflict. So we should copy the input files to a temporary directory before running. This is done in commit 687bd04
Our tests are revealing a real user interface issue here. Instead of working around it in the tests, perhaps we can fix this input directory pollution for good by relocating the temporary alignment file we're already using

https://github.com/nextstrain/augur/blob/3f72c40897a80132099729d5d00a6718e76e0e9e/augur/tree.py#L250

into a temporary directory, and thus also avoid having to do this junk in the tests.
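The relocation idea can be sketched like this (hypothetical names throughout — `run_in_scratch_dir` and `build_tree` are invented for illustration, not augur's actual code at the linked line):

```python
import shutil
import tempfile
from pathlib import Path

def run_in_scratch_dir(alignment, build_tree):
    """Copy the alignment into a throwaway directory before tree-building,
    so any sidecar files the tree builder (e.g. IQ-TREE) writes next to
    its input land in the scratch directory, not the caller's input
    directory. `build_tree` is a stand-in callable for the real
    tree-building step.
    """
    with tempfile.TemporaryDirectory() as scratch:
        scratch_alignment = Path(scratch) / Path(alignment).name
        shutil.copy(alignment, scratch_alignment)
        # Sidecar files created next to scratch_alignment are cleaned up
        # automatically when the context manager exits.
        return build_tree(scratch_alignment)
```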
My preference would be to write a much smaller bit of code to translate cram's output to TAP and then use `prove` to run the files in parallel.
Or, alternatively, if we don't want to switch to TAP, modify Cram to support parallel testing. This doesn't seem difficult, from a look at the codebase.
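Such an output filter could be sketched as follows. This assumes cram's `-v` output prints one `path: status` line per test file; the exact status words (`passed`, `failed`, `empty`) and the function name are assumptions for illustration:

```python
import re
import sys

# One "path: status" line per test file, as assumed from cram -v output.
STATUS = re.compile(r"^(?P<path>\S+\.t): (?P<status>passed|failed|empty)$")

def cram_to_tap(lines):
    """Translate cram -v status lines into TAP."""
    out = []
    count = 0
    for line in lines:
        m = STATUS.match(line.strip())
        if not m:
            continue  # this sketch drops diffs and other noise
        count += 1
        ok = "not ok" if m["status"] == "failed" else "ok"
        directive = " # SKIP empty" if m["status"] == "empty" else ""
        out.append(f"{ok} {count} - {m['path']}{directive}")
    out.append(f"1..{count}")  # TAP permits a trailing plan
    return out

if __name__ == "__main__":
    for tap_line in cram_to_tap(sys.stdin):
        print(tap_line)
```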
I initially tried parallel but it's less portable. In fact, the brew-installed `parallel` on my system is the moreutils parallel, not GNU parallel. We could work around this by installing GNU parallel in the conda environment.
What you suggest @tsibley sounds good but realistically I won't do it.
I'm not sure how much maintenance there will have to be? We could go with this for now until we have one of the other alternatives you mention if you feel like you want to do this.
> I initially tried parallel but it's less portable.
I mean, GNU parallel is a single-file (15k line!) Perl program using only the Perl stdlib, compatible with a wide range of Perl versions. It's very easy to ship a copy (and I've done so before ;-). But yes, I get your point.
> What you suggest @tsibley sounds good but realistically I won't do it.
Yep, I'm not surprised.
> I'm not sure how much maintenance there will have to be? We could go with this for now until we have one of the other alternatives you mention if you feel like you want to do this.
Going with this "for now" (really, indefinitely) is fine. Don't take my comments to mean this can't merge! There won't be much maintenance, probably, but there will be little bugs here and there (e.g. in the current pass, `-v`/`--verbose` doesn't actually work) which will involve dealing with its approach. That's kinda my point: there's more code here than is necessary, and more code you don't need is always a liability.
Out of curiosity, this afternoon I wrote an alternative because I wanted to see what it'd look like with `parallel`.
Upon writing the above comment this evening, I got curious again and decided to quickly implement the same output-filter approach to produce TAP instead so I could use it with `prove`. It would still be better (and not difficult, I think) to add TAP output to Cram directly, but as-is even this little implementation works great.
## Description of proposed changes
Cram tests are perfect candidates for parallelization: each test is independent, and we have 178 of them. The wrapper script allows running the tests in parallel.
In my experiments on an M1 Pro, total test time was reduced from 7m30s to 1m23s, a more than 5x speedup. I got the best results with `-j8` (instead of all 10 cores). The wrapper takes many of cram's options, and one can still run the tests of a single directory, for example `./parallel_cram.sh tests/functional/tree`.

One caveat: it seems that iqtree creates files in the input file directory. As multiple tests use the same input files, they might conflict. So we should copy the input files to a temporary directory before running. This is done in commit https://github.com/nextstrain/augur/pull/1667/commits/687bd041446b1f060d21b79b48d3e00296dd69f0
We seem to be getting almost a 2x speedup in CI as well, as GitHub runners come with 2 cores by default. The bottleneck is now the RSV pathogen CI, which takes around 8 minutes (previously the cram tests took 13 minutes; now they are faster than RSV). Overall GitHub runner time is reduced from around 2h5min to 1h15min, a saving of around a third.
I tested the script to ensure that it correctly reports test failures. I've also been using it in my regular work on various PRs and it's worked exactly as expected, saving me waiting time.
## Checklist