JuliaTesting / ReTest.jl

Testing framework for Julia
Other
108 stars 6 forks source link

Limited parallelism with many modules #46

Open danielsoutar opened 2 years ago

danielsoutar commented 2 years ago

As a user, I might want to split my tests up into multiple files, corresponding to separate modules. I would do this for separation of concerns, better organization, and reduce polluting the global namespace (e.g. with shared data or set-up/tear-down associated with several specific testsets). However, ReTest discourages this practice due to its current model of parallelism.

Consider the following example:

# runtests.jl

using Distributed

N_WORKERS = 2

addprocs(N_WORKERS, exeflags="--project=$(Base.active_project())")

# Propagate importing ReTest across all workers
@everywhere using ReTest

# Assume that each module_x.jl defines a `ModuleX` in the global namespace.
@everywhere include("path/to/module_1.jl")
@everywhere include("path/to/module_2.jl")
...
@everywhere include("path/to/module_M.jl")

# Call ReTest from the main process, and dispatch the workers across the testsets.
retest(Module1, Module2, ..., ModuleM; ...)

If M=100, and each ModuleX defined a single testset, the user would reasonably assume that ReTest would achieve reasonable parallelism and approximately halve the time taken due to having 2 workers. This is not the case, as ReTest only dispatches the workers across the testsets of a single module. This then means that enabling parallelism would have absolutely no effect at all, and due to overhead would simply be slower. On top of that, shuffling would also have no effect, as it also only works per-module, and so any hidden dependencies between modules would never be detected. This code block in src/ReTest.jl in the source confirms this:

for imod in eachindex(modules)
        mod, pat = modules[imod]
        tests = alltests[imod]
        ...
        shuffle &&
            shuffle!(tests)
        ...
        # TODO: move printer task out of worker?
        worker = @task begin
        ...

While this is fine and can be worked around (e.g. consolidating tests into a single file), this seems like an unnecessary restriction that the library imposes. Instead, it would be better if all of the testsets across all modules were gathered into a single global set that can then be divided across the workers. In this model of parallelism, the previewer task would work in a similar way to as it does currently, going through the testsets in the order that they appear to retest. Instead of having the workers follow in lockstep, they would simply return the results to the previewer process as and when they complete their testsets. When the previewer reaches a testset and the result is there, it immediately evaluates it and displays it. If it doesn't, it waits on the result until one of the workers completes it.

This has several advantages:

I'd be happy to contribute towards this, as this seems like a significant improvement in flexibility for the developer and would go a long way towards making this a package that can be used for production.

rfourquet commented 2 years ago

Thanks for opening this issue, these are all very good points! I haven't these limitations myself yet, so your feedback coming from real usecase is very useful. I will try to think about it in the upcoming weeks and discuss with you here on a possible path forward.