Closed: tcompa closed this issue 1 year ago.
- [ ] Add tests for all (most?) cases. Note: there are basically three cases to test for each task: (1) no existing output, (2) failure due to existing output and overwrite=False, (3) actually overwriting existing output. The first one is what we are already testing; the second one should be somewhat easy (every time we run a task for the first time, we also re-run it right away with overwrite=False and catch the error).
This is now handled, in #499. For all tasks, I took at least one test (more than one, in some cases) and replicated the task-function call three times:

1. As in main;
2. With overwrite=True and all the other parameters equal to the first run;
3. With overwrite=False (as part of a pytest.raises(OverwriteNotAllowedError) block).

This is already introducing some slow-down of the tests, since the task-execution time doubles (or worse). For instance, napari-workflow-related tests go from 82 seconds (in current main) to 172 seconds (in current #499). I may actually remove some of the redundant tests, since the slow-down is severe.
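Schematically, each of these tests now follows a pattern like the one below. This is only a minimal sketch: the task name, its arguments, and the import path of OverwriteNotAllowedError are placeholders, not the literal test code in #499.

```python
import pytest

# Placeholder imports: the exact module paths and the task signature are assumptions.
from fractal_tasks_core.lib_zarr import OverwriteNotAllowedError
from some_task_module import some_task


def test_some_task_overwrite(tmp_path):
    zarr_path = str(tmp_path / "output.zarr")

    # Run 1: no existing output (this is the call we already had in main).
    some_task(output_path=zarr_path, overwrite=False)

    # Run 2: output exists, overwrite=True -> the task replaces it.
    some_task(output_path=zarr_path, overwrite=True)

    # Run 3: output exists, overwrite=False -> the task must refuse to write.
    with pytest.raises(OverwriteNotAllowedError):
        some_task(output_path=zarr_path, overwrite=False)
```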
The current set of tests is not even enough to cover some of the more complex scenarios (e.g. I run napari-workflows with two outputs, I remove one of the two from disk, I re-run with overwrite=False). I guess we cannot push testing too far, thus for the moment I'd be OK with the current trade-off.
On the bright side, note that the new functions introduced in #499 (the ones that currently belong to lib_zarr.py) have 100% coverage.
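Just to illustrate the kind of helper we are talking about, an overwrite guard could look roughly like the sketch below. The function name, signature, and behavior are assumptions for illustration, not the actual lib_zarr.py code.

```python
import os
import shutil

import zarr


class OverwriteNotAllowedError(RuntimeError):
    """Raised when a task would overwrite existing data while overwrite=False."""


def open_zarr_group(path: str, *, overwrite: bool) -> zarr.Group:
    """Create a zarr group at `path`, guarding against accidental overwrites."""
    if os.path.exists(path):
        if not overwrite:
            raise OverwriteNotAllowedError(
                f"{path} already exists and overwrite=False."
            )
        # Explicitly wipe the existing group before re-creating it.
        shutil.rmtree(path)
    return zarr.open_group(path, mode="w")
```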
Would there be a super basic napari workflow that could be run to test that? e.g. mock the workflow to just return a correctly shaped label img + table? If that runs fast, we could run the 3 scenarios with that and only run a real workflow once.
The current set of tests is not even enough to cover some of the more complex scenarios (e.g. I run napari-workflows with two outputs, I remove one of the two from disk, I re-run with overwrite=False). I guess we cannot push testing too far, thus for the moment I'd be OK with the current trade-off.
Makes sense. I'd say we focus on testing behaviors that are fully within Fractal, not trying to cover scenarios where users do external modifications.
On the bright side, note that the new functions introduced in https://github.com/fractal-analytics-platform/fractal-tasks-core/pull/499 (the ones that currently belong to lib_zarr.py) have 100% coverage.
That's great! Those are the functions we'd encourage others to depend on fractal-tasks-core for :)
Makes sense. I'd say we focus on testing behaviors that are fully within Fractal, not trying to cover scenarios where users do external modifications.
Agreed.
Would there be a super basic napari workflow that could be run to test that? e.g. mock the workflow to just return a correctly shaped label img + table? If that runs fast, we could run the 3 scenarios with that and only run a real workflow once.
What is somewhat annoying about the napari-workflows task is that we cannot easily mock the functions it calls internally (e.g. voronoi_otsu_labeling). In principle we could create and install a fake Python package, with functions that can then be imported as steps in a napari workflow. I won't go too far in this exploration, but it'd be conceptually very useful (and if it works, then in the future we can also make the whole napari-workflows test module much faster).
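For instance, the fake package could expose trivial step functions like the one sketched below; the package layout and the function name are hypothetical, and the only real requirement is that the outputs have the right shape and type.

```python
# Hypothetical fake_napari_steps/__init__.py of a small installable test package,
# whose functions can be referenced as steps in a napari-workflows workflow.
import numpy as np


def fake_labeling(image: np.ndarray) -> np.ndarray:
    """Trivial stand-in for voronoi_otsu_labeling: return a label image with the
    right shape and dtype, containing a single dummy object."""
    labels = np.zeros(image.shape, dtype=np.uint32)
    # Mark a small central block as label 1, so that downstream steps
    # (e.g. measurements) still have an object to work on.
    center = tuple(slice(s // 4, s // 2) for s in image.shape)
    labels[center] = 1
    return labels
```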
Anyway, I think I was a bit overzealous, since I made the from-one-to-three-runs change in multiple tests. If the mock-package idea doesn't work, maybe there is still a bit of easy-to-achieve speed-up (at least 172 s --> 144 s).
In principle we could create and install a fake Python package, with functions that can then be imported as steps in a napari workflow. I won't go too far in this exploration, but it'd be conceptually very useful (and if it works, then in the future we can also make the whole napari-workflows test module much faster).
Sounds conceptually interesting, but quite involved. Maybe a bit too much for this PR. Alternatively, could we "just" mock this line to provide a trivial output, i.e. never have it run an actual workflow, just return the trivial outputs object we need?
outputs = wf.get(list_outputs)
We would still want to test once to actually run the workflows (to test that our inputs are fed in correctly etc.), but wouldn't need to run this multiple times.
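Something along these lines, perhaps; this is only a sketch, and both the patch target and the shape/structure of the fake outputs are assumptions that would have to match what the task actually expects.

```python
import numpy as np
from napari_workflows import Workflow


def test_task_with_mocked_workflow_get(monkeypatch):
    def fake_get(self, list_outputs):
        # Never run the actual workflow steps: return one trivial, correctly
        # shaped output per requested name (shape and dtype are placeholders).
        return [np.zeros((1, 540, 1280), dtype=np.uint32) for _ in list_outputs]

    # Patch Workflow.get so that `outputs = wf.get(list_outputs)` becomes trivial.
    monkeypatch.setattr(Workflow, "get", fake_get)

    # ...then call the napari-workflows task as in the existing tests.
```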
Sounds conceptually interesting, but quite involved. Maybe a bit too much for this PR.
I invested ~1h into this attempt, and got a napari-workflow test that uses a mocked regionprops and (locally) runs in 4 seconds instead of 32 seconds. If the GitHub CI works, it's a very good step forward. If not, I'll leave it where it is and pick it up in the future, and I'll try to mock wf.get for now.
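For context, the mocked regionprops is conceptually just a measurement step with the right output type. A rough sketch (not the actual code) could be:

```python
import numpy as np
import pandas as pd


def fake_regionprops(labels: np.ndarray) -> pd.DataFrame:
    """Hypothetical stand-in for a regionprops measurement step: return a tiny
    per-label table without doing any real measurements."""
    label_values = [int(v) for v in np.unique(labels) if v != 0]
    return pd.DataFrame(
        {
            "label": label_values,
            # Dummy constant measurements, just enough to exercise table writing.
            "area": [1.0] * len(label_values),
        }
    )
```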
Ah, nice! 🤞🏻
If the GitHub CI works [...]
Apparently it does, see https://github.com/fractal-analytics-platform/fractal-tasks-core/actions/runs/6025871733/job/16347563138.
I'll merge the proof-of-principle branch into #499, and then use it only for testing the overwrite-related behavior for napari-workflows.
Using this mocking approach more systematically to reduce CI times could be an idea for the future, but it's more complex (we'd need to be more careful to make sure that mocked functions return appropriate outputs).
This is already introducing some slow-down of the tests, since the task-execution time doubles (or worse). For instance, napari-workflow-related tests go from 82 seconds (in current main) to 172 seconds (in current https://github.com/fractal-analytics-platform/fractal-tasks-core/pull/499).
In #499 we are now down from 172 seconds to 92 seconds (locally), after moving the overwrite-related repeated runs into the test using a mocked regionprops package. That's close enough to the original 82 seconds, and I'll stop here with the optimization.
That's awesome! Agreed, that's enough optimization :)
Let's broaden the scope of #458 and #485, and introduce an overwrite parameter in most tasks.

General notes:
- Use overwrite consistently.
- overwrite argument (let's see whether it makes sense to have a custom error like OverwriteNotAllowedError, with a standard error message, to be used in several tasks).
- overwrite_input.

To do: