funkelab / daisy

Block-wise task scheduling for large nD volumes.
MIT License

Update tests.yaml #49

Closed: rhoadesScholar closed this 1 week ago

rhoadesScholar commented 4 months ago

Add platform tests

pattonw commented 3 months ago

Getting daisy to pass tests on macOS and Windows will probably be pretty challenging. The main blocker is the transition from the multiprocessing start method "fork" to the start method "spawn", which requires everything handed to a worker process to be picklable. The main problems that would need to be fixed:

1) No more lambda functions, in basically any multiprocessing context. This means almost all lambda functions other than those in the scheduler/dependency graph need to be removed.

2) No local functions. This is a bit harder to remove.
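To see why item 1 bites, here is a minimal, self-contained demonstration (plain Python, no daisy): pickle serializes functions by reference to an importable module-level name, so neither lambdas nor nested functions can be reconstructed in a freshly spawned interpreter.

```python
import pickle

# Functions pickle by qualified name, not by code object.
try:
    pickle.dumps(lambda x: x + 1)  # a lambda has no importable name
except Exception as e:
    print(f"lambda: {type(e).__name__}: {e}")

def outer():
    def local():  # nested, so not reachable by qualified name either
        pass
    return local

try:
    pickle.dumps(outer())
except Exception as e:
    print(f"local function: {type(e).__name__}: {e}")
```

Back to item 2: something like this is a pretty reasonable function to expect from someone using daisy: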

```python
import subprocess

import daisy
from daisy import Roi


def create_task(worker_file, num_workers):

    def start_worker():  # local function: closes over worker_file
        subprocess.run(["python", worker_file])

    task = daisy.Task(
        "test_task",
        total_roi=Roi((0,), (42,)),
        read_roi=Roi((0,), (10,)),
        write_roi=Roi((1,), (8,)),
        process_function=start_worker,
        check_function=None,
        read_write_conflict=True,
        fit="valid",
        num_workers=num_workers,
        max_retries=2,
        timeout=None,
    )
    return task
```
However, with start method "spawn" this fails: the `start_worker` function is local to `create_task` and thus can't be pickled. You would have to restructure this into two top-level functions, which means you have to pass the variable `worker_file` in as an argument, like so:
```python
import subprocess

import daisy
from daisy import Roi


def start_worker(worker_file):
    subprocess.run(["python", worker_file])


def create_task(worker_file, num_workers):
    task = daisy.Task(
        "test_task",
        total_roi=Roi((0,), (42,)),
        read_roi=Roi((0,), (10,)),
        write_roi=Roi((1,), (8,)),
        # Bug: this *calls* start_worker immediately and passes its
        # return value (None) instead of passing a callable.
        process_function=start_worker(worker_file),
        check_function=None,
        read_write_conflict=True,
        fit="valid",
        num_workers=num_workers,
        max_retries=2,
        timeout=None,
    )
    return task
```
However, this also doesn't work, because you are no longer passing in a callable: `start_worker(worker_file)` runs the worker immediately and passes its return value, `None`, to `process_function`. The only solution I can come up with would be to adapt daisy to use an API more similar to the multiprocessing `Process` API, where you define your target function and its args separately: `Process(target=f, args=('bob',))`. This would be more verbose for the `check_function` and `process_function` args than simple lambdas.
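As an aside, `functools.partial` already provides that target/args split in a picklable form: a partial wrapping a top-level function survives pickling under "spawn", unlike a lambda or closure. A minimal sketch, not daisy's current API, with `worker.py` as a hypothetical script name:

```python
import pickle
import subprocess
from functools import partial


def start_worker(worker_file):
    subprocess.run(["python", worker_file])


# A partial wrapping a top-level function pickles fine, unlike a closure,
# so it would survive the round trip into a spawned worker process:
fn = partial(start_worker, "worker.py")
restored = pickle.loads(pickle.dumps(fn))
print(restored.func.__name__, restored.args)  # start_worker ('worker.py',)
```

Whether daisy should accept such partials or grow explicit `target=`/`args=` parameters is exactly the API question raised above.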

3) I think we have to remove double-underscore methods. `def __do_something(self)` defines a method whose name gets mangled, so it must be called as `self._MyClass__do_something()`. Pickling often seems to fail on double-underscore methods because it tries to look them up as written and does not apply the mangling (see the short demo after this list).

4) After dealing with the first three, I still run into cryptic `FAILED tests/test_server.py::TestServer::test_basic - TypeError: cannot pickle 'module' object` errors. I don't know where they come from, but that also needs to be solved.
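To make the mangling in item 3 concrete, a standalone illustration using the `MyClass`/`__do_something` names from above:

```python
class MyClass:
    def __do_something(self):  # stored in the class dict as _MyClass__do_something
        return "done"

    def run(self):
        return self.__do_something()  # inside the class body, mangling is applied


obj = MyClass()
print(obj.run())                     # done
print(obj._MyClass__do_something())  # done: the mangled name works
try:
    obj.__do_something()             # fails: no mangling outside the class
except AttributeError as e:
    print(e)
```

Anything that stores or looks up the method by its written name, as pickle does, hits the same `AttributeError` as the last call.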

cmalinmayor commented 3 months ago

@rhoadesScholar How important is supporting Windows or macOS for DaCapo?