Closed grst closed 10 months ago
Attention: 4 lines
in your changes are missing coverage. Please review.
Comparison is base (
bd168a9
) 80.52% compared to head (5255317
) 80.50%.
Files | Patch % | Lines |
---|---|---|
src/scirpy/util/__init__.py | 62.50% | 3 Missing :warning: |
src/scirpy/ir_dist/metrics.py | 91.66% | 1 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
In general this works well (also out-of-machine computing with dask), but:
verbosity
to ~5 to get some updates printed to the console. To make this nicer, maybe I can check if the backend is supported. for some reason it takes ~15 seconds to spawn the loky worker threads. When rerunning the function, this is instant. Not sure what the threads are doing initially.
the reason was that the worker threads all import scirpy from scratch. It turned out to be quite slow because of import airr
. I now deferred importing the airr
package until it's needed (only in some IO functions) and this brought down the import time to ~1s which is acceptable and mostly due to scanpy
.
@felixpetschko fyi
In this PR I change how the ParallelDistanceCalculator
handles parallel processing. It now uses joblib.Parallel that (amongst others) allows to use dask
as a backend for multi-machine computing.
But I think your reimplementation of the hamming distance calculator anyway doesn't need this.
Closes #468