Closed: eroell closed this 4 months ago
Weeeird, the pull_request_target trigger still seems to appear, even though we fixed this to use the pull_request trigger in run_notebooks.yml.
But I can see that the notebooks which should work, triggered on pull_request, actually pass.
Might disappear once this is merged; let's see how this behaves in future PRs. Big thanks @flying-sheep @ilan-gold.
PR Checklist
docs is updated
Description of changes
Allow normalization methods to work with (dense) dask arrays. Suggest installing ehrapy[dask] for dependency management. Might make dask a dependency in the future.
Technical details
Switched from the scikit-learn functions (e.g. sklearn.preprocessing.scale) to the scikit-learn classes (e.g. sklearn.preprocessing.StandardScaler) for preprocessing, to be more in sync with dask-ml, which only has the classes option (e.g. dask_ml.preprocessing.StandardScaler). No user-facing effects; docs updated.
Additional context
Example, profiled with scalene (run the Python scripts below as scalene <scriptname>.py), demonstrating how ep.pp.scale_norm does not trigger the computations and is not a performance bottleneck:
In memory (numpy) array
Out-of-memory (dask) array
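Separate from the two profiled scripts, a minimal sketch of the function-to-class switch described under Technical details may help. It assumes scikit-learn and NumPy are installed and uses a made-up toy matrix X. It is not taken from the ehrapy code base, and only checks that sklearn.preprocessing.scale and sklearn.preprocessing.StandardScaler give the same result on an in-memory array.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, scale

# Hypothetical toy data (illustration only): 4 observations, 2 features.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])

# Function-style API: standardizes each column to zero mean and unit variance.
scaled_by_function = scale(X)

# Class-style API: fit the scaler on X, then transform X with the fitted statistics.
scaled_by_class = StandardScaler().fit_transform(X)

# Both routes should agree up to floating-point tolerance.
same = np.allclose(scaled_by_function, scaled_by_class)
print(same)
```

As stated in the description, dask-ml only provides the class form (e.g. dask_ml.preprocessing.StandardScaler), so using the class form for in-memory arrays as well keeps the two code paths aligned.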