AI4S2S / lilio

Calendar generator for machine learning with timeseries data
https://lilio.readthedocs.io/en/latest/
Apache License 2.0

Enable dask resampling #52

Closed BSchilperoort closed 1 year ago

BSchilperoort commented 1 year ago

This PR modifies the xarray resampling routine to allow for parallelized dask computation.

A quick guide on getting started with Lilio + Dask is added to the documentation. Linking to this from the resampling notebook would be nice, but currently the link would be dead (until this PR is merged).
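To illustrate the idea of dask-backed resampling, here is a minimal sketch using only xarray and dask. It is not Lilio's actual resampling routine: the `interval_mean` helper, the synthetic `temperature` data, the chunk sizes, and the interval dates are all assumptions made for illustration. The point is that once the input is chunked, the per-interval means stay lazy until `.compute()` is called, so dask can process the spatial chunks in parallel.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical daily data on a small grid (stand-in for a large netCDF dataset).
time = pd.date_range("2020-01-01", "2021-12-31", freq="D")
data = xr.DataArray(
    np.random.default_rng(0).normal(size=(time.size, 4, 5)),
    coords={"time": time, "latitude": np.arange(4), "longitude": np.arange(5)},
    dims=("time", "latitude", "longitude"),
    name="temperature",
)

# Chunk along space and keep time in a single chunk, so that reductions
# over time never have to cross chunk boundaries.
lazy = data.chunk({"time": -1, "latitude": 2, "longitude": 5})


def interval_mean(da, start, end):
    """Mean over the half-open interval [start, end) along the time dimension."""
    mask = (da["time"] >= np.datetime64(start)) & (da["time"] < np.datetime64(end))
    return da.where(mask).mean("time")


# Hypothetical calendar intervals: two consecutive 7-day windows.
intervals = [("2020-12-01", "2020-12-08"), ("2020-12-08", "2020-12-15")]

# Building the result only constructs a dask task graph; nothing is computed yet.
lazy_means = xr.concat(
    [interval_mean(lazy, start, end) for start, end in intervals],
    dim="interval",
)

# The actual (parallelizable) computation happens here.
result = lazy_means.compute()
```

With real data, the chunked input would typically come from opening netCDF files with a `chunks` argument rather than from calling `.chunk()` on an in-memory array.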

BSchilperoort commented 1 year ago

Hey @geek-yang , would you want to go over the guide & try it out?

I do still want to add tests for dask; however, this will require generating some dummy netcdf files.

geek-yang commented 1 year ago

OK, let me play with it first then. Thanks for the nice work 💯!

BSchilperoort commented 1 year ago

Thanks for the review! I am not sure about a notebook, as configuring for dask is quite specific to the use-case, and we would need to have example data stored somewhere, as well as more complicated docs dependencies.

Currently I have added a hyperlink to the docs page on Dask. The docs page is also right next to the example notebooks.

sonarcloud[bot] commented 1 year ago

SonarCloud Quality Gate passed.

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

Coverage: 90.9%
Duplication: 0.0%

geek-yang commented 1 year ago

OK, sounds good. I think I can put a nice example in the recipe (I need to use it anyway), as dask is indeed use-case specific and only needed for some advanced use cases with huge datasets. That should be enough for now.