fix: #1629 improve API for use with dask

raylim commented 1 year ago

This makes the Slide Manifest (from slide_etl) the main data frame of record for a project. Additional columns are added for annotations, and the tiles_url column is updated as tiles are generated, tissue is detected, and tiles are saved in the h5 store. Where possible, Dask parallelizes the work in batches at both the slide and tile level.
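
To illustrate the slide-level pattern described above, here is a minimal sketch, not the code in this PR. The manifest contents, the `generate_tiles` helper, and the file paths are all hypothetical; only `dask.delayed`, `dask.compute`, and the pandas calls are real APIs. It builds one lazy task per manifest row, runs them together, and writes the results back into a `tiles_url` column.

```python
import dask
import pandas as pd

# Hypothetical stand-in for the slide manifest produced by slide_etl.
manifest = pd.DataFrame(
    {
        "slide_id": ["s1", "s2", "s3"],
        "url": [
            "file:///slides/s1.svs",
            "file:///slides/s2.svs",
            "file:///slides/s3.svs",
        ],
    }
)


@dask.delayed
def generate_tiles(slide_id: str, url: str) -> str:
    """Hypothetical placeholder for tile generation.

    A real implementation would read the slide at `url` and write tiles
    to a store; here we only return where the tiles would be saved.
    """
    return f"file:///tiles/{slide_id}.tiles.h5"


# Build one lazy task per slide; nothing runs until dask.compute is called.
tasks = [
    generate_tiles(row.slide_id, row.url)
    for row in manifest.itertuples(index=False)
]

# Run all slide-level tasks; results come back in task order.
tiles_urls = dask.compute(*tasks, scheduler="threads")

# Record the output location for each slide in the manifest.
manifest = manifest.assign(tiles_url=list(tiles_urls))
```

Keeping the manifest as the single record means each later step (tissue detection, saving to the h5 store) could follow the same shape: map a delayed function over rows, compute, and overwrite `tiles_url`.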

Currently, the tile tissue inference doesn't work with Dask because it requires a Dask-ML-compatible library, i.e., Skorch.

We also need some additional unit tests for the API functions.

armaank commented 11 months ago

Also, do you know if this resolves #407?

raylim commented 11 months ago

> Also, do you know if this resolves #407?

I'm not sure. I would have to test it with the same slides that Darin used.

armaank commented 11 months ago

Great, thanks for making those changes!