msk-mind / luna

Scripts for data processing
https://msk-mind.github.io/luna
Apache License 2.0

fix: #1629 improve API for use with dask #406

Closed by raylim 11 months ago

raylim commented 1 year ago

This makes the Slide Manifest (from slide_etl) the main data frame of record for a project. Additional columns are added for annotations, and the tiles_url column is modified when tiles are generated, tissue is detected, and tiles are saved in the h5 store. Dask is used at the slide and tile level with batches for parallelizing when possible.
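As a rough illustration of the batching pattern described above (not the actual luna API), here is a minimal sketch. The manifest is represented as a plain list of row dicts standing in for the data frame. The `id` and `url` columns, the file paths, and the helpers `generate_tiles`, `process_batch`, `batched`, and `run` are all hypothetical; only the `tiles_url` column name comes from the description. Batches are scheduled with `dask.delayed` when Dask is installed, and the sketch falls back to serial execution otherwise.

```python
try:
    import dask  # optional: without it the sketch runs the batches serially
except ImportError:
    dask = None

# Hypothetical manifest rows; in the PR this role is played by the
# Slide Manifest data frame produced by slide_etl.
manifest = [
    {"id": "slide_a", "url": "file:///slides/slide_a.svs", "tiles_url": None},
    {"id": "slide_b", "url": "file:///slides/slide_b.svs", "tiles_url": None},
    {"id": "slide_c", "url": "file:///slides/slide_c.svs", "tiles_url": None},
]


def generate_tiles(row):
    """Hypothetical stand-in for tile generation.

    Returns a copy of the row with tiles_url pointing at the (made-up)
    location where the tiles would have been written.
    """
    updated = dict(row)
    updated["tiles_url"] = f"file:///tiles/{row['id']}.tiles.h5"
    return updated


def process_batch(rows):
    """Process one batch of slides and return the updated rows."""
    return [generate_tiles(row) for row in rows]


def batched(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run(rows, batch_size=2):
    """Update every row's tiles_url, one Dask task per batch when available."""
    batches = batched(rows, batch_size)
    if dask is not None:
        tasks = [dask.delayed(process_batch)(batch) for batch in batches]
        results = dask.compute(*tasks)  # tuple with one result per batch
    else:
        results = [process_batch(batch) for batch in batches]
    # Flatten the per-batch results back into a single manifest.
    return [row for batch in results for row in batch]


updated_manifest = run(manifest)
```

Returning updated copies rather than mutating rows in place keeps each task free of shared state, which is generally the safer choice when work may run on multiple threads or workers.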

Currently, tile tissue inference doesn't work with Dask, because it needs a Dask-ML compatible library such as Skorch.

We also need some additional unit tests for the API functions.

armaank commented 11 months ago

Also, do you know if this resolves #407?

raylim commented 11 months ago

> Also, do you know if this resolves #407?

I'm not sure. I would have to test it with the same slides that Darin used.

armaank commented 11 months ago

Great, thanks for making those changes!