fractal-analytics-platform / fractal-client

Command-line client for Fractal
https://fractal-analytics-platform.github.io/fractal-client
BSD 3-Clause "New" or "Revised" License

Image labeling at FOV level #64

Closed: tcompa closed this issue 2 years ago

tcompa commented 2 years ago

Hi @gusqgm and @jluethi, let's discuss here the details of the image labeling task.

jluethi commented 2 years ago

Great summary of the discussion at yesterday's meeting @tcompa :)

This suggests that:

  1. It is beneficial to add a little bit of same-GPU parallelism.
  2. This won't scale beyond small batches of 2-3 sites. For larger datasets (i.e. more Z planes), the maximum allowed batch_size could even be one.

To me, this suggests that parallelization is interesting, but either it needs to be a user parameter (easy) or we need to be able to estimate the load (hard). I'd go for a user parameter to start with. Then we can collect some experience on what works well and maybe come up with good heuristics for a given model or similar. It's also really good if this is something that can be varied in this task, because a user may e.g. decide to run cellpose at pyramid layer 1 (once this becomes possible using e.g. the ROI approach described above) and could then potentially run ~10 sites at once (given 10 Z layers).
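To make the "user parameter" option concrete, here is a minimal sketch of splitting a list of sites into batches of a user-chosen size. The function name `iter_batches` and the site names are made up for illustration; only `batch_size` comes from the discussion above, and nothing here is taken from the actual task code.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(sites: Sequence[T], batch_size: int = 1) -> Iterator[List[T]]:
    """Yield consecutive groups of at most `batch_size` sites.

    `batch_size` is the user parameter: 1 means one site per GPU call,
    larger values mean several sites are processed together.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(sites), batch_size):
        yield list(sites[start:start + batch_size])


# Hypothetical site names, for illustration only
sites = [f"site_{i}" for i in range(7)]
batches = list(iter_batches(sites, batch_size=3))
for batch in batches:
    print(batch)
```

With `batch_size=1` this falls back to fully sequential processing, which matches the case where larger datasets only fit one site at a time.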

=> The flexibility is conceptually interesting. When and how we'll use it remains a bit of an open question, though, which I'd assume we'll figure out later.

tcompa commented 2 years ago

As of 60a27228f0fe3c2ebcf6ee0d1f317f36b9d798b6, the first image-labeling use case is roughly complete. It works for 2D or 3D images, always at the per-FOV level, and it has a relabeling=True/False argument. The labeling happens lazily over all FOV columns, and we can limit the number of concurrent cellpose executions with a num_threads argument. At the moment, relabeling is not done in a memory-efficient way.
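As a rough illustration of the two arguments mentioned above, the sketch below caps the number of concurrent per-FOV calls with `num_threads` and optionally shifts labels so that they are unique across FOVs. It is not the code from the commit: the thread pool from the standard library, the `label_fovs` and `fake_segment` names, and the nested-list "images" are all assumptions made for this example, and `fake_segment` is only a stand-in for a real segmentation call. Like the current implementation, it keeps all results in memory before relabeling.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Labels = List[List[int]]  # tiny 2D label image; 0 means background


def fake_segment(fov: Labels) -> Labels:
    """Hypothetical stand-in for a per-FOV segmentation call.

    Each toy FOV is already a label image, so it is simply copied.
    """
    return [row[:] for row in fov]


def label_fovs(
    fovs: List[Labels],
    segment: Callable[[Labels], Labels],
    num_threads: int = 2,
    relabeling: bool = True,
) -> List[Labels]:
    """Label each FOV, with at most `num_threads` calls running at once."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map returns results in the same order as the inputs
        results = list(pool.map(segment, fovs))
    if not relabeling:
        return results

    # Shift non-zero labels of each FOV by the number of labels seen so far,
    # so that labels do not collide across FOVs. All results are held in
    # memory here, so this is not memory-efficient either.
    relabeled = []
    offset = 0
    for labels in results:
        relabeled.append(
            [[v + offset if v > 0 else 0 for v in row] for row in labels]
        )
        offset += max((v for row in labels for v in row), default=0)
    return relabeled


# Three toy FOVs, each with its own labels starting from 1
fovs = [
    [[0, 1], [2, 2]],
    [[1, 0], [0, 1]],
    [[0, 0], [1, 2]],
]
unique_labels = label_fovs(fovs, fake_segment, num_threads=2, relabeling=True)
local_labels = label_fovs(fovs, fake_segment, num_threads=2, relabeling=False)
```

With `relabeling=False` each FOV keeps its own labels starting from 1, so the same label value can appear in several FOVs.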

There are limits related to memory usage (different for labeling and relabeling) and runtimes (related to how many FOVs can be processed in parallel), which are better discussed in a new issue:

Issues related to the next use cases (whole-well segmentation in 2D and ROI-based scheme) are:

I'm closing this one.