spacetx / starfish

starfish: unified pipelines for image-based transcriptomics
https://spacetx-starfish.readthedocs.io/en/latest/
MIT License

Function to find the z planes that are in focus #798

Open kevinyamauchi opened 5 years ago

kevinyamauchi commented 5 years ago

Based on some discussion in the starfish-dev Slack channel, it sounds like there could be some interest in a function that computes a focus metric across z to determine which z planes are in focus and returns an ImageStack that contains just the in-focus tiles. A function such as this could greatly reduce computation time, as it is common to acquire a large z range to account for focus drift, etc.

I think this could either be a standalone pipeline component or a method of ImageStack. Perhaps the method could have a couple of simple built-in focus metrics, but also optionally accept a function as an input argument. I would be up for writing something like this, as it would definitely save us time. Thoughts?

dganguli commented 5 years ago

This would be a fantastic contribution!

I like the idea of it being implemented as a pipeline component that extends FilterAlgorithmBase. My reasoning here is twofold:

1. If auto-focus will be a step in a pipeline, it should have a CLI version (for workflow-runner compatibility), which you get for free by implementing FilterAlgorithmBase.
2. ImageStack is really about constructors, setters, getters, and visualization of an FOV -- it's not meant for doing canonical image processing with starfish.

Regarding 2, ImageStack.max_proj is a bit of an oddity. It's part of ImageStack (because that seems natural, it's also useful for visualization, and several other pieces of the codebase use it as a convenience method). But, it's also a part of FilterAlgorithmBase (because some pipelines simply throw away z-planes by taking a max proj) and thus it needs a CLI representation.

What if your auto_focus pipeline component implemented max projection as a special case? That would clean up the abstraction leakage of having max_proj live in both the CLI and ImageStack.

FYI @shanaxel42

ambrosejcarr commented 5 years ago

@kevinyamauchi are there any focus metrics that you were thinking about here? It would be helpful for me to be able to test them on SeqFISH (even outside a pipeline component).

berl commented 5 years ago

I've had some success with taking the max per plane of filters.gaussian_laplace from scipy.ndimage. When I use sigma around 1 and plot vs. z, it correlates very well with the number of spots found.
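A rough sketch of that metric on a plain NumPy z-stack (the function name and array shapes here are illustrative assumptions, not starfish API):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def glp_focus_scores(zstack, sigma=1.0):
    # Max Gaussian-Laplace response per z-plane; sharper planes score higher.
    # zstack is assumed to be a plain (z, y, x) ndarray, not an ImageStack.
    return np.array([gaussian_laplace(plane.astype(float), sigma=sigma).max()
                     for plane in zstack])

# Synthetic stack: low-amplitude noise everywhere, one sharp spot in plane 2.
rng = np.random.default_rng(0)
zstack = rng.normal(scale=0.001, size=(5, 64, 64))
zstack[2, 32, 32] = 1.0
scores = glp_focus_scores(zstack, sigma=1.0)
best_z = int(np.argmax(scores))  # plane 2 gives the strongest response
```

Plotting `scores` against z would give the focus-vs-depth curve described above.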

kevinyamauchi commented 5 years ago

Hey. Sorry, I lost track of this.

As @berl mentioned, using spatial-frequency filters similar to the ones used to enhance the spots is a good approach. Another classic for estimating image sharpness is the variance of the Laplacian.

I often get good results with something simple like the coefficient of variation or standard deviation of intensity in each z slice. This works well when the foreground pixels are a reasonable fraction of the field of view (e.g., the DAPI channel). It doesn't work as well for finding focus when the foreground pixels are a small fraction of the field of view (e.g., FISH spots). However, we could use the DAPI channel to find the in-focus z indices and pass them to .sel()
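A minimal version of that per-plane standard-deviation metric could look like this (plain NumPy arrays; the `frac` threshold is an arbitrary illustrative choice):

```python
import numpy as np

def std_focus_scores(zstack):
    # Standard deviation of intensity in each z-plane; a cheap focus proxy.
    return zstack.std(axis=(1, 2))

def in_focus_indices(zstack, frac=0.5):
    # Keep planes scoring at least `frac` of the best plane's score.
    scores = std_focus_scores(zstack)
    return np.flatnonzero(scores >= frac * scores.max())

# Toy stack: plane 1 has high contrast, planes 0 and 2 are nearly flat.
zstack = np.zeros((3, 8, 8))
zstack[1, ::2, ::2] = 1.0   # sparse bright pixels -> large std
zstack[2, :, :4] = 0.2      # low-contrast plane -> small std
keep = in_focus_indices(zstack)
```

The indices computed on a DAPI stack could then be used to select the same planes from the other channels, on the assumption that focus drift is shared across channels.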

ambrosejcarr commented 5 years ago

ImageJ has a plugin to do this that we could learn from: https://imagej.net/Microscope_Focus_Quality

kevinyamauchi commented 5 years ago

Oh wow, I totally forgot about this. I still think this is a good idea :P

Micromanager also has several image-based autofocus systems that we could likely implement as well.

ttung commented 5 years ago

Is it likely that we have many z-planes that are entirely useless? I wonder if it's possible to break each image into regions, calculate how in-focus each region is (contrast detection?), and combine them. Kind of like what Google is doing with its cameras.
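One hypothetical way to sketch that region-wise idea (the block size and the variance-of-Laplacian score are illustrative choices, not anything in starfish):

```python
import numpy as np
from scipy.ndimage import laplace

def blockwise_focus(plane, block=32):
    # Variance of the Laplacian in each (block x block) tile of one z-plane;
    # higher values indicate sharper local contrast in that region.
    h, w = plane.shape
    scores = np.zeros((h // block, w // block))
    for i in range(h // block):
        for j in range(w // block):
            tile = plane[i * block:(i + 1) * block,
                         j * block:(j + 1) * block]
            scores[i, j] = laplace(tile.astype(float)).var()
    return scores

# 64x64 plane with texture only in the top-left 32x32 tile.
plane = np.zeros((64, 64))
plane[:32:2, :32:2] = 1.0
scores = blockwise_focus(plane, block=32)
```

Combining the per-block maps across z would let you keep a different best plane per region rather than accepting or discarding whole planes.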

ambrosejcarr commented 5 years ago

We are likely to have many z-planes that are entirely useless.

berl commented 5 years ago

I agree that there are likely to be out-of-focus planes with no data, but throwing the raw data away should be done very carefully so that:

  1. we don't delete something important
  2. we don't confuse ourselves later when one round or channel has fewer/different z positions than another in the same or different FOVs.

also, I hope we don't need to add tensorflow as a dependency for this purpose.

ttung commented 5 years ago

I would assume that we would just skip further processing with the planes that don't have useful information. starfish should never overwrite its inputs.

What is the concern with tensorflow as a dependency?

berl commented 5 years ago

> What is the concern with tensorflow as a dependency?

Bloat.

Using focus data as a mask (or whatever you want to call it) to skip further processing could be very useful and save compute time. Would it be able to save memory?

ttung commented 5 years ago

> Bloat.

What if it's an optional dependency?

> Using focus data as a mask (or whatever you want to call it) to skip further processing could be very useful and save compute time. Would it be able to save memory?

It depends on the processing workflow. The most obvious strategy would be to save the relevant tiles to disk, and load them back up as a new ImageStack. In that scenario, we would indeed save memory.

Alternate strategies would require significant complexity, or a phase in the compute pipeline where we would need to keep both the unpruned and the pruned datasets in memory.