Add optional parameters `plot_path` and `threshold_FWHM` to `focus_from_transverse_band`

talonchandler commented 12 months ago

Summary This PR fixes #129 and adds:

an optional plot_path option...passing a path string will save a diagnostic plot to that path (.png, .svg, .pdf etc)
an optional threshold_FWHM option (default 0) will make the function return an in-focus slice if the mid-band power peak has a FWHM larger than the threshold...otherwise the function will return None.

Rationale: After my initial conversation with @ieivanov, I looked for a way to "normalize" the mid-band power metric so that we could try using an absolute threshold. I ended up convincing myself that finding an absolute threshold would be challenging.

For example, consider an in-focus object that fills the FOV then the same object that fills a small part of the FOV (padded with zeros/noise). If we normalize by the number of pixels in the FOV then the mid-band power appears to change between these two objects, when we'd like to focus on them both equally well. I played with a few alternative normalizations, and ultimately decide to try a different approach.

I ended up settling on the threshold_FWHM approach implemented here. The rationale is that noise will typically have peaks with smaller FWHM than real axially-extended in-focus objects, so we can reject "in-focus noise" by thresholding on the FWHM. Larger FWHM peaks are also associated with peaks that have tighter sampling (i.e. we have more confidence that the object is there) and that are away from the edges (a peak at the very edge has zero FWHM by definition in the current implementation).

Performance on real data Caption: four focus-finding diagnostic plots generated with real data (2023-07-03). Rows: two positions. Columns: whole FOV and a hand-selected background region filled with noise. Green dots indicate an in-focus peak returned by the function, while red dots indicate a rejected peak (the function returns None).

Performance on synthetic data Caption: an SNR sweep on simulated data. On average, the peak width for an in-focus object decreases as the SNR decreases, so thresholding on the peak width is an appropriate strategy.

Comment on (lack of) danger to hardware This approach uses data that's already been acquired to estimate in-focus positions, so it does not introduce any additional danger of crashing. Increasing the threshold_FWHM increases the safety since it reduces the likelihood of choosing slices that are at the edge of the axial FOV, which reduces the chance of repeated run leading to a crash.

Next steps @ieivanov if you find that threshold_FWHM does not provide the discriminatory power you need, we can iterate and try different threshold/peak finders.

ieivanov commented 11 months ago

Thanks @talonchandler! I'll give this a try and will report back.

As a possible next iteration - I think the MATLAB findpeaks does a very good job. Peak prominence is a metric which relates the peak FWHM and the peak height above the "background".

This seems to be a scipy implementation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html

talonchandler commented 11 months ago

Roger that, thanks!

FWIW here I'm using scipy.peak_widths which can be easily swapped out for scipy.peak_prominences or whatever other metric you think we should try on a single peak.

The function you mention scipy.find_peaks seems to give an unranked list of all of the peaks that meet the criteria you give it, so it might not be directly applicable here.

ieivanov commented 11 months ago

You're correct.

I've been thinking some more about this, and it seems to me that the peak prominences may not be the best metric as it would prioritize tall and skinny peaks. In our application we want features which come in and out of focus slowly, as you said, and thus have a wide FWHM. I think something like the peak "area" - width * abs_height, would be a better metric that will keep us away from close calls like in the bottom right. I'll test today and will report back.

I think the strategy of finding the highest peak and evaluating peak metrics, as we do now, is right, rather than using peak_find.

ieivanov commented 11 months ago

I've convinced myself that the FWHM is the right parameter here. Let's merge this! Thanks again.

mehta-lab / waveorder

Add optional parameters `plot_path` and `threshold_FWHM` to `focus_from_transverse_band` #132