adding robust signal/noise, skew and tests

arielleleon commented 8 months ago

I ported over the robust noise and signal computation from visual_behavior_analysis. Wrt the robust_noise calculation, it looks like there may be some overlap between what Johannes has added and what visual_behavior_analysis does. I am thinking we can combine these, given that I will be moving the df / f calculation from this package into Code Ocean. The current ophys_etl df / f is very slow.

My questions are:

1. Do we want to add the robust signal and skew metrics to Johannes' current df / f algo, or do we want to run Johannes' df / f module and run robust signal and skew as a post-processing step from the capsule? I am in favor of the latter but I am curious what you two think.
2. The noise_std that Johannes does is different than the one from the vba package. Is there any code that Johannes wrote that we could use instead of vba?

Thanks!

j-friedrich commented 8 months ago

The dff_robust_noise(dff_trace) shares some similarity with the 'mad' method already implemented in noise_std that I took from ophys_etl's dff. I found the VBA version to be less robust than any of the 3 methods ('mad', 'fft', 'welch') already implemented and would just use any of those instead.
(The sensitivity analysis in the figure shows that the other methods are always closer to the correct green line)
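
For readers unfamiliar with MAD-based noise estimates, here is a minimal sketch of the general idea. It is not the noise_std implementation from this package: the helper name mad_noise_std and the synthetic trace are assumptions made purely for illustration. The 1.4826 factor is the standard constant that makes the median absolute deviation consistent with the standard deviation for Gaussian noise.

```python
import numpy as np


def mad_noise_std(trace):
    """Hypothetical helper: estimate the noise std of a trace from its
    median absolute deviation (MAD), ignoring NaNs."""
    trace = np.asarray(trace, dtype=float)
    trace = trace[~np.isnan(trace)]
    mad = np.median(np.abs(trace - np.median(trace)))
    # 1.4826 scales the MAD to the std of a Gaussian distribution
    return 1.4826 * mad


# Synthetic trace (assumption): Gaussian noise with std 0.1 plus sparse transients
rng = np.random.default_rng(0)
trace = rng.normal(0.0, 0.1, size=10_000)
trace[::200] += 2.0

robust_estimate = mad_noise_std(trace)  # stays close to the true noise level
naive_estimate = np.std(trace)          # inflated by the transients
```

The plain standard deviation is pulled up by the transients, while the MAD-based estimate is largely insensitive to them, which is why this family of estimators suits dF/F traces.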

Don't see a reason to define copmute_skew (sic) which merely calls scipy.stats.skew and adds a bunch of lines of documentation.

compute_robust_snr_on_dataframe is highly specific, nobody will reuse this function, thus would just delegate it to the capsule.

matchings commented 8 months ago

I'm good with using @j-friedrich's better noise estimate. Thanks for the analysis showing that it is more accurate. Would we also be replacing the signal calculation with @j-friedrich's version?

I know it's only one line of code to compute skew, but having it there so that the capsule computes it and includes it as a metric in the processing.json, and as a plot of distribution of skew values in the results for each experiment, allows us to use this for QC on a routine basis in the same way we will QC all other metrics output by processing. Whether this goes in aind-ophys-utils or only in the capsule itself doesn't matter to me.

arielleleon commented 8 months ago

Thank you both for comments - I am happy to go with the noise estimation used in Johannes' method.

Echoing what Marina said - adding skew to a function was a way of making it obvious how we were computing skew (assuming there are other ways to do it, which I could be totally wrong about). It's nice to be able to grab the skew of the trace and pack it in with the dff trace like the vba package did with noise and signal. We can pick and choose in different versions of the pipeline which analyses we package with the data...idk just spit balling here.

j-friedrich commented 8 months ago

Oh, I was never opposed to computing skewness, am all for computing it as QC measure.

I just don’t see the need nor benefit to define our own function that doesn’t add any functionality to what’s already perfectly there in scipy. Why is scipy.stats.skew(dff_trace, nan_policy="omit") worse than aind_ophys_utils.signal_utils.copmute_skew(dff_trace)?
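
A minimal sketch of what the direct scipy call does with a trace containing NaNs. The trace values are made up for illustration; only scipy.stats.skew and its nan_policy argument come from the discussion above.

```python
import numpy as np
from scipy.stats import skew

# Made-up dF/F trace with missing samples and one large transient
dff_trace = np.array([0.0, 0.1, np.nan, 0.05, 2.0, 0.0, np.nan, 0.1])

# Default nan_policy="propagate": any NaN makes the result NaN
skew_default = skew(dff_trace)

# nan_policy="omit": NaNs are dropped before computing the skewness
skew_omit = float(skew(dff_trace, nan_policy="omit"))

# Equivalent to removing the NaNs by hand
skew_manual = float(skew(dff_trace[~np.isnan(dff_trace)]))
```

With nan_policy="omit" the result matches filtering the NaNs manually, so the one-liner already covers the missing-data case.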

matchings commented 8 months ago

I agree that just using the scipy function makes more sense, no need for a special function for it

arielleleon commented 8 months ago

You got it guys I will pull it out XD

arielleleon commented 8 months ago

@matchings @j-friedrich

Can you both tell me how this plan sounds? I will close this PR and delete the branch, then create a new capsule using @j-friedrich's new df / f calculation with his median filter and noise calculation. Within the capsule, I will also append the skewness of the cell signals to the dff.h5.
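
A rough sketch of the last step, assuming the traces are stored as a 2D array of shape (n_rois, n_timepoints). The dataset names "data" and "skewness", the helper append_skewness, and the synthetic traces are all hypothetical; the actual layout of dff.h5 in the capsule may differ.

```python
import os
import tempfile

import h5py
import numpy as np
from scipy.stats import skew


def append_skewness(h5_path, dff_key="data", skew_key="skewness"):
    """Hypothetical helper: compute per-ROI skewness of the dF/F traces
    and store it in the same HDF5 file. Dataset names are assumptions."""
    with h5py.File(h5_path, "a") as f:
        dff = f[dff_key][()]  # assumed shape: (n_rois, n_timepoints)
        values = np.asarray(skew(dff, axis=1, nan_policy="omit"), dtype=float)
        if skew_key in f:
            del f[skew_key]  # overwrite a previous result
        f.create_dataset(skew_key, data=values)
    return values


# Synthetic example: 3 ROIs, only the first one has transients
rng = np.random.default_rng(0)
dff = rng.normal(0.0, 0.1, size=(3, 1000))
dff[0, ::50] += 2.0

path = os.path.join(tempfile.mkdtemp(), "dff.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("data", data=dff)

skewness = append_skewness(path)

# Read the stored values back to confirm they were written
with h5py.File(path, "r") as f:
    stored = f["skewness"][()]
```

Keeping the skewness in the same file as the traces means downstream QC can read both from one place.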

j-friedrich commented 4 months ago

Closing, as this is now handled in the dff capsule

AllenNeuralDynamics / aind-ophys-utils

adding robust signal/noise, skew and tests #50