AllenInstitute / brain_observatory_qc


Variability in duration between image presentation #30

Closed: alexpiet closed this issue 10 months ago

alexpiet commented 3 years ago

Describe the bug

I found a Scientifica session in the release data that has high variability between image presentations. The duration between image presentations should be 750 ms. Some variability is expected due to dropped frames and timing alignment between the computer generating the stimulus and the monitor, but this session has 54 images with a duration either longer than 800 ms or shorter than 700 ms.

CAM2P.3 OPHYS_3_images_A 2019-04-24 Slc17a7-IRES2-Cre V1, 375um depth OEID: 856938751 OSID: 856295914

This is what a typical session looks like: good_session

This is the bad session: bad_session

To Reproduce

```python
import matplotlib.pyplot as plt

# `session` is a session object with a stimulus_presentations table, e.g. loaded via the AllenSDK
session.stimulus_presentations['diff'] = session.stimulus_presentations.start_time.diff()
plt.hist(session.stimulus_presentations['diff'].dropna(), bins=100)  # drop the NaN from the first row
plt.yscale('log')
plt.axvline(0.75, color='r')
```

Solution

I don't know what the threshold should be for how much variability we tolerate, but I feel this session is past it. I think the QC report could generate the figure above, or at least count how many images fall outside a ±50 ms window around the expected inter-image duration.
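A minimal sketch of that count, assuming `stimulus_presentations` is the pandas DataFrame used in the reproduction snippet above (with a `start_time` column); the 750 ms expectation and ±50 ms window are the values described in this issue:

```python
import pandas as pd

def count_outside_window(stimulus_presentations: pd.DataFrame,
                         expected: float = 0.750,
                         tolerance: float = 0.050) -> int:
    """Count inter-image intervals more than `tolerance` seconds away from `expected`."""
    diffs = stimulus_presentations['start_time'].diff().dropna()
    outside = (diffs < expected - tolerance) | (diffs > expected + tolerance)
    return int(outside.sum())
```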

Additional context

SDK issue requesting that the session be flagged for removal: https://github.com/AllenInstitute/AllenSDK/issues/2252

DowntonCrabby commented 3 years ago

@alexpiet Do you have any estimate of how frequently this issue occurs? Were you systematically looking through a large group of data and this was the only case you found, or was it more of a one-off where you were looking at something else, thought something seemed weird, and came across this issue?

alexpiet commented 3 years ago

I checked systematically for sessions where the duration between image presentations was less than 750 - 0.5*(duration between ophys frames) ms.

About 74 sessions in the platform dataset have at least one image that fails that test.

Only this session has more than 5 that fail that test.

I'm not sure what the criterion should be for a QC fail, but I think this example session is clearly an outlier.
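A hedged sketch of the test described above; the ophys frame rate is passed in as an assumed input rather than read from the session object, since where it lives in the session metadata is not specified here:

```python
import pandas as pd

def count_short_intervals(stimulus_presentations: pd.DataFrame,
                          ophys_frame_rate_hz: float) -> int:
    """Count inter-image intervals shorter than 750 ms minus half an ophys frame."""
    frame_interval = 1.0 / ophys_frame_rate_hz    # seconds between ophys frames
    threshold = 0.750 - 0.5 * frame_interval      # lower bound on acceptable diff
    diffs = stimulus_presentations['start_time'].diff().dropna()
    return int((diffs < threshold).sum())
```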

DowntonCrabby commented 3 years ago

Thanks, this is super helpful information. Do you happen to have a list of those 74 sessions so we can start looking at them and see whether this is related to the overall stimulus timing or is another issue entirely? It would help us determine thresholds, I think.

alexpiet commented 3 years ago

Sorry, I didn't save that list of 74 sessions. I should clarify that it was 74 experiments, some of which were probably multiple experiments from the same mesoscope session.

DowntonCrabby commented 3 years ago

To add more context, here is the stimulus timing report for the session Alex outlined above: 856295914_StimulusTimingReport

To me this suggests that the issue is related to the stimulus timing being off due to long frames, and it is further evidence that we need a threshold for "Visual Stim Dropped Events", which we don't currently have.

I will try to find the other 74 experiments where this occurred and determine if the stimulus frame timing is similarly poor with those.

DowntonCrabby commented 3 years ago

Okay here is a rather big update on when & how this issue occurs. As a caveat: I only looked at experiments that had been released, meaning they passed QC. This obviously biases the data here and doesn't give us a true sense of the frequency & severity of this issue overall, but only as it relates to the datasets we've already released.

I went through all the sessions in the ophys_session_table from the SDK (a rough sketch of the sweep is below), and here is what I found:
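Roughly what that sweep could look like; the AllenSDK calls shown here (`VisualBehaviorOphysProjectCache.from_s3_cache`, `get_ophys_experiment_table`, `get_behavior_ophys_experiment`) reflect my understanding of the project-cache API and may differ between SDK versions, and the 700 to 800 ms acceptable range matches the window discussed above:

```python
import pandas as pd
from allensdk.brain_observatory.behavior.behavior_project_cache import (
    VisualBehaviorOphysProjectCache,
)

# NOTE: this loads every released experiment, which downloads a lot of data;
# subset the table first if you only want to spot-check.
cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir='/tmp/vbo_cache')
experiments = cache.get_ophys_experiment_table()

records = []
for oeid in experiments.index:
    exp = cache.get_behavior_ophys_experiment(oeid)
    diffs = exp.stimulus_presentations['start_time'].diff().dropna()
    n_outside = int(((diffs < 0.700) | (diffs > 0.800)).sum())
    records.append({'ophys_experiment_id': oeid, 'n_outside_range': n_outside})

summary = pd.DataFrame(records)
print(summary['n_outside_range'].value_counts().sort_index())
```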

Here is the distribution of these failures. As you can see, the majority of cases have just a single image presentation outside our acceptable range, and you can also see the large outlier that Alex flagged.

Issue30_hist_of_presentations_outside_range

Better visualized with a basic count plot: Issue30_countplot

Histogram of the diffs: Issue30_hist_of_diffs

Zoomed-in views of the low and high ends: Issue30_hist_of_diffs_lowEnd

Issue30_hist_of_diffs_highEnd

What is happening with the ones with a diff of 300? It looks like, when they occur, it's always around the same time in the stimulus... Issue30_difference_above_300

Here is the breakdown by rig. Note: of the impacted sessions, 44 are MESO.1, but that implicates 241 experiments. Issue30_countplot_by_rig

Issue30_number_outside_threshold_per_session_and_rig

I'm working on a better timeseries plot but here is what I have so far:

outside_threshold_by_date

Plots of the distributions for all sessions with 1 or more out-of-range image presentations are saved here: \allen\programs\braintv\workgroups\neuralcoding\Behavior\qc_plots, as are CSVs with the underlying data for all the plots.

I'll upload the analysis Jupyter notebook soon.

DowntonCrabby commented 2 years ago

Notes from the 10/25/21 QC ops meeting: the issue with the ~300 s diff between image presentations seems to occur at the ~5 min mark of the session, during the transition between the gray screen and the beginning of the image discrimination task.

Creating a metric to count the number of image presentations that fall outside the ±50 ms window, and recreating the plot Alex made (a histogram of the distribution of times between image presentations), is easy enough, but we need to determine meaningful thresholds before we implement them. Perhaps this is something the stakeholder scientific group should determine, since they know how these issues would disrupt their analyses.

@alexpiet do you have recommendations of who might be best to help establish thresholds for this issue?

alexpiet commented 2 years ago

The ~300 s issue is likely due to a weird bug where camstim sometimes logs an "omission" at the start of the 5-minute gray screen. So when the first real stimulus is presented ~5 minutes in (~300 seconds), it shows up as a ~300 s difference.
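A quick way to check whether the logged omission explains the ~300 s outliers is to drop omitted rows before taking the diff. This assumes the stimulus_presentations table carries a boolean `omitted` column, as it does in the Visual Behavior SDK releases discussed here:

```python
# `session` is the loaded session/experiment object used elsewhere in this thread.
presentations = session.stimulus_presentations
real = presentations[~presentations['omitted']]   # drop camstim-logged omissions
diffs = real['start_time'].diff().dropna()
# If the spurious omission is the cause, the ~300 s gap should no longer appear here.
print(diffs.max())
```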

As far as determining useful thresholds: Shawn, Marina?

DowntonCrabby commented 2 years ago

Based on a discussion between Marina and me, @matchings suggests setting the threshold at 5 for the number of image presentations outside the window.

I think we still need to investigate the 300-second difference to be sure it's just a weird bug and not a bigger issue, and also to decide whether some magnitudes of timing difference should count as a failure on their own (not just the number of times a presentation falls outside the range).
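A sketch of how the suggested threshold could be applied in the QC report; the threshold of 5 is just the value proposed above and is not final, and the ±50 ms count is recomputed inline so the snippet is self-contained:

```python
import pandas as pd

MAX_PRESENTATIONS_OUTSIDE_WINDOW = 5   # threshold suggested above; not final

def stimulus_interval_qc(stimulus_presentations: pd.DataFrame) -> dict:
    """Flag a session whose inter-image intervals stray outside 750 +/- 50 ms too often."""
    diffs = stimulus_presentations['start_time'].diff().dropna()
    n_outside = int(((diffs < 0.700) | (diffs > 0.800)).sum())
    return {
        'n_outside_window': n_outside,
        'qc_pass': n_outside <= MAX_PRESENTATIONS_OUTSIDE_WINDOW,
    }
```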

DowntonCrabby commented 2 years ago

Update from the 11/3/21 stakeholder meeting: we need a threshold for how large the window of time can be before a flag or failure should occur.

matchings commented 10 months ago

As Alex described above, this issue appears to be a result of omissions happening as the first stimulus in the change detection task. This should be accounted for in the stimulus presentations table, and thus is not an issue IMO. Closing.