Closed — alexpiet closed this issue 10 months ago
@alexpiet Do you have any estimate of how frequently this issue occurs? Were you systematically looking through a large group of data and this was the only case you found, or was it more of a one-off where you were looking at something else, thought something seemed weird, and encountered this issue?
I checked systematically for sessions where the duration between image presentations was less than 750 − 0.5 × (duration between ophys frames) ms.
About 74 sessions in the platform dataset have at least one image that fails that test.
Only this session has more than 5 that fail that test.
I'm not sure what the criterion for a QC fail should be, but I think this example session is clearly an outlier.
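The check described above could be sketched roughly as follows. This is a minimal illustration, not the actual analysis code: the function name, the assumed ~31 Hz ophys frame rate, and the input format (presentation start times in seconds) are all hypothetical stand-ins.

```python
EXPECTED_MS = 750.0
FRAME_PERIOD_MS = 1000.0 / 31.0  # assumed ophys frame rate of ~31 Hz (hypothetical)

def flag_short_intervals(start_times_s):
    """Return inter-image intervals (ms) shorter than the allowed minimum.

    start_times_s: list of image presentation start times in seconds.
    The minimum is 750 ms minus half an ophys frame period, per the
    test described above.
    """
    diffs_ms = [
        (b - a) * 1000.0
        for a, b in zip(start_times_s, start_times_s[1:])
    ]
    threshold_ms = EXPECTED_MS - 0.5 * FRAME_PERIOD_MS
    return [d for d in diffs_ms if d < threshold_ms]
```

A session "fails" the test if this returns a non-empty list; the outlier session above would return more than 5 entries.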
Thanks, this is super helpful information. Do you happen to have a list of those 74 sessions so we can begin looking at them and see if this is related to the overall stimulus timing or if it's another issue entirely? It would help us to be able to determine thresholds I think.
Sorry, I didn't save that list of 74 sessions. I should say it was 74 experiments, some of which were probably multiple experiments from the same mesoscope session.
To add more context, here is the stimulus timing report for the session Alex outlined above
To me this suggests that this issue is related to the stimulus timing being improper due to long frames, and is further evidence that we need a threshold for "Visual Stim Dropped Events", which we currently don't have.
I will try to find the other 74 experiments where this occurred and determine if the stimulus frame timing is similarly poor with those.
Okay here is a rather big update on when & how this issue occurs. As a caveat: I only looked at experiments that had been released, meaning they passed QC. This obviously biases the data here and doesn't give us a true sense of the frequency & severity of this issue overall, but only as it relates to the datasets we've already released.
I went through all the sessions in the ophys_session_table from the SDK, and here is what I found:
Here is the distribution of these failures. As you can see, the majority of cases have just a single image presentation outside our acceptable range; you can also see the large outlier that Alex flagged.
Better visualized with a basic count plot:
Histogram of the diffs:
Zoomed-in portions of the low and high ends:
What is happening with the sessions with a diff of 300? It looks like, when these occur, it's always around the same time in the stimulus...
Here is the breakdown by rig. Note: of the impacted sessions, 44 are MESO.1, but that implicates 241 experiments.
I'm working on a better timeseries plot but here is what I have so far:
Plots of all the distributions for sessions with 1 or more out-of-range image presentations are saved here: \allen\programs\braintv\workgroups\neuralcoding\Behavior\qc_plots, as are CSVs with the underlying data for all the plots.
I'll upload the analysis jupyter notebook soon
Notes from the 10/25/21 QC ops meeting: the issue with the 300 s diff between image presentations seems to occur at the ~5 min mark of the session, at the transition point between the gray screen and the start of the image discrimination task.
Creating a metric to count the number of image presentations that fall outside the ±50 ms window, and recreating the plot Alex made (histogram of the distribution of times between image presentations), is easy enough, but we need to determine meaningful thresholds before we implement them. Perhaps this is something the stakeholder scientific group should determine, as they know how these issues would disrupt their analyses.
@alexpiet do you have recommendations of who might be best to help establish thresholds for this issue?
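The metric and the histogram data described above could be sketched like this. All names here are hypothetical (this is not an existing QC-module API), and the 750 ms / ±50 ms / 10 ms-bin values are the numbers discussed in this thread, not settled thresholds:

```python
import collections

def interval_metrics(intervals_ms, expected_ms=750.0, tol_ms=50.0, bin_ms=10.0):
    """Count intervals outside expected_ms +/- tol_ms, and bin the
    intervals for a histogram like the one Alex posted.

    intervals_ms: inter-image-presentation durations in milliseconds.
    Returns (count_outside_window, {bin_left_edge_ms: count}).
    """
    n_outside = sum(1 for d in intervals_ms if abs(d - expected_ms) > tol_ms)
    hist = collections.Counter(int(d // bin_ms) * bin_ms for d in intervals_ms)
    return n_outside, dict(hist)
```

The count is the proposed QC metric; the binned dict is just the histogram data, ready to plot.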
The 300 s issue is likely due to a weird bug where camstim sometimes logs an "omission" at the start of the 5-minute gray screen. When the first real stimulus is then presented 5 minutes in (~300 seconds), it shows up as a ~300 s difference.
As far as determining useful thresholds: Shawn, Marina?
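If the explanation above is right, the spurious ~300 s diffs disappear once logged omissions are skipped when computing intervals. A minimal sketch, assuming a stand-in representation of the stimulus presentations table as (start_time, is_omitted) tuples (the real SDK table has more columns; everything here is illustrative):

```python
def real_image_intervals(presentations):
    """Compute inter-image intervals (s) while skipping logged omissions.

    presentations: list of (start_time_s, is_omitted) tuples, a
    hypothetical minimal stand-in for the stimulus presentations table.
    A spurious "omission" logged at the start of the 5-minute gray
    screen would otherwise show up as a ~300 s gap to the first real image.
    """
    times = [t for t, omitted in presentations if not omitted]
    return [b - a for a, b in zip(times, times[1:])]
```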
Based on a discussion with Marina, @matchings suggests setting the threshold at 5 for the number of image presentations outside the window.
I think we still need to investigate this 300 second difference to be sure it's just a weird bug and not a bigger issue, and also to decide whether some interval magnitudes should mean a failure on their own (not just the number of times the interval falls outside the range).
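Combining the two suggestions above (a count threshold of 5, plus an optional cap on how far any single interval can deviate), a candidate pass/fail rule might look like this. All parameter values beyond the ones stated in this thread are placeholders pending the stakeholder decision:

```python
def qc_inter_image_intervals(intervals_ms,
                             expected_ms=750.0,
                             tol_ms=50.0,
                             max_count=5,
                             max_abs_dev_ms=None):
    """Hypothetical QC rule: fail if more than max_count intervals fall
    outside expected_ms +/- tol_ms, or (optionally) if any single
    interval deviates by more than max_abs_dev_ms."""
    devs = [abs(d - expected_ms) for d in intervals_ms]
    n_outside = sum(1 for dev in devs if dev > tol_ms)
    if n_outside > max_count:
        return False
    if max_abs_dev_ms is not None and any(dev > max_abs_dev_ms for dev in devs):
        return False
    return True
```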
11/3/21 stakeholder meeting update: need threshold on how large the window of time can be before a flag or failure should occur.
As Alex described above, this issue appears to result from omissions being logged as the first stimulus in the change detection task. This should be accounted for in the stimulus presentations table, and thus is not an issue IMO. Closing.
Describe the bug I found a scientifica session in the release data that has high variability between image presentations. The duration between image presentations should be 750 ms. Some variability is expected due to dropped frames and timing alignment between the stimulus-generating computer and the monitor. This session has 54 images with a duration either longer than 800 ms or shorter than 700 ms.
CAM2P.3 OPHYS_3_images_A 2019-04-24 Slc17a7-IRES2-Cre V1, 375um depth OEID: 856938751 OSID: 856295914
This is what a typical session looks like:
This is the bad session:
To Reproduce
Solution I don't know what the threshold should be for how much variability we tolerate, but I feel this session is past it. Part of the QC report could generate the figure above, or at least count how many images fall outside a ±50 ms window around the expected inter-image duration.
Additional context SDK issue requesting the session gets flagged for removal: https://github.com/AllenInstitute/AllenSDK/issues/2252