wbwakeman closed this issue 3 years ago
@dougollerenshaw I wonder if this particular flavor of mismatch is caused by the trim_discontiguous_times function, since it happens right before that RuntimeError gets raised.
BehaviorOphysLimsApi.get_eye_tracking
--> sync_utilities.get_synchronized_frame_times
--> sync_utilities.trim_discontiguous_times
Thanks @njmei, that seems like a likely culprit. If I'm reading the function correctly, it looks like it'll trim off any sync pulses that come after a gap of 100 ms? If the videomon process hung mid-session, that could lead to a 100+ ms gap in the frame intervals.
Also, the visual_behavior_analysis function I've been using to read in sync files doesn't perform this operation, so that could explain the discrepancy.
Are there any realistic conditions where two discrete sync recordings are in the same file and separated by such a small interval? If not, I'd propose making the default for this function much larger. Or possibly passing an argument when loading visual_behavior sessions to override this small default?
@dougollerenshaw I don't think it would be a small interval. Currently, that function takes differences between subsequent timepoints and takes the median of those differences. Then it finds the first instance where the interval is 100x larger than the median interval and removes any sync timepoints after that interval.
So if the timepoints of the sync file (in ms) were: [1, 2, 3, 4, 5, 6, ...]
then in order for timepoints to be removed there would need to be a gap of 100 ms (since the median diff between subsequent times is 1).
If a lot of timepoints are being trimmed, the sync files would probably have to look something like: [1, 2, 3, 4, 5, 110, 111, 112, 113, 114, 115, 116, 117, etc..]
I would need to plot the timepoints in one of those example problem sync files to confirm this though.
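The trimming behavior described above can be sketched roughly like this (an illustrative reimplementation for discussion, not the actual AllenSDK code; the real function and its default threshold may differ):

```python
import numpy as np

def trim_discontiguous_times(times, threshold=100):
    """Sketch of the logic described above: drop all timepoints after
    the first interval that exceeds `threshold` times the median
    inter-timepoint interval."""
    times = np.asarray(times)
    deltas = np.diff(times)
    median_dt = np.median(deltas)
    big_gaps = np.where(deltas > threshold * median_dt)[0]
    if big_gaps.size == 0:
        return times
    # keep everything up to (and including) the timepoint before the gap
    return times[: big_gaps[0] + 1]

# With a median interval of 1 ms, a gap of >100 ms triggers trimming:
ts = [1, 2, 3, 4, 5, 110, 111, 112]
print(trim_discontiguous_times(ts))  # → [1 2 3 4 5]
```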
Tentatively assigning this to Taylor and me to take on later this week.
After debugging the code with @kschelonka (for two of the given IDs) through to the error that Doug had been receiving, we noticed something: there is a method called trim_discontiguous_times that takes the intervals and, if it finds a spike (any value that is 100 times the median, by default), deletes all data after that spike, even when the spike is only a transient artifact.
We are unsure if this is the actual intent of the method, as there is no docstring; however, after searching for other uses of this method in the codebase, we found this comment next to one of the calls:
# occasionally an extra set of frame times are acquired after the rest of
# the signals. We detect and remove these
frame_times = sync_utilities.trim_discontiguous_times(frame_times)
It would be helpful if someone could let us know if this is the desired effect: do we want to lop off all interval values after a spike, do we just want to remove the spike, or something else?
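To make the question concrete, here is a toy example (hypothetical timestamps, not from a real session) showing how truncating at the first large gap discards everything after a single transient stall:

```python
import numpy as np

# Hypothetical frame times (in ms) with a single ~150 ms transient stall
# mid-session; every frame after the stall is still valid data.
times = np.concatenate([np.arange(0, 100), np.arange(250, 1000)])

deltas = np.diff(times)
median_dt = np.median(deltas)                      # 1.0
spike = int(np.argmax(deltas > 100 * median_dt))   # index of the stall interval

# Truncating at the first large gap (the current behavior, as we read it)
# throws away roughly 88% of the session:
trimmed = times[: spike + 1]
print(len(times), len(trimmed))  # → 850 100
```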
Attached are two plots of the interval arrays:
@dougollerenshaw I think it's pretty clear from the above that there are transient artifacts in the data. The issue is that the function Taylor and Nick called out (trim_discontiguous_times) truncates the data at that point -- incorrectly, I believe. It would be good to get more guidance about these data: should we be attempting to remove these transient artifacts before the truncation, or does the function need to be updated to ensure that trimming only happens at the end?
If you are able to locate any sessions that show a good example of trim_discontiguous_times being used properly, that could also help us -- that is, an example where there was some kind of genuinely "discontiguous" recording.
@kschelonka and @Matyasz, thanks for digging deeper into this. I don't recall ever being involved with implementing the trim_discontiguous_times function. Is it possible to look at the git blame to see who implemented it, and whether the specific use case that warranted it still exists?
From what I understand about the sync program, I can see how there could have been a motivation to implement this. Without automation in place to stop sync at the end of a recording session, it's possible that an operator could inadvertently start a new session, the timestamps for which would be appended to the previous session's sync file. But the gap in that case would be very big - likely many tens of seconds - so it seems that this function could remain in place, but the default gap could be increased dramatically to avoid these false-positives.
Also, there is automation in place now that stops sync at the end of a session, further negating the need for this function.
@dougollerenshaw It's attributed to @NileGraddis, but I think it's a copy-paste since the commit is "715 move vsync trimming code to brain_observatory"
Ah, it's possible that this is necessary for ephys recordings. I think they may have more distinct sub-experiments that are run in quick succession with short breaks in between. I'll let @NileGraddis weigh in on the actual use case and whether or not lengthening the threshold for ID'ing distinct sessions would cause trouble.
I believe the other location we found this method used is in the ephys code - is this right Kat?
If this is the case, it would be no problem to keep the threshold as is for the ephys code and raise it for this code, given how the method already makes use of default arguments. 🙂
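Concretely, the override could look like the sketch below (an illustrative reimplementation with a hypothetical threshold parameter, not the real AllenSDK signature):

```python
import numpy as np

def trim_discontiguous_times(times, threshold=100):
    # Illustrative reimplementation, not the actual AllenSDK function:
    # drop everything after the first interval exceeding threshold x median.
    times = np.asarray(times)
    deltas = np.diff(times)
    gaps = np.where(deltas > threshold * np.median(deltas))[0]
    return times if gaps.size == 0 else times[: gaps[0] + 1]

times = [1, 2, 3, 4, 5, 250, 251, 252]

# ephys call sites could keep the current 100x default:
print(trim_discontiguous_times(times))                  # trims after the gap
# eye-tracking call sites could pass a far more conservative multiplier:
print(trim_discontiguous_times(times, threshold=1000))  # keeps all timepoints
```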
Yes, I did see it in the ecephys sync processing. We could certainly update the threshold as y'all mentioned, but I'm not sure what a good threshold would be...
Is there even a case where we need to worry about doing this clipping? If the sync ending is automated now, then we might not even need this for the eye tracking code anymore.
It could be worth keeping something like this as a just-to-be-safe check, with a 1000x multiplier or something, but yes, choosing a new threshold would be tough. So I'd be in favor of scrapping it from the eye tracking code if there aren't cases where we still want it.
Hey @dougollerenshaw, do you happen to have an example of an experiment ID where this issue doesn't occur? When proposing a full solution, it would be really helpful for us to see a case where this trim_discontiguous_times function is actually useful (assuming that such a case exists).
Hi @Matyasz, as far as I know, the trim_discontiguous_times function isn't being successfully applied to any ophys data. If you look at any session other than the ones that I identified as failures above, those will be examples of sessions where the function is applied but no gaps are large enough to trigger it splitting the sync file.
In order to complete this issue properly, we need to get a better set of experiments that exhibit the potential cases.
Acceptance criteria:
Based on information from item 5 in #1545, we need to understand the mismatch between the sync count returned on load and the count in the eye_tracking file. https://github.com/AllenInstitute/AllenSDK/issues/1545#issuecomment-666099404 experiment_ids: [850479305, 878358326, 969809655, 992868972, 994791582]
Sync mismatch on load, but actual sync count appears fine: these are sessions for which the SDK fails to load the eye_tracking attribute and reports a mismatch between the eye video frame count and the eye tracking sync count. However, when I independently checked the sync count (using https://github.com/AllenInstitute/visual_behavior_analysis/blob/62177356f8690f3565f5559ed4aafdeaf45b5f5c/visual_behavior/utilities.py#L367, reported as the EyeTracking_sync_pulse_count column in the tables shown here), the counts appear to match.
Need to identify the reason that the SDK appears to be mis-counting the sync TTL pulses for these 5 sessions. If the sync pulses were counted correctly, these 5 sessions would open without error.
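For reference, an independent pulse count typically amounts to counting rising edges on the digitized sync line. A minimal sketch of that idea (an assumed approach for illustration, not the visual_behavior_analysis implementation linked above):

```python
import numpy as np

def count_rising_edges(line):
    """Count 0 -> 1 transitions on a digitized sync line."""
    line = np.asarray(line).astype(int)
    return int(np.sum((line[1:] == 1) & (line[:-1] == 0)))

# Three pulses on a toy digital trace:
trace = [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
print(count_rising_edges(trace))  # → 3
```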
And here's the traceback when opening one of these experiments:
For these sessions, I propose that someone on the Technology team check to see why the SDK sync count appears to be incorrect.