wbwakeman closed this issue 3 years ago
@dougollerenshaw I wonder if this particular flavor of mismatch is caused by the trim_discontiguous_times function, since it happens right before that RuntimeError gets raised.
BehaviorOphysLimsApi.get_eye_tracking
--> sync_utilities.get_synchronized_frame_times
--> sync_utilities.trim_discontiguous_times
Thanks @njmei, that seems like a likely culprit. If I'm reading the function correctly, it looks like it'll trim off any sync pulses that come after a gap of 100 ms? If the videomon process hung mid-session, that could lead to a 100+ ms gap in the frame intervals.
Also, the visual_behavior_analysis function I've been using to read in sync files doesn't perform this operation, so that could explain the discrepancy.
Are there any realistic conditions where two discrete sync recordings are in the same file and separated by such a small interval? If not, I'd propose making the default for this function much larger. Or possibly passing an argument when loading visual_behavior sessions to override this small default?
@dougollerenshaw I don't think it would be a small interval. Currently, that function takes differences between subsequent timepoints and takes the median of those differences. Then it finds the first instance where the interval is 100x larger than the median interval and removes any sync timepoints after that interval.
So if the timepoints of the sync file (in ms) were: [1, 2, 3, 4, 5, 6, ...]
then in order for timepoints to be removed there would need to be a gap of 100 ms (since the median diff between subsequent times is 1).
If a lot of timepoints are being trimmed, the sync files would probably have to look something like: [1, 2, 3, 4, 5, 110, 111, 112, 113, 114, 115, 116, 117, etc..]
I would need to plot the timepoints in one of those example problem sync files to confirm this though.
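The trimming behavior described above can be sketched roughly like this (an illustrative reimplementation for discussion, not the actual AllenSDK code; the real function and its default threshold may differ):

```python
import numpy as np

def trim_discontiguous_times(times, threshold=100):
    """Sketch of the logic described above: drop all timepoints after
    the first interval that exceeds `threshold` times the median
    inter-timepoint interval."""
    times = np.asarray(times)
    deltas = np.diff(times)
    median_dt = np.median(deltas)
    big_gaps = np.where(deltas > threshold * median_dt)[0]
    if big_gaps.size == 0:
        return times
    # keep everything up to (and including) the timepoint before the gap
    return times[: big_gaps[0] + 1]

# With a median interval of 1 ms, a gap of >100 ms triggers trimming:
ts = [1, 2, 3, 4, 5, 110, 111, 112]
print(trim_discontiguous_times(ts))  # → [1 2 3 4 5]
```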
Tentatively assigning this to Taylor and me to take on later this week.
After debugging the code with @kschelonka (for two of the given IDs) through to the error that Doug had been receiving, we noticed something: there is a method called trim_discontiguous_times that takes the intervals and, if it finds a spike (any value that is 100 times the median, by default), deletes all data after that spike, even when the spike is only a transient artifact.
We are unsure if this is the actual intent of the method, as there is no docstring; however, after searching for other uses of this method in the codebase, we found this comment next to one of the calls:
# occasionally an extra set of frame times are acquired after the rest of
# the signals. We detect and remove these
frame_times = sync_utilities.trim_discontiguous_times(frame_times)
It would be helpful if someone could let us know if this is the desired effect: do we want to lop off all interval values after a spike, do we just want to remove the spike, or something else?
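To make the question concrete, here is a toy example (hypothetical timestamps, not from a real session) showing how truncating at the first large gap discards everything after a single transient stall:

```python
import numpy as np

# Hypothetical frame times (in ms) with a single ~150 ms transient stall
# mid-session; every frame after the stall is still valid data.
times = np.concatenate([np.arange(0, 100), np.arange(250, 1000)])

deltas = np.diff(times)
median_dt = np.median(deltas)                      # 1.0
spike = int(np.argmax(deltas > 100 * median_dt))   # index of the stall interval

# Truncating at the first large gap (the current behavior, as we read it)
# throws away roughly 88% of the session:
trimmed = times[: spike + 1]
print(len(times), len(trimmed))  # → 850 100
```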
Attached are two plots of the interval arrays:
@dougollerenshaw I think it's pretty clear from the above that there are transient artifacts in the data. The issue is that the function Taylor and Nick called out (trim_discontiguous_times) truncates the data at that point -- incorrectly, I believe. It would be good to get more guidance about these data: should we be attempting to remove these transient artifacts before the truncation, or does the function need to be updated to ensure that trimming only happens at the end?
If you are able to locate any sessions that show a good example of trim_discontiguous_times being used properly, that could also help us -- that is, an example where there was some kind of genuinely "discontiguous" recording.
@kschelonka and @Matyasz, thanks for digging deeper into this. I don't recall ever being involved with implementing the trim_discontiguous_times function. Is it possible to look at the git blame to see who implemented it, and whether the specific use case that warranted it still exists?
From what I understand about the sync program, I can see how there could have been a motivation to implement this. Without automation in place to stop sync at the end of a recording session, it's possible that an operator could inadvertently start a new session, the timestamps for which would be appended to the previous session's sync file. But the gap in that case would be very big - likely many tens of seconds - so it seems that this function could remain in place, but the default gap could be increased dramatically to avoid these false-positives.
Also, there is automation in place now that stops sync at the end of a session, further negating the need for this function.
@dougollerenshaw It's attributed to @NileGraddis, but I think it's a copy-paste since the commit is "715 move vsync trimming code to brain_observatory"
Ah, it's possible that this is necessary for ephys recordings. I think they may have more distinct sub-experiments that are run in quick succession with short breaks in between. I'll let @NileGraddis weigh in on the actual use case and whether or not lengthening the threshold for ID'ing distinct sessions would cause trouble.
I believe the other location we found this method used is in the ephys code - is this right Kat?
If this is the case, it would be no problem to keep the threshold as is for the ephys code and raise it for this code, given how the method already makes use of default arguments. 🙂
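Concretely, the override could look like the sketch below (an illustrative reimplementation with a hypothetical threshold parameter, not the real AllenSDK signature):

```python
import numpy as np

def trim_discontiguous_times(times, threshold=100):
    # Illustrative reimplementation, not the actual AllenSDK function:
    # drop everything after the first interval exceeding threshold x median.
    times = np.asarray(times)
    deltas = np.diff(times)
    gaps = np.where(deltas > threshold * np.median(deltas))[0]
    return times if gaps.size == 0 else times[: gaps[0] + 1]

times = [1, 2, 3, 4, 5, 250, 251, 252]

# ephys call sites could keep the current 100x default:
print(trim_discontiguous_times(times))                  # trims after the gap
# eye-tracking call sites could pass a far more conservative multiplier:
print(trim_discontiguous_times(times, threshold=1000))  # keeps all timepoints
```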
Yes, I did see it in the ecephys sync processing. We could certainly update the threshold as y'all mentioned, but I'm not sure what a good threshold would be...
Is there even a case where we need to worry about doing this clipping? If the sync ending is automated now, then we might not even need this for the eye tracking code anymore.
It could be worth keeping something like this as a just-to-be-safe check, with a 1000x multiplier or something, but yes, choosing a new threshold would be tough. So I'd be in favor of scrapping it from the eye tracking code if there aren't cases where we still want it.
Hey @dougollerenshaw, do you happen to have an example of an experiment ID where this issue doesn't occur? When proposing a full solution, it would be really helpful for us to see a case where this trim_discontiguous_times function is actually useful (assuming that such a case exists).
Hi @Matyasz, as far as I know, the trim_discontiguous_times function isn't being successfully applied to any ophys data. If you look at any session other than the ones that I identified as failures above, those will be examples of sessions where the function is applied but no gaps are large enough to trigger it splitting the sync file.
In order to complete this issue properly, we need to get a better set of experiments that exhibit the potential cases.
Acceptance criteria:
Based on information from item 5 in #1545, we need to understand the mismatch between the sync count returned on load and the count in the eye_tracking file. https://github.com/AllenInstitute/AllenSDK/issues/1545#issuecomment-666099404 experiment_ids: [850479305, 878358326, 969809655, 992868972, 994791582]
Sync mismatch on load, but actual sync count appears fine: these are sessions for which the SDK fails to load the eye_tracking attribute and reports a mismatch between the eye video frame count and the eye tracking sync count. However, when I independently checked the sync count (using https://github.com/AllenInstitute/visual_behavior_analysis/blob/62177356f8690f3565f5559ed4aafdeaf45b5f5c/visual_behavior/utilities.py#L367, reported as the EyeTracking_sync_pulse_count column in the tables shown here), the counts appear to match.
Need to identify the reason that the SDK appears to be mis-counting the sync TTL pulses for these 5 sessions. If the sync pulses were counted correctly, these 5 sessions would open without error.
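For reference, an independent pulse count typically amounts to counting rising edges on the digitized sync line. A minimal sketch of that idea (an assumed approach for illustration, not the visual_behavior_analysis implementation linked above):

```python
import numpy as np

def count_rising_edges(line):
    """Count 0 -> 1 transitions on a digitized sync line."""
    line = np.asarray(line).astype(int)
    return int(np.sum((line[1:] == 1) & (line[:-1] == 0)))

# Three pulses on a toy digital trace:
trace = [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
print(count_rising_edges(trace))  # → 3
```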
And here's the traceback when opening one of these experiments:
For these sessions, I propose that someone on the Technology team check to see why the SDK sync count appears to be incorrect.