AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/
Other
343 stars 149 forks source link

Behavior Ophys Experiment has NWB error ('EyeTracking') #2250

Closed alexpiet closed 2 years ago

alexpiet commented 3 years ago

Describe the bug This may be the same basic problem as https://github.com/AllenInstitute/AllenSDK/issues/2239 but for the eye tracking attributes. @danielsf

When loading a small number of ophys_experiment_ids, I get 2 warnings that This ophys session '806456687' has no eye tracking data. The session loads fine, but then session.eye_tracking throws an AttrributeError. I think this is currently impacting ~10 experiments in the platform paper dataset?

(1) Most importantly, if the eye_tracking data isn't available, I think we should return an empty dataframe, rather than throwing an attribution error. This would be consistent with the resolution in #2239 (2) Very minor, the warning says "This ophys session", but the number it displays is actually an ophys_experiment_id.

To Reproduce

from allensdk.brain_observatory.behavior.behavior_project_cache import VisualBehaviorOphysProjectCache as bpc cache_dir = loading.get_platform_analysis_cache_dir() # this is from visual_behavior_analysis cache = bpc.from_s3_cache(cache_dir=cache_dir) session = cache.get_behavior_ophys_experiment(806456687) # throws 4 NWB errors session.eye_tracking # throws attribution error

Expected behavior I expect the session object to be robust to missing data, consistent with missing data from passive sessions.

Actual Behavior Screen Shot 2021-10-08 at 9 39 11 AM

Environment (please complete the following information):

danielsf commented 3 years ago

@alexpiet This is an easy fix. I should be able to get it done today.

Regarding the warning: should we still be emitting the warning, or is an empty eye-tracking dataframe expected (much in the same way we expect empty lick and reward datasets for passive sessions)?

alexpiet commented 3 years ago

Thanks @danielsf

By the way, here are a few more ophys_experiment_ids I encountered if you want more to test on: 795953296, 833631914, 806456687

An empty eye-tracking dataframe should be much, much, less frequent than empty lick and reward datasets. An empty eye-tracking dataframe means that data collection failed for some reason, whereas empty lick and reward datasets is the intended part of data collection. I guess that means a warning is warranted, so lets leave the warning unless we get input from more people.

danielsf commented 3 years ago

@alexpiet At the risk of exploding this discussion:

On a whim, I tried to instantiate that session from_lims. I got the following error

    raise RuntimeError(f"Error! The number of sync file frame times "
RuntimeError: Error! The number of sync file frame times (135915) does not match the number of eye tracking frames (135917)!

Note: I got this error when I tried to instantiate the entire session, not just the eye_tracking data frame. I assume the right thing to do is make this error a warning and return an empty data frame?

alexpiet commented 3 years ago

That makes sense to me, however I think we are getting beyond my scope of decision making! Lets get input from @wbwakeman and @matchings

I know there were a lot of issues about the length of timestamps in the sync file, and this might be related to those issues? However I think the current issue is only impacting a small number of experiments (3), so we should tread lightly

danielsf commented 3 years ago

Sounds good. To be explicit, this code

from allensdk.brain_observatory.behavior.behavior_ophys_experiment import (
    BehaviorOphysExperiment)

exp_id = 806456687
exp = BehaviorOphysExperiment.from_lims(exp_id)

will fail with that error I mentioned above.

danielsf commented 3 years ago

@alexpiet Ignoring the from_lims issue, I've got a PR that should fix your problems with the NWB files. If you want to give it a spin, it is this branch

https://github.com/AllenInstitute/AllenSDK/tree/ticket/2250/missing/data/obj

matchings commented 3 years ago

i thought that we had some logic to truncate the last few timepoints if there was a mismatch of <5 frames (or some such threshold). These small mismatches occur quite often so i am confused why only these few sessions are this issue. But maybe we only applied that threshold if the timestamps were longer than the data, and this case is the other way around.

in any case, returning an empty dataframe is what i would prefer as well. a warning saying why its empty is also good.

danielsf commented 3 years ago

Here is the code you are thinking of (it occurs right before that error)

    # If n_sync exceeds n_eye_frames by <= 15,
    # just trim the excess sync pulses from the end
    # of the timestamps array.
    #
    # This solution was discussed in
    # https://github.com/AllenInstitute/AllenSDK/issues/1545

    if n_sync > n_eye_frames and n_sync <= n_eye_frames+15:
        frame_times = frame_times[:n_eye_frames]
        n_sync = len(frame_times)

It only comes into effect if n_sync > n_eye_tracking. If I recall the conversation from January correctly, the conclusion was that, if n_eye_tracking > n_sync, then there is no way to know the correct thing to do, since the sync timestamps are supposed to be the ultimate arbiter of truth.

matchings commented 3 years ago

ok got it. thanks for posting that bit of code.

so then the outcome will be that these 3 experiments with more sync times than data points will return an empty table for eye tracking? thats good with me.

danielsf commented 3 years ago

This should now be fixed. Branch rc/2.13.2 has the fix until we get around to actually releasing v2.13.2.