AllenInstitute / visual_behavior_analysis

Python package for analyzing behavioral data for Brain Observatory: Visual Behavior
Other
22 stars 6 forks source link

ophys timestamp refactoring failing #645

Open dougollerenshaw opened 4 years ago

dougollerenshaw commented 4 years ago

I noticed a discrepancy in the ophys_timestamp count and the length of the dff_traces for all of the experiments associated with ophys_session_id 845842114 (ophys_experiment_ids = [846546341, 846546326, 846546328, 846546335, 846546339, 846546337, 846546331, 846546333]).

In all cases, the timestamp vector is exactly 4000 frames longer than the dff_frames vector (timestamp length = 52274, dff_traces length = 48274).

I initially thought it was a data issue, but it turns out that the AllenSDK is returning the correct ophys_timestamps length. The refactoring in the VBA BehaviorOphysDataset seems to be introducing the error for this session.

Details (for one of the eight experiments listed above): Getting the timestamps directly from the SDK yields the correct timestamp length:

from allensdk.brain_observatory.behavior.behavior_ophys_session import BehaviorOphysSession
experiment_id = 846546339
session = BehaviorOphysSession.from_lims(experiment_id)
len(session.ophys_timestamps)

Returns: 48274

But getting the timestamps through VBA returns the incorrect value:

import visual_behavior.data_access.loading as loading
dataset = loading.get_ophys_dataset(experiment_id)
len(dataset.ophys_timestamps)

Returns: 52274

The incorrect value seems to be being overwritten here: https://github.com/AllenInstitute/visual_behavior_analysis/blob/f760874f5a0273cd30eb204dbf874563af34c229/visual_behavior/data_access/loading.py#L361

Is it possible that this step of overwriting the ophys_timestamps was necessary before the SDK was set up to deal with mesoscope timestamps? Is it still necessary to do this?

dougollerenshaw commented 4 years ago

More info: If I disable this line of code, I get the correct answer from VBA.

I get the same incorrect answer for all 8 experiments associated with this session using the code linked above.

I have not been able to find any other experiments for which this is a problem.

For a different experiment/session (session_id 954954402, experiment_id 958527481, also from mesoscope), I get the correct answer (48332) with or without this line of code.

dougollerenshaw commented 4 years ago

Here's what I believe to be true. @matchings, can you chime in on this?

So I'd like to propose that we simply remove this processing step and use the timestamps that come directly from the SDK. A PR is coming.

dougollerenshaw commented 4 years ago

After more digging, I have to retract my last comment above. The story is obviously more complicated. For experiment_id 958527481, I noted above that whether or not VBA overwrote the timestamps, the length was correct. That is true, but the underlying values are different.

Here's what I get using the current code that overwrites mesoscope timestamps: image

But if I comment out lines 359-365 here: https://github.com/AllenInstitute/visual_behavior_analysis/blob/f760874f5a0273cd30eb204dbf874563af34c229/visual_behavior/data_access/loading.py#L359 and lines 352-355 here: https://github.com/AllenInstitute/visual_behavior_analysis/blob/f760874f5a0273cd30eb204dbf874563af34c229/visual_behavior/ophys/io/convert_level_1_to_level_2.py#L352

I get the same length, but different underlying values: image

So, the initial point remains that something is going wrong for the 8 experiments associated with ophys_session_id 845842114, but I no longer have confidence that the SDK is returning correct values.