Closed wbwakeman closed 3 years ago
LIMS part collects all the inputs: http://stash.corp.alleninstitute.org/projects/TECH/repos/lims/browse/app/strategies/cam_nwb_strategy.rb
The module: http://stash.corp.alleninstitute.org/projects/INF/repos/lims2_modules/browse/CAM/cam_nwb/run_module.rb
Along with other files in: http://stash.corp.alleninstitute.org/projects/INF/repos/lims2_modules/browse/CAM/cam_nwb
Here is my review of what prevents the current pipeline from producing NWB2 files from existing Visual Coding data.
The argschema input failures: When run with the existing data that I could find, using the input.json that existed for another job in this queue, we are missing:
{
"session_data": {
"behavior_session_id": [
"Missing data for required field."
],
"foraging_id": [
"Missing data for required field."
],
"events_file": [
"Missing data for required field."
],
"ophys_session_id": [
"Missing data for required field."
],
"eye_tracking_filepath": [
"Missing data for required field."
],
"imaging_plane_group": [
"Missing data for required field."
],
"plane_group_count": [
"Missing data for required field."
],
"eye_tracking_rig_geometry": [
"Missing data for required field."
],
"segmentation_mask_image_file": [
"Unknown field."
]
}
}
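To make the validation output above easier to triage, the nested error dictionary can be flattened into lists of missing and unknown fields. This is only a minimal sketch: `flatten_errors` is a hypothetical helper (not part of argschema or the AllenSDK), and the `errors` dictionary is an abbreviated copy of the output shown above.

```python
def flatten_errors(errors, prefix=""):
    """Yield (dotted_field_path, message) pairs from a nested error dict."""
    for key, value in errors.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested schema: recurse and extend the dotted path.
            yield from flatten_errors(value, path)
        else:
            # Leaf: a list of error messages for this field.
            for message in value:
                yield path, message


# Abbreviated copy of the validation output reported above.
errors = {
    "session_data": {
        "behavior_session_id": ["Missing data for required field."],
        "events_file": ["Missing data for required field."],
        "segmentation_mask_image_file": ["Unknown field."],
    }
}

missing = [p for p, m in flatten_errors(errors) if m.startswith("Missing data")]
unknown = [p for p, m in flatten_errors(errors) if m == "Unknown field."]
print("missing:", missing)
print("unknown:", unknown)
```

Splitting the two groups separates fields the old data cannot supply from fields the schema no longer accepts.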
Errors also occur in the asserts in /behavior/write_nwb/__main__.py that check that the json_session is the same as both the lims_session and nwb_session:
Lims_session:
AssertionError: _average_projection on <allensdk.brain_observatory.behavior.behavior_ophys_experiment.BehaviorOphysExperiment object at 0x7f39bf18c470> did not equal _average_projection on <allensdk.brain_observatory.behavior.behavior_ophys_experiment.BehaviorOphysExperiment object at 0x7f39bf18cc88>
Nwb_session:
KeyError: 'behavior_session_id'
In allensdk/brain_observatory/behavior/metadata/behavior_metadata.py:
get_task_parameters tries to reference the behavior column of the input data, but there is none.
date_of_acquisition calls get_behavior_session_id, which also throws a KeyError when trying to access self.data['behavior_session_id'].
behavior_session_uuid also calls get_behavior_session_id.
In allensdk/brain_observatory/behavior/metadata/behavior_ophys_metadata.py: The following three property methods throw errors:
KeyError: 'imaging_plane_group' in the BehaviorSession class when calling to_dict from the metadata. Specifically, the KeyError occurs in BehaviorOphysJsonExtractor.
KeyError: 'plane_group_count', also in BehaviorOphysJsonExtractor.
KeyError: 'ophys_session_id', also in BehaviorOphysJsonExtractor.
In allensdk/brain_observatory/behavior/session_apis/data_io/behavior_nwb_api.py:
_add_stimulus_templates causes an error at nwb.add_stimulus_template(…):
File "/allen/aibs/technology/conda/shared/miniconda/envs/asdk_dev/lib/python3.6/site-packages/allensdk/brain_observatory/nwb/__init__.py", line 441, in add_stimulus_template
for image_name, image_data in stimulus_template.items():
AttributeError: 'NoneType' object has no attribute 'items'
The other stimulus methods called in this method also give similar errors
In allensdk/brain_observatory/behavior/session_apis/data_io/behavior_ophys_nwb_api.py:
nwb.add_running_acquisition_to_nwbfile goes through the BehaviorOphysNwbApi save method and leads to
File "/allen/aibs/technology/conda/shared/miniconda/envs/asdk_dev/lib/python3.6/site-packages/allensdk/brain_observatory/nwb/__init__.py", line 341, in add_running_acquisition_to_nwbfile
data=running_acquisition_df['dx'].values,
TypeError: 'NoneType' object is not subscriptable
set_omitted_stop_time(stimulus_table=session_object.stimulus_presentations) leads to KeyError: 'omitted' from stimulus_table['omitted'].
add_stimulus_presentations has the line stimulus_name_column = get_column_name(stimulus_table.columns, possible_names), which leads to the error KeyError: 'Table expected one name column in intersection, found: []'.
nwb.add_trials(nwbfile, session_object.trials, TRIAL_COLUMN_DESCRIPTION_DICT) tries to use trials[['start_time', 'stop_time']], which causes KeyError: "None of [Index(['start_time', 'stop_time'], dtype='object')] are in the [columns]".
nwb.add_task_parameters(nwbfile, session_object.task_parameters) leads to
TypeError: TypeMap.__get_cls_dict.<locals>.__init__: missing argument 'stimulus_distribution', missing argument 'task', missing argument 'reward_volume', missing argument 'n_stimulus_frames', missing argument 'auto_reward_volume', missing argument 'session_type', missing argument 'response_window_sec', missing argument 'blank_duration_sec', missing argument 'stimulus_duration_sec', missing argument 'omitted_flash_fraction', missing argument 'stimulus'
self.add_events(nwbfile=nwbfile, events=session_object.events) leads to KeyError: 'events' when it tries events['events'].
In allensdk/brain_observatory/behavior/session_apis/data_transforms/behavior_data_transforms.py:
In get_licks, lick_frames = (data["items"]["behavior"]["lick_sensors"][0] gives a KeyError: 'behavior'.
get_running_speed calls get_running_acquisition_df, which calls get_running_df, which tries data["items"]["behavior"]["encoders"][0]["vsig"] and fails with a KeyError: 'behavior'.
The get_stimulus_presentations method of the BehaviorDataTransforms class calls the get_stimulus_presentations function from /allensdk/brain_observatory/behavior/stimulus_processing/__init__.py, which calls get_visual_stimuli_df from the same file and gives a KeyError: 'behavior' when it tries stimuli = data['items']['behavior']['stimuli'].
Similarly, the get_stimulus_templates class method calls another method of the same name and tries pkl_stimuli = pkl['items']['behavior']['stimuli'], which leads to a KeyError: 'behavior'.
The get_trials method also fails.
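All of the failures in this file share one cause: the code indexes data["items"]["behavior"] unconditionally. A defensive lookup that returns a default instead of raising might look like the sketch below. `get_nested` is a hypothetical helper, not an AllenSDK function, and the two dictionaries only mimic the top-level shape of the old and new stimulus pickles with placeholder contents.

```python
def get_nested(data, path, default=None):
    """Walk a chain of dict keys, returning `default` if any key is absent."""
    current = data
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Placeholder structures: only the key layout matters here.
old_pkl = {"items": {"sync_square": {}, "foraging": {}, "control_stream": {}}}
new_pkl = {"items": {"behavior": {"stimuli": {"placeholder": {}}}}}

old_stimuli = get_nested(old_pkl, ["items", "behavior", "stimuli"])
new_stimuli = get_nested(new_pkl, ["items", "behavior", "stimuli"])
print(old_stimuli)  # no 'behavior' key in the old layout, so the default is returned
print(new_stimuli)
```

Whether returning a default is appropriate depends on the caller; for a non-behavior session it would let the writer skip a stream rather than abort.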
In allensdk/brain_observatory/behavior/session_apis/data_transforms/behavior_ophys_data_transforms.py:
Same KeyError: 'imaging_plane_group' as earlier in allensdk/brain_observatory/behavior/session_apis/data_io/behavior_ophys_json_api.py.
get_raw_dff_data tries to read the roi_names field from the DFF h5 file, but it does not have one. It only has a data field.
get_dff_traces calls get_raw_dff_data, which fails as mentioned above.
get_rewards tries pd.DataFrame(data["items"]["behavior"]["trial_log"]), which fails with KeyError: 'behavior'.
In get_corrected_fluorescence_traces, the following is raised:
if not np.in1d(cell_roi_id_list, corrected_fluorescence_roi_id).all():
raise RuntimeError("cell_specimen_table contains ROI IDs "
"not present in corrected_fluorescence_traces")
get_motion_correction fails because the motion correction .csv does not have any columns named x or y. In fact, it looks like it may not have any column names at all. This is what I get from printing the head of the data frame:
0 -3.15372 1.81918 -3.15372.1 1.81918.1 0.1 0.2 0.3 0.493704
0 1 -5.171610 1.40385 -5.171610 1.40385 0 0 0 0.469867
1 2 -4.842250 1.43501 -4.842250 1.43501 0 0 0 0.561573
2 3 -2.241590 1.67159 -2.241590 1.67159 0 0 0 0.514286
3 4 -0.356072 1.70972 -0.356072 1.70972 0 0 0 0.554047
4 5 -1.032390 1.08631 -1.032390 1.08631 0 0 0 0.517384
get_events also failed; there is no events_file in the input.json.
In allensdk/brain_observatory/nwb/__init__.py:
add_running_speed_to_nwbfile fails because there is no speed column in the running_speed data frame passed in (it comes from session_object.running_speed, which is actually empty here).
In allensdk/brain_observatory/sync_dataset.py:
get_edges fails because permissive is set to False, and raises the error KeyError: "none of ['lick_times', 'lick_sensor'] were found in this dataset's line labels"
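The permissive behavior described above can be illustrated with a small stand-alone lookup. This is a sketch of the general pattern only, not the code in sync_dataset.py: `find_line` and the example line labels are hypothetical.

```python
def find_line(line_labels, candidates, permissive=False):
    """Return the first candidate present in line_labels.

    If none match, return None when permissive, otherwise raise KeyError.
    """
    for name in candidates:
        if name in line_labels:
            return name
    if permissive:
        return None
    raise KeyError(
        f"none of {candidates} were found in this dataset's line labels"
    )


# Hypothetical labels for a sync file recorded without a lick sensor.
labels = ["vsync_stim", "stim_photodiode", "2p_vsync"]

lick_line = find_line(labels, ["lick_times", "lick_sensor"], permissive=True)
print(lick_line)
```

With permissive=True the caller gets None and can decide to skip lick data, which is the behavior a non-behavior session would need.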
It appears that many of these failures might just come from key names having changed between Visual Coding and Visual Behavior. Is that your impression?
Yeah, it looks to me like a combination of things changing names and the input data being organized totally differently. For example, there are no column names in the motion correction file, no event detection file at all, and no roi_names in the dff file.
Here are my findings on the differences between the old and new data formats
data file | differences |
---|---|
events file | file doesn't exist |
eye tracking file | file doesn't exist |
eye gaze mapping | file doesn't exist |
dff file | Old .h5 keys: ['data']. New .h5 keys: ['data', 'num_small_baseline_frames', 'roi_names', 'sigma_dff'] |
rigid motion transform file | Old data has no column names, but the mapping is ["index", "x", "y", "a", "b", "c", "d", "e", "f"] and can be found here. New data columns: ['framenumber', 'x', 'y', 'x_pre_clip', 'y_pre_clip', 'correlation']. These still do not match up, but the only error I encountered when running the pipeline was related to the 'x' and 'y' columns. The lack of other columns may not pose a problem. |
Behavior stimulus file | Old data keys: ['config', 'config_path', 'di', 'do', 'droppedframes', 'fps', 'intervalsms', 'items', 'lims_config', 'miniwindow', 'monitor', 'monitor_brightness', 'monitor_contrast', 'movie_output', 'ni_config', 'nidaq_tasks', 'params', 'platform', 'post_blank_sec', 'pre_blank_sec', 'primary_stimulus', 'script', 'scripttext', 'showmouse', 'start_time', 'startdatetime', 'stimuli', 'stop_time', 'stopdatetime', 'sweepstim_text', 'syncpulse', 'syncpulselines', 'syncpulseport', 'syncsqr', 'syncsqrloc', 'syncsqrsize', 'total_frames', 'trigger_delay_sec', 'unpickleable', 'vsynccount', 'wheight', 'window', 'wwidth']. Old data['items'] keys: ['sync_square', 'foraging', 'control_stream']. The major issue encountered with this file is the fact that data['items'] has no behavior key. New data keys: ['comp_id', 'items', 'platform_info', 'rig_id', 'script', 'session_uuid', 'start_time', 'stop_time', 'threads', 'unpickleable']. New data['items']['behavior'] keys (behavior is the only key under items): ['ai', 'ao', 'auto_update', 'behavior_path', 'behavior_text', 'cl_params', 'config', 'config_path', 'custom_output_path', 'encoders', 'intervalsms', 'items', 'lick_sensors', 'nidaq_tasks', 'omitted_flash_frame_log', 'params', 'rewards', 'rewards_dispensed', 'stimuli', 'sync_pulse', 'trial_count', 'trial_log', 'unpickleable', 'update_count', 'volume_dispensed', 'window'] |
sync file | looks good |
demixed traces file | looks good |
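
For the rigid motion transform file, one possible workaround is to supply the column names at read time instead of letting the first data row be consumed as a header. The sketch below uses only the standard library and assumes the old file is comma-delimited with exactly the nine columns in the mapping from the table above. `read_headerless_motion_csv` is a hypothetical helper, and the two sample rows are reconstructed from the data frame head printed earlier in this thread.

```python
import csv
import io

# Column mapping for the old, headerless file, as listed in the table above.
OLD_COLUMNS = ["index", "x", "y", "a", "b", "c", "d", "e", "f"]


def read_headerless_motion_csv(handle, columns=OLD_COLUMNS):
    """Parse a headerless CSV into a list of dicts keyed by `columns`."""
    rows = []
    for record in csv.reader(handle):
        if not record:
            continue  # skip blank lines
        if len(record) != len(columns):
            raise ValueError(
                f"expected {len(columns)} fields, found {len(record)}"
            )
        rows.append({name: float(value) for name, value in zip(columns, record)})
    return rows


# Two rows reconstructed from the printed head (assumed comma-delimited).
sample = io.StringIO(
    "0,-3.15372,1.81918,-3.15372,1.81918,0,0,0,0.493704\n"
    "1,-5.17161,1.40385,-5.17161,1.40385,0,0,0,0.469867\n"
)

rows = read_headerless_motion_csv(sample)
xy = [(row["x"], row["y"]) for row in rows]
print(xy)
```

With pandas, the equivalent would be pd.read_csv(path, header=None, names=OLD_COLUMNS), which keeps the first row as data and yields the x and y columns that get_motion_correction expects.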
This looks promising.
For the "behavior stimulus file", this Visual Coding data does not have 'behavior', so (theoretically) we just don't need any information under data['items']['behavior'].
For the others, we should be able to get them by processing through the pipeline.
This work will support creation and release of NWB2 files for 4 different Visual Coding projects (i.e. no behavior streams for the data):
For Visual Coding targeted experiment 766270826, manually create an NWB2 (schema version 2.2.5) file for the experiment using PyNWB v1.4.
Document the inputs and how it may be necessary to "modify or circumvent" the allensdk.brain_observatory.behavior.write_nwb module for this non-behavior session.
Data files for this experiment exist at:
Tasks:
This is a timeboxed effort for a maximum of 5 days.