catalystneuro / visual-coding-to-nwb-v2

Conversion of old v1 NWB files from https://registry.opendata.aws/allen-brain-observatory/ to v2 so they can be uploaded to DANDI.
MIT License

Verify consistency of neurodata object presence #11

Closed CodyCBakerPhD closed 11 months ago

CodyCBakerPhD commented 12 months ago

Evaluate how consistent various, especially behavioral, neurodata objects are across the datasets

CodyCBakerPhD commented 11 months ago

Reference script

```python
from pathlib import Path
from collections import defaultdict
from typing import Union, List

import h5py
from natsort.natsort import natsorted

# Group names that are not descended into at all
SKIP_KEYS = ["timeseries", "corrected", "original"]
# Group names whose parent path is recorded instead of every child dataset
ABBREVIATE_KEYS = ["imaging_plane_1"]


def _recurse_structure(dataset_list: list, group: h5py.Group, key: Union[str, None]):
    if isinstance(group[key], h5py.Group):
        next_group = group[key]
        for next_key in next_group:
            if next_key in SKIP_KEYS:
                continue
            if next_key in ABBREVIATE_KEYS:
                dataset_list.append(next_group.name)
                continue
            _recurse_structure(dataset_list=dataset_list, group=next_group, key=next_key)
    else:
        dataset_list.append(group[key].name)  # Is the full path within the HDF5 file, including '/'


def _find_all_datasets(file: h5py.File) -> List[str]:
    dataset_list = list()
    for key in file:
        _recurse_structure(dataset_list=dataset_list, group=file, key=key)
    return dataset_list


base_path = Path("G:/visual-coding/ophys_experiment_data")
v1_nwbfiles = list(base_path.rglob("*.nwb"))

datasets_per_file = defaultdict(list)
for v1_nwbfile in v1_nwbfiles:
    # Open read-only and close each file as soon as its structure has been listed
    with h5py.File(name=v1_nwbfile, mode="r") as file:
        datasets_per_file[v1_nwbfile] = _find_all_datasets(file=file)

all_datasets = list()
for datasets in datasets_per_file.values():
    all_datasets.extend(datasets)
unique_datasets = natsorted(list(set(all_datasets)))

# import json
# print(json.dumps(unique_datasets, indent=4))

# Count how many files contain each unique path
unique_counts = {key: 0 for key in unique_datasets}
for datasets in datasets_per_file.values():
    for dataset in datasets:
        unique_counts[dataset] += 1
```
CodyCBakerPhD commented 11 months ago

Results

```python
{
    "/file_create_date": 1518,
    "/general/For more information": 1518,
    "/general/devices/2-photon microscope": 1518,
    "/general/devices/display monitor": 1518,
    "/general/devices/eye-tracking camera": 1518,
    "/general/experiment_container_id": 1518,
    "/general/fov": 1518,
    "/general/generated_by": 1518,
    "/general/institution": 1518,
    "/general/ophys_experiment_id": 1518,
    "/general/ophys_experiment_name": 1518,
    "/general/optophysiology": 1518,
    "/general/pixel_size": 1518,
    "/general/session_id": 1518,
    "/general/session_type": 1518,
    "/general/specimen_name": 1518,
    "/general/subject/age": 1518,
    "/general/subject/description": 1518,
    "/general/subject/genotype": 1518,
    "/general/subject/sex": 1518,
    "/general/subject/species": 1518,
    "/general/subject/subject_id": 1518,
    "/general/targeted_structure": 1518,
    "/identifier": 1518,
    "/nwb_version": 1518,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed/data": 1518,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed/num_samples": 1518,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed/timestamps": 1518,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed_index/data": 150,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed_index/indexed_timeseries/data": 150,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed_index/indexed_timeseries/num_samples": 150,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed_index/indexed_timeseries/timestamps": 150,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed_index/indexed_timeseries_path": 150,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed_index/num_samples": 150,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed_index/timestamps": 150,
    "/processing/brain_observatory_pipeline/DfOverF": 1518,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location/data": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location/features": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location/num_samples": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location/reference_frame": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location/timestamps": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/data": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/indexed_timeseries/data": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/indexed_timeseries/features": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/indexed_timeseries/num_samples": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/indexed_timeseries/reference_frame": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/indexed_timeseries/timestamps": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/indexed_timeseries_path": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/num_samples": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_index/timestamps": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical/data": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical/features": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical/num_samples": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical/reference_frame": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical/timestamps": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/data": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/indexed_timeseries/data": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/indexed_timeseries/features": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/indexed_timeseries/num_samples": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/indexed_timeseries/reference_frame": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/indexed_timeseries/timestamps": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/indexed_timeseries_path": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/num_samples": 363,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location_spherical_index/timestamps": 363,
    "/processing/brain_observatory_pipeline/Fluorescence": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/data": 1467,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/num_samples": 1467,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/roi_names": 1467,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/segmentation_interface": 1467,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/segmentation_interface/cell_specimen_ids": 1467,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/segmentation_interface/roi_ids": 1467,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/segmentation_interface_path": 1467,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/timestamps": 1467,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/data": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/num_samples": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/r": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/rmse": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/roi_names": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/segmentation_interface": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/segmentation_interface/cell_specimen_ids": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/segmentation_interface/roi_ids": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/segmentation_interface_path": 1518,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_neuropil_response/timestamps": 1518,
    "/processing/brain_observatory_pipeline/ImageSegmentation": 1518,
    "/processing/brain_observatory_pipeline/ImageSegmentation/cell_specimen_ids": 1518,
    "/processing/brain_observatory_pipeline/ImageSegmentation/roi_ids": 1518,
    "/processing/brain_observatory_pipeline/MotionCorrection/2p_image_series/original_path": 1518,
    "/processing/brain_observatory_pipeline/MotionCorrection/2p_image_series/xy_translation/data": 1518,
    "/processing/brain_observatory_pipeline/MotionCorrection/2p_image_series/xy_translation/feature_description": 1518,
    "/processing/brain_observatory_pipeline/MotionCorrection/2p_image_series/xy_translation/num_samples": 1518,
    "/processing/brain_observatory_pipeline/MotionCorrection/2p_image_series/xy_translation/timestamps": 1518,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size/data": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size/num_samples": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size/timestamps": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size_index/data": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size_index/indexed_timeseries/data": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size_index/indexed_timeseries/num_samples": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size_index/indexed_timeseries/timestamps": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size_index/indexed_timeseries_path": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size_index/num_samples": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size_index/timestamps": 363,
    "/session_description": 1518,
    "/session_start_time": 1518,
    "/stimulus/presentation/drifting_gratings_stimulus/data": 506,
    "/stimulus/presentation/drifting_gratings_stimulus/features": 506,
    "/stimulus/presentation/drifting_gratings_stimulus/frame_duration": 506,
    "/stimulus/presentation/drifting_gratings_stimulus/num_samples": 506,
    "/stimulus/presentation/drifting_gratings_stimulus/timestamps": 506,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/data": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/frame_duration": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/indexed_timeseries/bits_per_pixel": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/indexed_timeseries/data": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/indexed_timeseries/dimension": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/indexed_timeseries/field_of_view": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/indexed_timeseries/format": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/indexed_timeseries_path": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/num_samples": 392,
    "/stimulus/presentation/locally_sparse_noise_4deg_stimulus/timestamps": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/data": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/frame_duration": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/indexed_timeseries/bits_per_pixel": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/indexed_timeseries/data": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/indexed_timeseries/dimension": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/indexed_timeseries/field_of_view": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/indexed_timeseries/format": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/indexed_timeseries_path": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/num_samples": 392,
    "/stimulus/presentation/locally_sparse_noise_8deg_stimulus/timestamps": 392,
    "/stimulus/presentation/locally_sparse_noise_stimulus/data": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/frame_duration": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/indexed_timeseries/bits_per_pixel": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/indexed_timeseries/data": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/indexed_timeseries/dimension": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/indexed_timeseries/field_of_view": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/indexed_timeseries/format": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/indexed_timeseries_path": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/num_samples": 114,
    "/stimulus/presentation/locally_sparse_noise_stimulus/timestamps": 114,
    "/stimulus/presentation/natural_movie_one_stimulus/data": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/frame_duration": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/indexed_timeseries/bits_per_pixel": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/indexed_timeseries/data": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/indexed_timeseries/dimension": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/indexed_timeseries/field_of_view": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/indexed_timeseries/format": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/indexed_timeseries_path": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/num_samples": 1518,
    "/stimulus/presentation/natural_movie_one_stimulus/timestamps": 1518,
    "/stimulus/presentation/natural_movie_three_stimulus/data": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/frame_duration": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/indexed_timeseries/bits_per_pixel": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/indexed_timeseries/data": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/indexed_timeseries/dimension": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/indexed_timeseries/field_of_view": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/indexed_timeseries/format": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/indexed_timeseries_path": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/num_samples": 506,
    "/stimulus/presentation/natural_movie_three_stimulus/timestamps": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/data": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/frame_duration": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/indexed_timeseries/bits_per_pixel": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/indexed_timeseries/data": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/indexed_timeseries/dimension": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/indexed_timeseries/field_of_view": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/indexed_timeseries/format": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/indexed_timeseries_path": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/num_samples": 506,
    "/stimulus/presentation/natural_movie_two_stimulus/timestamps": 506,
    "/stimulus/presentation/natural_scenes_stimulus/data": 506,
    "/stimulus/presentation/natural_scenes_stimulus/frame_duration": 506,
    "/stimulus/presentation/natural_scenes_stimulus/indexed_timeseries/bits_per_pixel": 506,
    "/stimulus/presentation/natural_scenes_stimulus/indexed_timeseries/data": 506,
    "/stimulus/presentation/natural_scenes_stimulus/indexed_timeseries/dimension": 506,
    "/stimulus/presentation/natural_scenes_stimulus/indexed_timeseries/field_of_view": 506,
    "/stimulus/presentation/natural_scenes_stimulus/indexed_timeseries/format": 506,
    "/stimulus/presentation/natural_scenes_stimulus/indexed_timeseries_path": 506,
    "/stimulus/presentation/natural_scenes_stimulus/num_samples": 506,
    "/stimulus/presentation/natural_scenes_stimulus/timestamps": 506,
    "/stimulus/presentation/spontaneous_stimulus/data": 1518,
    "/stimulus/presentation/spontaneous_stimulus/frame_duration": 1518,
    "/stimulus/presentation/spontaneous_stimulus/num_samples": 1518,
    "/stimulus/presentation/spontaneous_stimulus/timestamps": 1518,
    "/stimulus/presentation/static_gratings_stimulus/data": 506,
    "/stimulus/presentation/static_gratings_stimulus/features": 506,
    "/stimulus/presentation/static_gratings_stimulus/frame_duration": 506,
    "/stimulus/presentation/static_gratings_stimulus/num_samples": 506,
    "/stimulus/presentation/static_gratings_stimulus/timestamps": 506,
    "/stimulus/templates/locally_sparse_noise_4deg_image_stack/bits_per_pixel": 392,
    "/stimulus/templates/locally_sparse_noise_4deg_image_stack/data": 392,
    "/stimulus/templates/locally_sparse_noise_4deg_image_stack/dimension": 392,
    "/stimulus/templates/locally_sparse_noise_4deg_image_stack/field_of_view": 392,
    "/stimulus/templates/locally_sparse_noise_4deg_image_stack/format": 392,
    "/stimulus/templates/locally_sparse_noise_8deg_image_stack/bits_per_pixel": 392,
    "/stimulus/templates/locally_sparse_noise_8deg_image_stack/data": 392,
    "/stimulus/templates/locally_sparse_noise_8deg_image_stack/dimension": 392,
    "/stimulus/templates/locally_sparse_noise_8deg_image_stack/field_of_view": 392,
    "/stimulus/templates/locally_sparse_noise_8deg_image_stack/format": 392,
    "/stimulus/templates/locally_sparse_noise_image_stack/bits_per_pixel": 114,
    "/stimulus/templates/locally_sparse_noise_image_stack/data": 114,
    "/stimulus/templates/locally_sparse_noise_image_stack/dimension": 114,
    "/stimulus/templates/locally_sparse_noise_image_stack/field_of_view": 114,
    "/stimulus/templates/locally_sparse_noise_image_stack/format": 114,
    "/stimulus/templates/natural_movie_one_image_stack/bits_per_pixel": 1518,
    "/stimulus/templates/natural_movie_one_image_stack/data": 1518,
    "/stimulus/templates/natural_movie_one_image_stack/dimension": 1518,
    "/stimulus/templates/natural_movie_one_image_stack/field_of_view": 1518,
    "/stimulus/templates/natural_movie_one_image_stack/format": 1518,
    "/stimulus/templates/natural_movie_three_image_stack/bits_per_pixel": 506,
    "/stimulus/templates/natural_movie_three_image_stack/data": 506,
    "/stimulus/templates/natural_movie_three_image_stack/dimension": 506,
    "/stimulus/templates/natural_movie_three_image_stack/field_of_view": 506,
    "/stimulus/templates/natural_movie_three_image_stack/format": 506,
    "/stimulus/templates/natural_movie_two_image_stack/bits_per_pixel": 506,
    "/stimulus/templates/natural_movie_two_image_stack/data": 506,
    "/stimulus/templates/natural_movie_two_image_stack/dimension": 506,
    "/stimulus/templates/natural_movie_two_image_stack/field_of_view": 506,
    "/stimulus/templates/natural_movie_two_image_stack/format": 506,
    "/stimulus/templates/natural_scenes_image_stack/bits_per_pixel": 506,
    "/stimulus/templates/natural_scenes_image_stack/data": 506,
    "/stimulus/templates/natural_scenes_image_stack/dimension": 506,
    "/stimulus/templates/natural_scenes_image_stack/field_of_view": 506,
    "/stimulus/templates/natural_scenes_image_stack/format": 506
}
```
CodyCBakerPhD commented 11 months ago

Summary
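
One way to condense the counts in the results is to group the paths by how many files contain them, so the objects missing from some files stand out. The following is only a minimal sketch: `group_paths_by_count` and `sample_counts` are hypothetical names, the sample holds a handful of entries copied from the results rather than the full dictionary, and the total number of files is assumed to equal the largest count.

```python
from collections import defaultdict
from typing import Dict, List


def group_paths_by_count(unique_counts: Dict[str, int]) -> Dict[int, List[str]]:
    """Map each file count to the dataset paths found in exactly that many files."""
    paths_by_count = defaultdict(list)
    for path, count in unique_counts.items():
        paths_by_count[count].append(path)
    return dict(paths_by_count)


# A few entries copied from the results; the full dictionary is handled the same way.
sample_counts = {
    "/general/session_id": 1518,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed/data": 1518,
    "/processing/brain_observatory_pipeline/BehavioralTimeSeries/running_speed_index/data": 150,
    "/processing/brain_observatory_pipeline/EyeTracking/pupil_location/data": 363,
    "/processing/brain_observatory_pipeline/PupilTracking/pupil_size/data": 363,
    "/processing/brain_observatory_pipeline/Fluorescence/imaging_plane_1_demixed_signal/data": 1467,
}

# Assumption: at least one path is present in every file, so the maximum count is the file total.
total_files = max(sample_counts.values())
paths_by_count = group_paths_by_count(sample_counts)

# Keep only the paths that are absent from at least one file.
partial = {count: paths for count, paths in paths_by_count.items() if count < total_files}

for count in sorted(partial, reverse=True):
    print(f"{count}/{total_files} files:")
    for path in partial[count]:
        print(f"    {path}")
```

On the sample entries this separates the running speed data (present everywhere) from the running speed index, eye tracking, pupil tracking, and demixed signal objects, which are each present in only a subset of files.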

CodyCBakerPhD commented 11 months ago

Adding note here...

From https://github.com/catalystneuro/visual-coding-to-nwb-v2/blob/d6915d562dd9adaef1a4b05672418505c1a6ff98/src/visual_coding_to_nwb_v2/visual_coding_ophys/scripts/check_all_excitation_and_emission_lambdas.py

it was discovered that some files specify their excitation wavelength as 910 microns instead of 910 nanometers
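
A conversion could guard against the mislabeled unit with a small normalization step. This is a sketch only: `excitation_lambda_in_nm` and `PLAUSIBLE_NM_RANGE` are hypothetical names, the stored value is assumed to be a decoded string of the form `"<number> <unit>"`, and the plausibility bounds are an assumption rather than something taken from the files.

```python
import re

# Assumed bounds (in nanometers) for a plausible two-photon excitation wavelength.
PLAUSIBLE_NM_RANGE = (400.0, 1500.0)

_LAMBDA_PATTERN = re.compile(r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z]+)")


def excitation_lambda_in_nm(raw: str) -> float:
    """Return the wavelength in nanometers, treating a 'microns' label as a mislabeled unit."""
    match = _LAMBDA_PATTERN.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"Unrecognized wavelength string: {raw!r}")

    value = float(match["value"])
    unit = match["unit"].lower()
    if unit not in ("nanometers", "nm", "microns", "um"):
        raise ValueError(f"Unrecognized wavelength unit: {unit!r}")

    # The number itself is trusted only if it is plausible as nanometers;
    # a value such as 910 labeled 'microns' is then returned unchanged as 910 nm.
    low, high = PLAUSIBLE_NM_RANGE
    if not low <= value <= high:
        raise ValueError(f"Wavelength {value} is not plausible as nanometers: {raw!r}")
    return value


print(excitation_lambda_in_nm("910 microns"))
print(excitation_lambda_in_nm("910 nanometers"))
```

Raising on anything outside the assumed range keeps the step from silently rewriting a value that really was given in microns.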

CodyCBakerPhD commented 11 months ago

Additionally, file["general"]["pixel_size"] is always 0.78 (um)

CodyCBakerPhD commented 11 months ago

Additionally, file["general"]["optophysiology"]["imaging_plane_1"]["imaging depth"] is always of the form {integer} microns
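
Because the format is this regular, the depth can be parsed strictly, so any file that deviates fails loudly instead of being converted incorrectly. A minimal sketch, assuming the value has already been decoded to a Python string; `parse_imaging_depth_um` is a hypothetical helper name and `"175 microns"` is an illustrative input, not a value read from the files.

```python
import re


def parse_imaging_depth_um(raw: str) -> int:
    """Parse a string of the form '{integer} microns' into an integer depth in micrometers."""
    match = re.fullmatch(r"(\d+) microns", raw.strip())
    if match is None:
        raise ValueError(f"Unexpected imaging depth format: {raw!r}")
    return int(match.group(1))


print(parse_imaging_depth_um("175 microns"))
```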

CodyCBakerPhD commented 11 months ago

Additionally, file["processing"]["brain_observatory_pipeline"]["ImageSegmentation"]["imaging_plane_1"]["roi_{id}"]["roi_description"] is always simply cell

CodyCBakerPhD commented 11 months ago

Additionally, file["processing"]["brain_observatory_pipeline"]["ImageSegmentation"]["imaging_plane_1"]["roi_{id}"]["pix_mask_weight"] is always simply 1.0

CodyCBakerPhD commented 11 months ago

Additionally, file["general"]["devices"]["2-photon microscope"] is always Nikon A1R-MP multiphoton microscope. CAM2P.2 Please see http://help.brain-map.org/display/observatory/Allen+Brain+Observatory for details

Even though the white paper mentions a Scientifica Vivoscope

CodyCBakerPhD commented 11 months ago

Similarly, the first part of the device value for the display monitor and the eye-tracking camera is always ASUS PA248Q monitor and AVT Mako-G032B, respectively.

And all 3 devices are included in every file