NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io

[idea] provide nwb-ls script to display basic details about a .nwb file #1052

Open yarikoptic opened 4 years ago

yarikoptic commented 4 years ago

Generic tools such as h5dump, which have no NWB knowledge about data types etc., are too crude and hard to use. pynwb provides nice pythonic interfaces... I wondered if the pynwb library could provide a script/entry point, named e.g. nwb-ls, which would give very basic per-file information: the number of datasets, time series, etc., and maybe the unique neurodata types present in the file.

E.g., although it is a simpler case, we have a similar tool, nib-ls, in the nibabel I/O library for neuroimaging. It quickly displays basic details about the neuroimaging files at hand, and even basic stats if requested via an additional flag, e.g.

Example of nib-ls output

```shell
dbic/QA$ nib-ls -s sub-emmet*/ses*/func/*.nii.gz
sub-emmet/ses-20180508/func/sub-emmet_ses-20180508_task-rest_acq-p2_bold.nii.gz int16 [ 80, 80, 30, 200] 3.00x3.00x3.99x2.00 sform [37906520] [1, 1.4e+03]
sub-emmet/ses-20180521/func/sub-emmet_ses-20180521_task-rest_acq-3mm_bold.nii.gz int16 [ 82, 82, 48, 150] 3.02x3.02x3.00x0.43 sform [45633462] [1, 8.3e+02]
sub-emmet/ses-20180521/func/sub-emmet_ses-20180521_task-rest_acq-p2Xs4X35mm_bold.nii.gz int16 [ 80, 80, 32, 200] 3.00x3.00x3.99x2.00 sform [40428007] [1, 1.4e+03]
sub-emmet/ses-20180521/func/sub-emmet_ses-20180521_task-rest_acq-p2_bold.nii.gz int16 [ 80, 80, 30, 200] 3.00x3.00x3.99x2.00 sform [37898743] [1, 1.3e+03]
sub-emmet/ses-20180531/func/sub-emmet_ses-20180531_task-rest_acq-3mm_bold.nii.gz int16 [ 82, 82, 48, 150] 3.02x3.02x3.00x0.43 sform [44318617] [1, 8.6e+02]
sub-emmet/ses-20180531/func/sub-emmet_ses-20180531_task-rest_acq-p2_bold.nii.gz int16 [ 80, 80, 30, 200] 3.00x3.00x3.99x2.00 sform [37914970] [1, 1.3e+03]
```
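
For NWB, a one-line-per-file summary in the same spirit could start from a count of the neurodata types present in each file. The following is only a rough sketch under stated assumptions: `type_counts`, `format_line`, and `summarize_file` are hypothetical names, not part of pynwb, and the sketch assumes that `NWBFile.objects` maps object IDs to every container in the file.

```python
from collections import Counter


def type_counts(containers):
    """Count containers by class name (e.g. 'TimeSeries', 'Device')."""
    return Counter(type(c).__name__ for c in containers)


def format_line(path, counts):
    """Render one line per file: path, total object count, per-type counts."""
    types = " ".join(f"{name}={n}" for name, n in sorted(counts.items()))
    return f"{path} objects={sum(counts.values())} {types}".rstrip()


def summarize_file(path):
    """Open one .nwb file read-only and return its summary line."""
    # Imported lazily so the helpers above work without pynwb installed.
    from pynwb import NWBHDF5IO

    with NWBHDF5IO(path, mode="r") as io:
        nwbfile = io.read()
        # Assumption: NWBFile.objects maps object IDs to all containers.
        return format_line(path, type_counts(nwbfile.objects.values()))
```

Counting by class name keeps the helpers independent of pynwb, so the formatting can be tried out on any collection of objects before pointing it at real files.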
rly commented 4 years ago

I would like such a tool as well. Just printing the NWBFile object gives some basic information. We could write a simple script that outputs just that:

Output from printing an NWBFile

```
root
Fields:
  acquisition: { Auxiliary_input_11_eyetracker_x_voltage , Auxiliary_input_11_eyetracker_y_voltage , Auxiliary_input_11_lever_voltage , Wideband_voltages_32ch-array , Wideband_voltages_test_electrode_1 , Wideband_voltages_test_electrode_2 , Wideband_voltages_test_electrode_3 }
  devices: { ASL_Eye-trac_6_via_OmniPlex_v1.14.0 , Manual_lever_via_OmniPlex_v1.14.0 , OmniPlex_v1.14.0 }
  electrode_groups: { 32ch-array , test_electrode_1 , test_electrode_2 , test_electrode_3 }
  electrodes: electrodes
  experiment_description: Neural correlates of visual attention across the pulvinar
  experimenter: ['Ryan Ly']
  file_create_date: [datetime.datetime(2019, 6, 6, 19, 29, 31, 649963, tzinfo=tzoffset(None, -25200))]
  identifier: M20170127
  institution: Princeton University
  lab: Kastner Lab
  processing: { ecephys }
  session_description: Pulvinar recording from McCartney
  session_id: M20170127
  session_start_time: 2017-01-27 18:39:03-05:00
  timestamps_reference_time: 2017-01-27 18:39:03-05:00
  units: units
```

I think that is pretty good, though the output reflects the organization of the PyNWB object and types, not the schema organization. For example, experimenter is part of the general group within the NWBFile group according to the schema, but is a property of the NWBFile object above.
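
A first version of such a script could be little more than a loop over the given paths. This is a rough sketch rather than an existing pynwb entry point: the `nwb-ls` name comes from this issue, and `build_parser` and `main` are hypothetical helpers introduced for illustration.

```python
import argparse


def build_parser():
    """Command-line interface: one or more .nwb paths."""
    parser = argparse.ArgumentParser(
        prog="nwb-ls", description="Print basic details about NWB files."
    )
    parser.add_argument("files", nargs="+", help="paths to .nwb files")
    return parser


def main(argv=None):
    """Print each path followed by the printed form of its NWBFile."""
    args = build_parser().parse_args(argv)
    # Imported lazily so that --help works even if pynwb is missing.
    from pynwb import NWBHDF5IO

    for path in args.files:
        with NWBHDF5IO(path, mode="r") as io:
            print(path)
            # Printing the NWBFile object gives the "Fields:" listing.
            print(io.read())
    return 0
```

Calling `main(["a.nwb", "b.nwb"])` (hypothetical file names) would print one block per file. Registering `main` as a console script would be a packaging decision for the library.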

Or we could have the function print a more extensive hierarchy listing of the hdf5 object, perhaps by adapting the ls function of deepdish:

Example output from deepdish:

```shell
$ python -m deepdish.io.ls gratings-task-data-to-nwb\output\20170127-g1_10s.nwb
/acquisition dict
/acquisition/Auxiliary_input_11_eyetracker_x_voltage dict
/acquisition/Auxiliary_input_11_eyetracker_x_voltage/comments 'FP126' (5) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_x_voltage/data array (10000, 1) [int16]
/acquisition/Auxiliary_input_11_eyetracker_x_voltage/description 'Auxiliary input, sourc...' (48) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_x_voltage/help 'General time series ob...' (26) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_x_voltage/namespace 'core' (4) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_x_voltage/neurodata_type 'TimeSeries' (10) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_x_voltage/starting_time array () [float64]
/acquisition/Auxiliary_input_11_eyetracker_y_voltage dict
/acquisition/Auxiliary_input_11_eyetracker_y_voltage/comments 'FP127' (5) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_y_voltage/data array (10000, 1) [int16]
/acquisition/Auxiliary_input_11_eyetracker_y_voltage/description 'Auxiliary input, sourc...' (48) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_y_voltage/help 'General time series ob...' (26) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_y_voltage/namespace 'core' (4) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_y_voltage/neurodata_type 'TimeSeries' (10) [unicode]
/acquisition/Auxiliary_input_11_eyetracker_y_voltage/starting_time array () [float64]
/acquisition/Auxiliary_input_11_lever_voltage dict
/acquisition/Auxiliary_input_11_lever_voltage/comments 'FP128' (5) [unicode]
/acquisition/Auxiliary_input_11_lever_voltage/data array (10000, 1) [int16]
/acquisition/Auxiliary_input_11_lever_voltage/description 'Auxiliary input, sourc...' (41) [unicode]
/acquisition/Auxiliary_input_11_lever_voltage/help 'General time series ob...' (26) [unicode]
/acquisition/Auxiliary_input_11_lever_voltage/namespace 'core' (4) [unicode]
/acquisition/Auxiliary_input_11_lever_voltage/neurodata_type 'TimeSeries' (10) [unicode]
/acquisition/Auxiliary_input_11_lever_voltage/starting_time array () [float64]
/acquisition/Wideband_voltages_32ch-array dict
/acquisition/Wideband_voltages_32ch-array/comments 'WB001, WB002, WB003, W...' (222) [unicode]
/acquisition/Wideband_voltages_32ch-array/data array (400000, 32) [int16]
/acquisition/Wideband_voltages_32ch-array/description 'Wideband electrodes, g...' (37) [unicode]
/acquisition/Wideband_voltages_32ch-array/electrodes array (32,) [int32]
/acquisition/Wideband_voltages_32ch-array/help 'Stores acquired voltag...' (58) [unicode]
/acquisition/Wideband_voltages_32ch-array/namespace 'core' (4) [unicode]
/acquisition/Wideband_voltages_32ch-array/neurodata_type 'ElectricalSeries' (16) [unicode]
/acquisition/Wideband_voltages_32ch-array/starting_time array () [float64]
/acquisition/Wideband_voltages_test_electrode_1 dict
/acquisition/Wideband_voltages_test_electrode_1/comments 'WB097' (5) [unicode]
/acquisition/Wideband_voltages_test_electrode_1/data array (400000, 1) [int16]
/acquisition/Wideband_voltages_test_electrode_1/description 'Wideband electrodes, g...' (43) [unicode]
/acquisition/Wideband_voltages_test_electrode_1/electrodes array (1,) [int32]
/acquisition/Wideband_voltages_test_electrode_1/help 'Stores acquired voltag...' (58) [unicode]
/acquisition/Wideband_voltages_test_electrode_1/namespace 'core' (4) [unicode]
/acquisition/Wideband_voltages_test_electrode_1/neurodata_type 'ElectricalSeries' (16) [unicode]
/acquisition/Wideband_voltages_test_electrode_1/starting_time array () [float64]
/acquisition/Wideband_voltages_test_electrode_2 dict
/acquisition/Wideband_voltages_test_electrode_2/comments 'WB098' (5) [unicode]
/acquisition/Wideband_voltages_test_electrode_2/data array (400000, 1) [int16]
/acquisition/Wideband_voltages_test_electrode_2/description 'Wideband electrodes, g...' (43) [unicode]
/acquisition/Wideband_voltages_test_electrode_2/electrodes array (1,) [int32]
/acquisition/Wideband_voltages_test_electrode_2/help 'Stores acquired voltag...' (58) [unicode]
/acquisition/Wideband_voltages_test_electrode_2/namespace 'core' (4) [unicode]
/acquisition/Wideband_voltages_test_electrode_2/neurodata_type 'ElectricalSeries' (16) [unicode]
/acquisition/Wideband_voltages_test_electrode_2/starting_time array () [float64]
/acquisition/Wideband_voltages_test_electrode_3 dict
/acquisition/Wideband_voltages_test_electrode_3/comments 'WB099' (5) [unicode]
/acquisition/Wideband_voltages_test_electrode_3/data array (400000, 1) [int16]
/acquisition/Wideband_voltages_test_electrode_3/description 'Wideband electrodes, g...' (43) [unicode]
/acquisition/Wideband_voltages_test_electrode_3/electrodes array (1,) [int32]
/acquisition/Wideband_voltages_test_electrode_3/help 'Stores acquired voltag...' (58) [unicode]
/acquisition/Wideband_voltages_test_electrode_3/namespace 'core' (4) [unicode]
/acquisition/Wideband_voltages_test_electrode_3/neurodata_type 'ElectricalSeries' (16) [unicode]
/acquisition/Wideband_voltages_test_electrode_3/starting_time array () [float64]
/analysis dict
/file_create_date Node
/general dict
/general/data_collection Node
/general/devices dict
/general/devices/ASL_Eye-trac_6_via_OmniPlex_v1.14.0 dict
/general/devices/ASL_Eye-trac_6_via_OmniPlex_v1.14.0/help 'A recording device e.g...' (33) [unicode]
/general/devices/ASL_Eye-trac_6_via_OmniPlex_v1.14.0/namespace 'core' (4) [unicode]
/general/devices/ASL_Eye-trac_6_via_OmniPlex_v1.14.0/neurodata_type 'Device' (6) [unicode]
/general/devices/Manual_lever_via_OmniPlex_v1.14.0 dict
/general/devices/Manual_lever_via_OmniPlex_v1.14.0/help 'A recording device e.g...' (33) [unicode]
/general/devices/Manual_lever_via_OmniPlex_v1.14.0/namespace 'core' (4) [unicode]
/general/devices/Manual_lever_via_OmniPlex_v1.14.0/neurodata_type 'Device' (6) [unicode]
/general/devices/OmniPlex_v1.14.0 dict
/general/devices/OmniPlex_v1.14.0/help 'A recording device e.g...' (33) [unicode]
/general/devices/OmniPlex_v1.14.0/namespace 'core' (4) [unicode]
/general/devices/OmniPlex_v1.14.0/neurodata_type 'Device' (6) [unicode]
/general/experiment_description Node
/general/experimenter Node
/general/extracellular_ephys dict
/general/extracellular_ephys/32ch-array dict
/general/extracellular_ephys/32ch-array/description '32-channel_array' (16) [unicode]
/general/extracellular_ephys/32ch-array/device link -> /general/devices/OmniPlex_v1.14.0 [SoftLink]
/general/extracellular_ephys/32ch-array/help 'A physical grouping of...' (31) [unicode]
/general/extracellular_ephys/32ch-array/location 'Pulvinar' (8) [unicode]
/general/extracellular_ephys/32ch-array/namespace 'core' (4) [unicode]
/general/extracellular_ephys/32ch-array/neurodata_type 'ElectrodeGroup' (14) [unicode]
/general/extracellular_ephys/electrodes dict
/general/extracellular_ephys/electrodes/colnames array (8,) [bytes104]
/general/extracellular_ephys/electrodes/description 'metadata about extrace...' (39) [unicode]
/general/extracellular_ephys/electrodes/filtering Node
/general/extracellular_ephys/electrodes/group array (35,) [object]
/general/extracellular_ephys/electrodes/group_name Node
/general/extracellular_ephys/electrodes/help 'A column-centric table' (22) [unicode]
/general/extracellular_ephys/electrodes/id array (35,) [int32]
/general/extracellular_ephys/electrodes/imp array (35,) [float64]
/general/extracellular_ephys/electrodes/location Node
/general/extracellular_ephys/electrodes/namespace 'core' (4) [unicode]
/general/extracellular_ephys/electrodes/neurodata_type 'DynamicTable' (12) [unicode]
/general/extracellular_ephys/electrodes/x array (35,) [float64]
/general/extracellular_ephys/electrodes/y array (35,) [float64]
/general/extracellular_ephys/electrodes/z array (35,) [float64]
/general/extracellular_ephys/test_electrode_1 dict
/general/extracellular_ephys/test_electrode_1/description 'test_electrode_1' (16) [unicode]
/general/extracellular_ephys/test_electrode_1/device link -> /general/devices/OmniPlex_v1.14.0 [SoftLink]
/general/extracellular_ephys/test_electrode_1/help 'A physical grouping of...' (31) [unicode]
/general/extracellular_ephys/test_electrode_1/location 'unknown' (7) [unicode]
/general/extracellular_ephys/test_electrode_1/namespace 'core' (4) [unicode]
/general/extracellular_ephys/test_electrode_1/neurodata_type 'ElectrodeGroup' (14) [unicode]
/general/extracellular_ephys/test_electrode_2 dict
/general/extracellular_ephys/test_electrode_2/description 'test_electrode_2' (16) [unicode]
/general/extracellular_ephys/test_electrode_2/device link -> /general/devices/OmniPlex_v1.14.0 [SoftLink]
/general/extracellular_ephys/test_electrode_2/help 'A physical grouping of...' (31) [unicode]
/general/extracellular_ephys/test_electrode_2/location 'unknown' (7) [unicode]
/general/extracellular_ephys/test_electrode_2/namespace 'core' (4) [unicode]
/general/extracellular_ephys/test_electrode_2/neurodata_type 'ElectrodeGroup' (14) [unicode]
/general/extracellular_ephys/test_electrode_3 dict
/general/extracellular_ephys/test_electrode_3/description 'test_electrode_3' (16) [unicode]
/general/extracellular_ephys/test_electrode_3/device link -> /general/devices/OmniPlex_v1.14.0 [SoftLink]
/general/extracellular_ephys/test_electrode_3/help 'A physical grouping of...' (31) [unicode]
/general/extracellular_ephys/test_electrode_3/location 'unknown' (7) [unicode]
/general/extracellular_ephys/test_electrode_3/namespace 'core' (4) [unicode]
/general/extracellular_ephys/test_electrode_3/neurodata_type 'ElectrodeGroup' (14) [unicode]
/general/institution Node
/general/lab Node
/general/session_id Node
/help 'an NWB:N file for stor...' (61) [unicode]
/identifier Node
/namespace 'core' (4) [unicode]
/neurodata_type 'NWBFile' (7) [unicode]
/nwb_version '2.0b' (4) [unicode]
/processing dict
/processing/ecephys dict
/processing/ecephys/LFP_32ch-array dict
/processing/ecephys/LFP_32ch-array/LFP_voltages_32ch-array dict (8) [...]
/processing/ecephys/LFP_32ch-array/help 'LFP data from one or m...' (93) [unicode]
/processing/ecephys/LFP_32ch-array/namespace 'core' (4) [unicode]
/processing/ecephys/LFP_32ch-array/neurodata_type 'LFP' (3) [unicode]
/processing/ecephys/LFP_test_electrode_1 dict
/processing/ecephys/LFP_test_electrode_1/LFP_voltages_test_electrode_1 dict (8) [...]
/processing/ecephys/LFP_test_electrode_1/help 'LFP data from one or m...' (93) [unicode]
/processing/ecephys/LFP_test_electrode_1/namespace 'core' (4) [unicode]
/processing/ecephys/LFP_test_electrode_1/neurodata_type 'LFP' (3) [unicode]
/processing/ecephys/LFP_test_electrode_2 dict
/processing/ecephys/LFP_test_electrode_2/LFP_voltages_test_electrode_2 dict (8) [...]
/processing/ecephys/LFP_test_electrode_2/help 'LFP data from one or m...' (93) [unicode]
/processing/ecephys/LFP_test_electrode_2/namespace 'core' (4) [unicode]
/processing/ecephys/LFP_test_electrode_2/neurodata_type 'LFP' (3) [unicode]
/processing/ecephys/LFP_test_electrode_3 dict
/processing/ecephys/LFP_test_electrode_3/LFP_voltages_test_electrode_3 dict (8) [...]
/processing/ecephys/LFP_test_electrode_3/help 'LFP data from one or m...' (93) [unicode]
/processing/ecephys/LFP_test_electrode_3/namespace 'core' (4) [unicode]
/processing/ecephys/LFP_test_electrode_3/neurodata_type 'LFP' (3) [unicode]
/processing/ecephys/SPKC_32ch-array dict
/processing/ecephys/SPKC_32ch-array/High-pass_filtered_voltages_32ch-array dict (8) [...]
/processing/ecephys/SPKC_32ch-array/help 'Ephys data from one or...' (195) [unicode]
/processing/ecephys/SPKC_32ch-array/namespace 'core' (4) [unicode]
/processing/ecephys/SPKC_32ch-array/neurodata_type 'FilteredEphys' (13) [unicode]
/processing/ecephys/SPKC_test_electrode_1 dict
/processing/ecephys/SPKC_test_electrode_1/High-pass_filtered_voltages_test_ele... dict (8) [...]
/processing/ecephys/SPKC_test_electrode_1/help 'Ephys data from one or...' (195) [unicode]
/processing/ecephys/SPKC_test_electrode_1/namespace 'core' (4) [unicode]
/processing/ecephys/SPKC_test_electrode_1/neurodata_type 'FilteredEphys' (13) [unicode]
/processing/ecephys/SPKC_test_electrode_2 dict
/processing/ecephys/SPKC_test_electrode_2/High-pass_filtered_voltages_test_ele... dict (8) [...]
/processing/ecephys/SPKC_test_electrode_2/help 'Ephys data from one or...' (195) [unicode]
/processing/ecephys/SPKC_test_electrode_2/namespace 'core' (4) [unicode]
/processing/ecephys/SPKC_test_electrode_2/neurodata_type 'FilteredEphys' (13) [unicode]
/processing/ecephys/SPKC_test_electrode_3 dict
/processing/ecephys/SPKC_test_electrode_3/High-pass_filtered_voltages_test_ele... dict (8) [...]
/processing/ecephys/SPKC_test_electrode_3/help 'Ephys data from one or...' (195) [unicode]
/processing/ecephys/SPKC_test_electrode_3/namespace 'core' (4) [unicode]
/processing/ecephys/SPKC_test_electrode_3/neurodata_type 'FilteredEphys' (13) [unicode]
/processing/ecephys/description 'Processed extracellula...' (46) [unicode]
/processing/ecephys/help 'A collection of analys...' (56) [unicode]
/processing/ecephys/namespace 'core' (4) [unicode]
/processing/ecephys/neurodata_type 'ProcessingModule' (16) [unicode]
/session_description Node
/session_start_time Node
/stimulus dict
/stimulus/presentation dict
/stimulus/templates dict
/timestamps_reference_time Node
/units dict
/units/Fs array (42,) [float64]
/units/channel_id array (42,) [int32]
/units/colnames array (12,) [bytes192]
/units/description 'Autogenerated by NWBFile' (24) [unicode]
/units/electrodes array (42,) [int32]
/units/electrodes_index array (42,) [int32]
/units/help 'Data about spiking units' (24) [unicode]
/units/id array (42,) [int32]
/units/is_unsorted array (42,) [int8]
/units/namespace 'core' (4) [unicode]
/units/neurodata_type 'Units' (5) [unicode]
/units/num_samples array (42,) [int32]
/units/num_spikes array (42,) [int32]
/units/plx_sort_method array (42,) [int32]
/units/plx_sort_range array (42, 2) [int32]
/units/plx_sort_threshold array (42,) [float64]
/units/pre_threshold_samples array (42,) [int32]
/units/spike_times array (1798,) [float64]
/units/spike_times_index array (42,) [int32]
/units/waveforms array (1798, 56) [float64]
/units/waveforms_index array (42,) [int32]
```

deepdish does not yet handle variable-length strings (shown as "Node" above) so this solution would require some work as well.
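
One way around that limitation might be to walk the file with h5py directly instead of adapting deepdish. The sketch below is an illustration under stated assumptions, not the deepdish implementation: `list_tree` is a hypothetical helper that treats anything with an `items()` method as a group and everything else as a dataset, so it accepts an open `h5py.File` as well as plain nested dicts.

```python
def list_tree(group, prefix=""):
    """Yield (path, description) pairs for every node below `group`."""
    for name, node in sorted(group.items(), key=lambda item: item[0]):
        path = f"{prefix}/{name}"
        if hasattr(node, "items"):
            # Group-like node: report it, then descend into its children.
            yield path, "group"
            yield from list_tree(node, path)
        else:
            # Dataset-like node: report shape and dtype when available.
            shape = getattr(node, "shape", None)
            dtype = getattr(node, "dtype", None)
            yield path, f"dataset {shape} [{dtype}]"
```

With h5py installed, something like `with h5py.File(path, "r") as f:` followed by `for p, d in list_tree(f): print(p, d)` would print the listing. Because only shape and dtype are read, string values are never decoded, which sidesteps the variable-length string problem at the cost of not showing the values.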

oruebel commented 4 years ago

I think that is pretty good, though the output reflects the organization of the PyNWB object and types, not the schema organization

Based on the original issue, I think this seems to be what this issue is asking for. Diving into the HDF5 hierarchy itself can already be done with h5ls, h5dump, h5py, deepdish, etc. I think what would be most useful is a tool that knows about the NWB types and does ls for that. As a first step, printing the NWBFile would be a good start, and then things could evolve from there, e.g., with different options to dive into the file. Some command-line options that would probably be useful:

oruebel commented 4 years ago

For a more interactive (rather than command-line-tool) option you can also look at https://github.com/NeurodataWithoutBorders/nwb-jupyter-widgets

yarikoptic commented 4 years ago

The NWBFile level of detail, maybe even listing by default just the number of devices and electrode groups, sounds indeed like the right level. If the output structure also gets a 'file' field (first, if ordered), then

oruebel commented 4 years ago

* I even wondered about --diff-only option

Doing diffs should probably be a separate tool. Also, doing a diff will require additional development to compare NWBFile objects. While having functionality and tools to compare NWBFiles and do diffs would be useful, I think that should be a separate issue and tool.

yarikoptic commented 4 years ago

Yeap, for a full-blown diff comparing data, a separate tool makes sense (we have nib-diff in nibabel). Here I literally meant post-processing of what pynwb-ls is about to return, removing fields with identical values. But feel free to ignore for now ;-)
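
That post-processing step can be sketched independently of how each per-file summary is produced. The following is illustrative only: `diff_only` and the record layout (one dict per file, with a 'file' key) are assumptions made for this example, not an existing pynwb or nibabel API.

```python
def diff_only(records, keep=("file",)):
    """Drop fields whose value is identical in every record.

    `records` is a list of dicts, one per file; keys in `keep` always survive.
    """
    if len(records) < 2:
        return [dict(r) for r in records]  # nothing to compare against
    keys = []
    for record in records:  # preserve first-seen key order
        for key in record:
            if key not in keys:
                keys.append(key)
    missing = object()  # sentinel so that an absent key counts as a difference
    first = records[0]
    varying = [
        k for k in keys
        if k in keep
        or any(r.get(k, missing) != first.get(k, missing) for r in records[1:])
    ]
    return [{k: r[k] for k in varying if k in r} for r in records]
```

For two hypothetical files that differ only in their number of electrode groups, the result would keep just the 'file' and 'electrode_groups' fields of each record.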