catalystneuro / neuroconv

Create NWB files by converting and combining neural data in proprietary formats and adding essential metadata.
https://neuroconv.readthedocs.io
BSD 3-Clause "New" or "Revised" License

[Feature]: Add multiple audio channels to nwb file from separate .wav files #637

Open magland opened 10 months ago

magland commented 10 months ago

What would you like to see added to NeuroConv?

I'd like to be able to add multiple audio channels from .wav files into an NWB file. With help from @CodyCBakerPhD, I created the following script:

import os
from datetime import datetime
from dateutil import tz

from neuroconv import ConverterPipe
from neuroconv.datainterfaces.behavior.audio.audiointerface import AudioInterface

audio_file_paths = [f'/path/to/concatenated_data/channel_{ch}.wav' for ch in range(9)]
path_to_save_nwbfile = '/path/to/dandi_upload/210683/sub-BE-REP-001/sub-BE-REP-001_ses-2023-10-30-experiment-9.nwb'
if os.path.exists(path_to_save_nwbfile):
    raise FileExistsError(f'File already exists at {path_to_save_nwbfile}')

interfaces = [AudioInterface(file_paths=[audio_file_path], verbose=True) for audio_file_path in audio_file_paths]

converter = ConverterPipe(data_interfaces=interfaces, verbose=True)

# Extract what metadata we can from the source files
metadata = converter.get_metadata()
# For data provenance we add the time zone information to the conversion
session_start_time = datetime(2020, 1, 1, 12, 30, 0, tzinfo=tz.gettz("US/Pacific"))
metadata["NWBFile"].update(session_start_time=session_start_time)

# Set the conversion options for each interface
conversion_options = {}
for interface_name in converter.data_interface_classes.keys():
    conversion_options[interface_name] = dict(
        write_as='acquisition'
    )

# Run the conversion
converter.run_conversion(nwbfile_path=path_to_save_nwbfile, metadata=metadata, conversion_options=conversion_options)

But as you might expect, there's a name collision, since every object tries to get stored as acquisition/AcousticWaveformSeries.

What's the right thing to do here?
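
The collision can be reproduced without NeuroConv. Below is a minimal sketch, assuming each single-file interface reports the same default series name and that the acquisition group rejects duplicate names. `FakeAcquisition` and `write_all` are hypothetical stand-ins for illustration, not NeuroConv or PyNWB API:

```python
DEFAULT_NAME = "AcousticWaveformSeries"  # default name mentioned in this issue


class FakeAcquisition:
    """Hypothetical stand-in for a container that keys objects by name."""

    def __init__(self):
        self.objects = {}

    def add(self, name, data):
        # Assumption: duplicate names are rejected rather than overwritten
        if name in self.objects:
            raise ValueError(f"'{name}' already exists")
        self.objects[name] = data


def write_all(series_names):
    """Try to store one object per name; return the names that collided."""
    acquisition = FakeAcquisition()
    collisions = []
    for name in series_names:
        try:
            acquisition.add(name, data=None)
        except ValueError:
            collisions.append(name)
    return collisions


# Nine separate interfaces, each using the same default name
collisions = write_all([DEFAULT_NAME] * 9)
print(len(collisions))
```

Only the first object is stored; the other eight collide. Giving each series a distinct name (for example, a per-channel suffix) makes the list of collisions empty.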

See also: https://github.com/dandi/helpdesk/discussions/117

Do you have any interest in helping implement the feature?

Yes.

CodyCBakerPhD commented 10 months ago

OK, though we should no doubt think about a better solution in the long term, something like the following should produce an equivalent effect:

import os
from datetime import datetime
from dateutil import tz

from neuroconv import ConverterPipe
from neuroconv.datainterfaces.behavior.audio.audiointerface import AudioInterface

number_of_audio_files = 9
audio_file_paths = [f'/path/to/concatenated_data/channel_{ch}.wav' for ch in range(number_of_audio_files)]
path_to_save_nwbfile = '/path/to/dandi_upload/210683/sub-BE-REP-001/sub-BE-REP-001_ses-2023-10-30-experiment-9.nwb'

audio_interface = AudioInterface(file_paths=audio_file_paths, verbose=True)

# Assuming all audio streams are synchronized to the session start time
audio_interface.set_aligned_segment_starting_times(aligned_segment_starting_times=[0.0 for _ in range(number_of_audio_files)])

converter = ConverterPipe(data_interfaces=[audio_interface], verbose=True)

# Extract what metadata we can from the source files
metadata = converter.get_metadata()

# For data provenance we add the time zone information to the conversion
session_start_time = datetime(2020, 1, 1, 12, 30, 0, tzinfo=tz.gettz("US/Pacific"))
metadata["NWBFile"].update(session_start_time=session_start_time)

# To manually adjust names, you can look at contents of
# metadata["Behavior"]["Audio"]

# Set the conversion options for each interface
conversion_options = {}
for interface_name in converter.data_interface_classes.keys():
    if "Audio" in interface_name:
        conversion_options[interface_name] = dict(
            write_as='acquisition'
        )

# Run the conversion
converter.run_conversion(nwbfile_path=path_to_save_nwbfile, metadata=metadata, conversion_options=conversion_options, overwrite=True)
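
To make the comment about `metadata["Behavior"]["Audio"]` more concrete, here is a minimal sketch of renaming the series after their source files. It assumes that entry is a list of dicts with a `name` key, one per file and in the same order as `file_paths`. The sample metadata and the `rename_audio_series` helper are hypothetical, so inspect the real metadata before relying on this layout:

```python
from pathlib import Path


def rename_audio_series(metadata, file_paths, prefix="AcousticWaveformSeries"):
    """Rename each audio entry in place using the stem of its source file."""
    audio_entries = metadata["Behavior"]["Audio"]
    if len(audio_entries) != len(file_paths):
        raise ValueError("Expected one metadata entry per audio file")
    for entry, file_path in zip(audio_entries, file_paths):
        stem = Path(file_path).stem  # e.g. "channel_3"
        # Build a CamelCase suffix such as "Channel3"
        suffix = "".join(part.capitalize() for part in stem.split("_"))
        entry["name"] = f"{prefix}{suffix}"
    return metadata


# Stand-in for what converter.get_metadata() might return (assumed layout)
file_paths = [f"/path/to/concatenated_data/channel_{ch}.wav" for ch in range(3)]
metadata = {"Behavior": {"Audio": [{"name": f"AcousticWaveformSeries{i}"} for i in range(3)]}}

rename_audio_series(metadata, file_paths)
names = [entry["name"] for entry in metadata["Behavior"]["Audio"]]
print(names)
```

The renamed metadata would then be passed to `run_conversion` as in the script above.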

CodyCBakerPhD commented 10 months ago

I guess what we would want as the proper solution is two-fold:

a) unique identifier keys in metadata for each audio interface, in the event there is more than one
b) an AudioConverter for multi-stream file separation
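
Point (a) could look something like the following. This is only a sketch of the idea, not an existing NeuroConv feature: the `unique_interface_keys` helper and the zero-padded suffix scheme are assumptions chosen for illustration.

```python
from collections import Counter


def unique_interface_keys(class_names):
    """Return one key per interface, adding a numeric suffix only when a class repeats."""
    totals = Counter(class_names)
    seen = Counter()
    keys = []
    for class_name in class_names:
        seen[class_name] += 1
        suffix = f"{seen[class_name]:03d}" if totals[class_name] > 1 else ""
        keys.append(f"{class_name}{suffix}")
    return keys


# Three audio interfaces mixed with one interface of another type
keys = unique_interface_keys(["AudioInterface", "AudioInterface", "VideoInterface", "AudioInterface"])
print(keys)

# Each audio key can then own its own metadata entry, so default names no longer clash
audio_metadata = {
    key: {"name": f"AcousticWaveformSeries{key[-3:]}"}
    for key in keys
    if key.startswith("Audio")
}
print(sorted(audio_metadata))
```

Keying the metadata by interface rather than by a shared default name means each interface can be configured and written independently.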