TheChymera / neuralynx_nwb

Neuralynx to NWB conversion scripts (ideally to be upstreamed)

`session_start_time` not found by `NeuralynxRecordingExtractor` #7

Closed TheChymera closed 1 year ago

TheChymera commented 1 year ago

@CodyCBakerPhD Starting a separate issue for this.

The *.ncs files in the directory we're attempting to convert have the recording start time in their header. However, when I try to convert them via our current code, I get an AssertionError: 'session_start_time' was not found in metadata['NWBFile']!

[dark]~/src/neuralynx_nwb ❱ python -c "from neuralynx_nwb import newconvert; newconvert.reposit_data()"
/home/chymera/.local/share/datalad/vStr_phase_stim/M235/M235-2021-07-16/M235_2021_07_16_Expkeys.m
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/chymera/src/neuralynx_nwb/neuralynx_nwb/newconvert.py", line 41, in reposit_data
    write_recording(recording=recording, nwbfile_path=out_file)
  File "/usr/lib/python3.10/site-packages/neuroconv/tools/spikeinterface/spikeinterface.py", line 1058, in write_recording
    with make_or_load_nwbfile(
  File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.10/site-packages/neuroconv/tools/nwb_helpers.py", line 173, in make_or_load_nwbfile
    raise e
  File "/usr/lib/python3.10/site-packages/neuroconv/tools/nwb_helpers.py", line 169, in make_or_load_nwbfile
    nwbfile = make_nwbfile_from_metadata(metadata=metadata)
  File "/usr/lib/python3.10/site-packages/neuroconv/tools/nwb_helpers.py", line 58, in make_nwbfile_from_metadata
    assert "session_start_time" in nwbfile_kwargs, (
AssertionError: 'session_start_time' was not found in metadata['NWBFile']! Please add the correct start time of the session in ISO8601 format (%Y-%m-%dT%H:%M:%S) to this key of the metadata.
[dark]~/src/neuralynx_nwb ❱ cat neuralynx_nwb/newconvert.py
import os

from datetime import datetime

from neuroconv.datainterfaces import NeuralynxRecordingInterface
from neuroconv.tools.spikeinterface import write_recording
from spikeinterface.extractors import NeuralynxRecordingExtractor

def lab_metadata(in_dir):
    session_name = [i for i in in_dir.split("/") if i][-1]
    session_name = session_name.replace('-', "_")
    exp_metadata = os.path.join(in_dir, session_name+"_Expkeys.m")
    print(exp_metadata)

def reposit_data(
    data_dir='~/.local/share/datalad/',
    data_selection='vStr_phase_stim/M235/M235-2021-07-16/',
    lab_name='MVDMLab',
    institution='Dartmouth College',
    keywords=[
        'DANDI Pilot',
        ],
    experimenter='Manish Mohapatra',
    experiment_description='...',
    debug=True,
    session_description='Extracellular ephys recording in the ventral Striatum',
    keep_original_times=True,
    output_filename='neuralynx_nwb_testfile',
    ):

    data_dir = os.path.abspath(os.path.expanduser(data_dir))
    session_dir = os.path.join(data_dir, data_selection)
    now = datetime.today().strftime('%Y%m%d%H%M%S')
    out_file = session_dir.rstrip("/") + f"-{now}.nwb"
    lab_metadata(session_dir)
    #interface = NeuralynxRecordingInterface(folder_path=session_dir, verbose=False)
    recording = NeuralynxRecordingExtractor(folder_path=session_dir, stream_id='0')
    #recording["NWBFile"].update(session_start_time=session_start_time)
    write_recording(recording=recording, nwbfile_path=out_file)

TheChymera commented 1 year ago

Is there any way to maybe get this parsed by the Extractor? We could jury-rig it manually, but since these are different files it might be best done in a more coordinated fashion. Apparently the files can be understood by head:

[dark]~/.local/share/datalad/vStr_phase_stim/M235/M235-2021-07-16 ❱ head -10 M235-2021-07-16-TT06.ntt
######## Neuralynx Data File Header
-FileType Spike
-FileVersion 3.4
-FileUUID c2ee5195-9038-42d4-ab44-ba1160050a52
-SessionUUID 5232512a-21a2-44d4-a6ee-ae31fae2db18
-ProbeName
-OriginalFileName "D:\CheetahData\2021-07-16_09-37-42\TT6.ntt"
-TimeCreated 2021/07/16 09:37:55
-TimeClosed 2021/07/16 12:22:53

[dark]~/.local/share/datalad/vStr_phase_stim/M235/M235-2021-07-16 ❱ head -10 CSC11.ncs
######## Neuralynx Data File Header
-FileType NCS
-FileVersion 3.4
-FileUUID 497927c5-dc15-4822-910a-af21e3b112fb
-SessionUUID 5232512a-21a2-44d4-a6ee-ae31fae2db18
-ProbeName
-OriginalFileName "D:\CheetahData\2021-07-16_09-37-42\CSC11.ncs"
-TimeCreated 2021/07/16 09:37:55
-TimeClosed 2021/07/16 12:22:53

TheChymera commented 1 year ago

The relevant line is TimeCreated
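
For illustration, here is a minimal sketch of turning that line into the ISO 8601 value the assertion asks for. It assumes the header text has already been decoded to a string and that the line has the `-TimeCreated YYYY/MM/DD HH:MM:SS` layout shown in the `head` output above; `parse_time_created` is a hypothetical helper name, not part of neuroconv or spikeinterface. The header carries no timezone, so the result is a naive datetime, and headers from other file versions may encode the time differently.

```python
import re
from datetime import datetime

# Matches a line such as "-TimeCreated 2021/07/16 09:37:55" (layout assumed
# from the header excerpt in this thread; other file versions may differ).
_TIME_CREATED = re.compile(
    r"^-TimeCreated\s+(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s*$",
    re.MULTILINE,
)


def parse_time_created(header_text):
    """Return the -TimeCreated value of a decoded header as a naive datetime."""
    match = _TIME_CREATED.search(header_text)
    if match is None:
        raise ValueError("no -TimeCreated line found in header")
    return datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")


# Example header text, copied from the excerpt above.
header = """######## Neuralynx Data File Header
-FileType NCS
-FileVersion 3.4
-TimeCreated 2021/07/16 09:37:55
-TimeClosed 2021/07/16 12:22:53
"""

session_start_time = parse_time_created(header)
print(session_start_time.isoformat())  # 2021-07-16T09:37:55
```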

CodyCBakerPhD commented 1 year ago

@TheChymera Sorry for the delay, things have been hectic in between conferences...

So that is exactly one of those fields that would hopefully have been included automatically when using the interface.

Unfortunately, there is no way for the extractor itself to communicate that information, which is why the interfaces exist in the first place. For now, I'd suggest jury-rigging it: retrieve that value from the metadata file and pass it in manually (see below).

I've tried a naive fix to the NeuralynxRecordingInterface on this PR: https://github.com/catalystneuro/neuroconv/pull/369, feel free to give it a try to see if it works for you

If not, the manual way of getting this to work with your current extractor-based setup would be

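from collections import defaultdict
metadata = defaultdict(dict)
metadata["NWBFile"]["session_start_time"] = start_time_grabbed_from_metadata_file
write_recording(recording=recording, nwbfile_path=out_file, metadata=metadata)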

On a side note, our Neuralynx metadata fetcher developed by @JuliaSprenger actually seems to pull this from the recording_opened field, so it would be interesting to know whether your version of the format has it encoded some other way.

TheChymera commented 1 year ago

@CodyCBakerPhD this is how we ended up doing it on our end https://github.com/TheChymera/neuralynx_nwb/blob/79db8074ef7c638d78201854fa388c678d019858/neuralynx_nwb/newconvert.py#L18-L55 — quite inelegant since it still relies on head, as the encoding-autodetection is tricky to implement in Python. Do you think upstream could do it better? :3

CodyCBakerPhD commented 1 year ago

If nothing in the files themselves indicates the session start time, what we've often done is simply either

(a) make a CSV or similar file where the information is manually entered, then you read and iterate over that table while running the conversion

or, similarly (or even in addition),

(b) if exact start time is not known, but the date is (usually as a part of the folder names), set the time to be midnight on that day and hard code in some way or another the starting dates that are known
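
As a rough sketch of option (b), the date could be pulled out of a session folder name and pinned to midnight. This assumes folder names embed the date as `YYYY-MM-DD`, as in the `M235-2021-07-16` directory from this thread; `midnight_from_session_name` is a hypothetical helper, not an existing neuroconv function.

```python
import re
from datetime import datetime


def midnight_from_session_name(session_name):
    """Build a midnight datetime from a YYYY-MM-DD date embedded in a name."""
    match = re.search(r"(\d{4})-(\d{2})-(\d{2})", session_name)
    if match is None:
        raise ValueError(f"no YYYY-MM-DD date found in {session_name!r}")
    year, month, day = (int(part) for part in match.groups())
    # Hour, minute and second default to zero, i.e. midnight on that day.
    return datetime(year, month, day)


fallback_start = midnight_from_session_name("M235-2021-07-16")
print(fallback_start.isoformat())  # 2021-07-16T00:00:00
```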

I would not, in general, recommend looking at the time at which the file was created on the system since that (at least to my experience) is not always accurate and can change if you copy/paste the file or transfer it from one system to another

TheChymera commented 1 year ago

> I would not, in general, recommend looking at the time at which the file was created on the system

The start time reference is from the neuralynx file header, not from the file timestamp, so it reflects when the measurement was initiated and won't change when files are copied.

CodyCBakerPhD commented 1 year ago

I see, I didn't understand what head was doing. I've never seen it used for this purpose before. Was there really no way to, say, open the file (even as binary) and decode the text from the first X lines, as opposed to deploying an entire subprocess to run a CLI call?
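
One way this could look, as a hedged sketch rather than what either repository actually does: read a fixed-size chunk in binary mode and decode it as Latin-1, which maps every byte to a character and so cannot raise a decode error. The 16 KiB header size and the null-byte padding are assumptions about the Neuralynx format, and `read_header_lines` is a hypothetical helper name.

```python
import os
import tempfile

HEADER_BYTES = 16 * 1024  # assumed size of the text header block


def read_header_lines(path, n_bytes=HEADER_BYTES):
    """Read the leading text header of a file without spawning a subprocess."""
    with open(path, "rb") as handle:
        raw = handle.read(n_bytes)
    # Keep only the text before the first null byte (assumed padding), then
    # decode as Latin-1, which never fails regardless of the byte values.
    text = raw.split(b"\x00", 1)[0].decode("latin-1")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Demonstration on a synthetic file: header text, null padding, binary payload.
fake_header = (
    b"######## Neuralynx Data File Header\r\n"
    b"-FileType NCS\r\n"
    b"-TimeCreated 2021/07/16 09:37:55\r\n"
)
with tempfile.NamedTemporaryFile(suffix=".ncs", delete=False) as tmp:
    tmp.write(fake_header.ljust(HEADER_BYTES, b"\x00") + b"\xff\xfe\x01\x02")
    tmp_path = tmp.name

try:
    header_lines = read_header_lines(tmp_path)
finally:
    os.remove(tmp_path)

time_created = [line for line in header_lines if line.startswith("-TimeCreated")]
print(time_created)
```

Reading a fixed number of bytes rather than a number of lines avoids scanning into the binary payload looking for newlines.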

TheChymera commented 1 year ago

Sorted out, albeit inelegantly, in https://github.com/TheChymera/neuralynx_nwb/commit/79db8074ef7c638d78201854fa388c678d019858