mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License

preserve (select) header information from .ncs files in `RawNeuralynx` #12404

Closed KristijanArmeni closed 7 months ago

KristijanArmeni commented 8 months ago

Describe the new feature or enhancement

Each Neuralynx .ncs file has a header with acquisition metadata (example header below). Right now, read_raw_neuralynx() doesn't propagate any of that metadata downstream (e.g. to raw.info) other than the sampling frequency (see the current raw.info below).

It seems to me, read_raw_neuralynx() could preserve the time of acquisition (--TimeCreated key in header) and any information about online filters if those were applied (--DSP* keys in header).

I'm happy to do this, but would need some guidance on which information from header would be useful to include and what the best way of doing so would be.

Describe your proposed implementation

Certain header properties can vary across .ncs channels (I think sampling freq, possibly also DSP params). So if any header information is to be included upon reading the dataset, it will first be checked that the property is common across the selected channels.
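
A minimal sketch of such a check, assuming the per-channel headers are already available as plain dicts keyed by channel name. The `headers` layout and the `common_header_value` helper are hypothetical and only illustrate the idea; they are not part of MNE or neo.

```python
def common_header_value(headers, key):
    """Return the value of `key` if every channel agrees, else raise.

    `headers` maps channel name -> dict of header fields (hypothetical layout).
    """
    values = {ch: hdr[key] for ch, hdr in headers.items()}
    unique = set(values.values())
    if len(unique) != 1:
        raise ValueError(f"{key} differs across channels: {values}")
    return unique.pop()


# Toy headers using key names from the example .ncs header
headers = {
    "LAHC1": {"SamplingFrequency": "2000", "DspHighCutFrequency": "500"},
    "LAHC2": {"SamplingFrequency": "2000", "DspHighCutFrequency": "500"},
}
sfreq = float(common_header_value(headers, "SamplingFrequency"))
```

Raising on a mismatch is only one option; a reader could also warn and fall back to a conservative value.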

Describe possible alternatives

Not sure what alternatives would be here.

Additional context

Current raw.info

from mne.io import read_raw_neuralynx
from mne.datasets.testing import data_path

testing_path = data_path(download=False) / "neuralynx"
fname_patterns = ["*u*.ncs", "*3_gaps.ncs"]
raw = read_raw_neuralynx(fname=testing_path, preload=True, exclude_fname_patterns=fname_patterns)
raw.info

Out[1]: 
<Info | 7 non-empty values
 bads: []
 ch_names: LAHC1, LAHC2, LAHC3, xAIR1, xEKG1
 chs: 5 sEEG
 custom_ref_applied: False
 highpass: 0.0 Hz
 lowpass: 1000.0 Hz
 meas_date: unspecified
 nchan: 5
 projs: []
 sfreq: 2000.0 Hz
>

Example header (LAHC1.ncs)

######## Neuralynx Data File Header
-FileType NCS
-FileVersion 3.4
-FileUUID 663100aa-dc63-4458-8e21-80667b048cf5
-SessionUUID c6e1f50d-ff9c-40e4-9b9a-b43f627c74c8
-ProbeName
-OriginalFileName "E:\kristijan\2023-11-02_13-39-27\LAHC1.ncs"
-TimeCreated 2023/11/02 13:39:27
-TimeClosed 2023/11/02 13:42:05

-RecordSize 1044
-ApplicationName Pegasus "2.1.3 "
-AcquisitionSystem AcqSystem1 ATLAS
-ReferenceChannel "Source 01 Reference 1"
-SamplingFrequency 2000
-ADMaxValue 32767
-ADBitVolts 0.000000305175781250000006
-AcqEntName LAHC1
-NumADChannels 1
-ADChannel 8
-InputRange 10000
-InputInverted True

-DSPLowCutFilterEnabled True
-DspLowCutFrequency 0.1
-DspLowCutNumTaps 0
-DspLowCutFilterType DCO
-DSPHighCutFilterEnabled True
-DspHighCutFrequency 500
-DspHighCutNumTaps 256
-DspHighCutFilterType FIR
-DspDelayCompensation Enabled
-DspFilterDelay_µs 3984
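
For illustration only, the "-Key Value" layout of the header above can be turned into a dict with a few lines of Python. This is a sketch that assumes the header text has already been read into a string; `parse_ncs_header_text` is a hypothetical helper, and in practice neo does the actual header parsing for read_raw_neuralynx().

```python
def parse_ncs_header_text(text):
    """Parse '-Key Value' lines into a dict of strings (hypothetical helper)."""
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("-"):
            continue  # skip the '########' banner and blank lines
        parts = line[1:].split(None, 1)
        if not parts:
            continue  # a lone '-' carries no key
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""  # e.g. '-ProbeName' has no value
        header[key] = value
    return header


sample = """######## Neuralynx Data File Header
-FileType NCS
-ProbeName
-TimeCreated 2023/11/02 13:39:27
-SamplingFrequency 2000
-DspLowCutFrequency 0.1
"""
header = parse_ncs_header_text(sample)
```

All values stay strings here, so numeric fields such as SamplingFrequency still need an explicit float() conversion.
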

larsoner commented 8 months ago

Certain header properties can vary across .ncs channels (I think sampling freq, possibly also DSP params). So if any header information is to be included upon reading the dataset, it will first be checked that the property is common across the selected channels.

There is at least one other format where this is the case. IIRC what we did for example for info['lowpass'] and info['highpass'] is take the lowest and highest values, respectively, as this is the most conservative choice. (It would be dangerous to take the lowest highpass value for example because you could think all your signals were low-passed quite low, do decimation based on that, and get aliasing.)

So yes you could add some of these, take a look at the info fields to see what would match up:

https://mne.tools/stable/generated/mne.Info.html

For other things, I think it's okay just to document how people can use neo to get at the metadata separately.

KristijanArmeni commented 7 months ago

OK, thanks @larsoner! I think I'd preserve the "recording opened" time (in "meas_date") and the online filter frequencies. For example, if I look at the header for one file from the testing dataset:

>>> ncs_header["recording_opened"]
datetime.datetime(2023, 11, 2, 13, 39, 27)  # match this to info["meas_date"]
>>> ncs_header["DspLowCutFrequency"]        # set this to info["highpass"] (to the highest across files if non-uniform)
'0.1'
>>> ncs_header["DspHighCutFrequency"]       # set this to info["lowpass"] (to the lowest across files if non-uniform)
'500'
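
The reduction described in the comments above could look roughly like this. It is a sketch that assumes one header dict per file with string values, as in the snippet; `summarize_filters` is a hypothetical helper, and it ignores the DSP*FilterEnabled flags for brevity.

```python
def summarize_filters(headers):
    """Pick one highpass/lowpass pair for a set of per-file headers.

    Follows the rule in the comments above: the highest low-cut frequency
    becomes the highpass, the lowest high-cut frequency becomes the lowpass.
    """
    highpass = max(float(h["DspLowCutFrequency"]) for h in headers)
    lowpass = min(float(h["DspHighCutFrequency"]) for h in headers)
    return highpass, lowpass


# Toy headers with non-uniform filter settings (values are made up)
headers = [
    {"DspLowCutFrequency": "0.1", "DspHighCutFrequency": "500"},
    {"DspLowCutFrequency": "1", "DspHighCutFrequency": "400"},
]
highpass, lowpass = summarize_filters(headers)
```

Converting with float() matters because the header values are strings, so comparing them directly would sort them lexicographically.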

For the meas_date, it seems I need to convert to UTC. The only thing I am not sure about is that for Python >= 3.11 you can use datetime.UTC, whereas in Python < 3.11 there's only datetime.timezone.utc to convert a datetime object to UTC. Not sure if there is a general preference?

>>> import datetime
>>> meas_utc = ncs_header["recording_opened"].astimezone(datetime.UTC)
>>> info.set_meas_date(meas_utc)
<Info | 7 non-empty values
 bads: []
 ch_names: LAHC1, LAHC2, LAHC3, xAIR1, xEKG1
 chs: 5 sEEG
 custom_ref_applied: False
 highpass: 0.0 Hz
 lowpass: 1000.0 Hz
 meas_date: 2023-11-02 17:39:27 UTC
 nchan: 5
 projs: []
 sfreq: 2000.0 Hz
>
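
One version-independent way to do the conversion is `datetime.timezone.utc`, which exists in every Python 3 release (`datetime.UTC` is an alias for it added in 3.11). The sketch below assumes the recording machine's UTC offset is known; the `to_utc` helper and the -4 h offset are illustrative assumptions, chosen so the result matches the meas_date shown above. Calling `.astimezone()` directly on a naive datetime instead treats it as the local time of the machine running the code.

```python
from datetime import datetime, timedelta, timezone


def to_utc(naive_dt, local_tz):
    """Attach `local_tz` to a naive datetime, then convert it to UTC."""
    return naive_dt.replace(tzinfo=local_tz).astimezone(timezone.utc)


recording_opened = datetime(2023, 11, 2, 13, 39, 27)  # naive, as in the header
local_tz = timezone(timedelta(hours=-4))  # assumed offset of the acquisition machine
meas_utc = to_utc(recording_opened, local_tz)
```

Making the offset explicit keeps the result independent of where the file is later read.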

For the filter, I can't seem to find the mne.Info().filter() method. Am I missing something?

>>> info["highpass"] = ncs_header["DspLowCutFrequency"]
*** RuntimeError: highpass cannot be set directly. Please use method inst.filter() instead.

>>> info.filter
*** AttributeError: 'Info' object has no attribute 'filter'

larsoner commented 7 months ago

The hint is meant to say that users should do raw.filter(None, 40) for example or whatever. But that's really for end users. Devs who write data format readers can set these things with the info unlocked. See for example where this is being worked on for read_raw_edf https://github.com/mne-tools/mne-python/pull/12441/files

KristijanArmeni commented 7 months ago

Thanks! I think I see it now; I was missing the with info._unlock() context manager. Will follow up on this next week.

KristijanArmeni commented 7 months ago

Will follow up on this next week.

@larsoner PR in #12463