Describe the new feature or enhancement
I'm wondering if it might be possible to issue a warning when data are loaded if they seem to have incorrect or unknown units? I'm asking because I've just run into a problem caused by loading data which were in uV, but MNE defaulted to assuming volts due to a lack of unit info in the edf file.
Specifically, when loading and then re-exporting some edf files (as part of a BIDS re-structuring) I got the following error:
ValueError: '-1714830000' exceeds maximum field length: 11 > 8
This is due to the incorrect scaling: edfio raises an error when trying to write the file, because the min and max calculated from the data are longer than the 8-character limit for that header field.
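To make the arithmetic concrete, here is a minimal sketch of how a factor of 1e6 pushes a value past the 8-character limit. The helper `fits_edf_field` is hypothetical and only mirrors the `encode_float` / `encode_str` logic visible in the traceback below (it skips edfio's rounding step), and the starting value is an assumption chosen so that the result reproduces the number in the error message:

```python
EDF_FIELD_WIDTH = 8  # width of the header field, per the error message


def fits_edf_field(value, width=EDF_FIELD_WIDTH):
    """Return True if the text form of `value` fits in `width` characters.

    Hypothetical helper: integral floats are written as ints, then the
    string length is compared to the field width.
    """
    if float(value).is_integer():
        value = int(value)
    return len(str(value)) <= width


# Assumed example: the file stores -1714.83 (really uV), but with no unit
# info it is treated as volts, so conversion to uV multiplies by 1e6 again.
stored_value = -1714.83
exported_uv = round(stored_value * 1e6)

print(fits_edf_field(stored_value))  # the correctly scaled value fits
print(fits_edf_field(exported_uv))   # the mis-scaled value does not
```

With the correct scaling the value is 8 characters long and fits; after the extra factor of 1e6 it becomes an 11-character integer, which matches the "11 > 8" in the error.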
Describe your proposed implementation
I guess there are a few options:
1) In relation to my specific case, just before handing things off to edfio, the min and max of the data could be calculated and an error raised if these exceed 8 characters, with an explicit recommendation to double-check the scaling. edfio does go on to raise an error of its own, but it's not particularly informative, so it took me a while to figure out the underlying cause.
2) A more general option would be to try to infer, when loading an edf file, whether the data seem to be off by a relevant factor (e.g. 1e6) and issue a warning.
3) Perhaps the simplest option would be just to give a warning if the edf file does not contain unit information and MNE has had to default to volts, so people can go and double-check whether this was appropriate. Or, more strictly, you could make it impossible to load an edf file without supplying unit information if none is present in the file.
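A rough sketch of what options 2 and 3 could look like, written in plain Python so it does not depend on MNE internals. Everything here is an assumption for illustration: the function names are hypothetical, and the 10 mV plausibility threshold is a guess that would need tuning per channel type.

```python
import warnings


def looks_plausible_in_volts(samples, max_abs_volts=1e-2):
    """Option 2: warn if data assumed to be in volts look far too large.

    `max_abs_volts` is an assumed threshold, not a value used by MNE.
    Returns True if the peak amplitude is within the threshold.
    """
    peak = max(abs(x) for x in samples)
    if peak > max_abs_volts:
        warnings.warn(
            f"Peak amplitude {peak:g} V is implausibly large; the file "
            "may store uV or mV. Please double-check the scaling."
        )
        return False
    return True


def unit_or_default(physical_dimension, default="V"):
    """Option 3: warn whenever the unit field is empty and a default is used."""
    unit = physical_dimension.strip()
    if not unit:
        warnings.warn(
            f"No unit information in the file; assuming {default!r}. "
            "Please check that this is appropriate."
        )
        return default
    return unit


print(looks_plausible_in_volts([1e-5, -4e-5]))    # correctly scaled EEG
print(looks_plausible_in_volts([12.0, -1714.83]))  # uV read as volts
print(unit_or_default(""))                        # missing unit field
```

The amplitude check can only ever be a heuristic, which is why the sketch warns rather than raises. The missing-unit check needs no guessing at all, which is part of why option 3 seems the simplest.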
Describe possible alternatives
N/A
Additional context
Full traceback in my specific case:
{
"name": "ValueError",
"message": "'-1714830000' exceeds maximum field length: 11 > 8",
"stack": "---------------------------------------------------------------------------
File <decorator-gen-250>:12, in export(self, fname, fmt, physical_range, add_ch_type, overwrite, verbose)
File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/mne/io/base.py:1784, in BaseRaw.export(self, fname, fmt, physical_range, add_ch_type, overwrite, verbose)
1757 \"\"\"Export Raw to external formats.
1758
1759 %(export_fmt_support_raw)s
(...)
1780 %(export_edf_note)s
1781 \"\"\"
1782 from ..export import export_raw
-> 1784 export_raw(
1785 fname,
1786 self,
1787 fmt,
1788 physical_range=physical_range,
1789 add_ch_type=add_ch_type,
1790 overwrite=overwrite,
1791 verbose=verbose,
1792 )
File <decorator-gen-458>:12, in export_raw(fname, raw, fmt, physical_range, add_ch_type, overwrite, verbose)
File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/mne/export/_export.py:75, in export_raw(fname, raw, fmt, physical_range, add_ch_type, overwrite, verbose)
72 elif fmt == \"edf\":
73 from ._edf import _export_raw
---> 75 _export_raw(fname, raw, physical_range, add_ch_type)
76 elif fmt == \"brainvision\":
77 from ._brainvision import _export_raw
File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/mne/export/_edf.py:144, in _export_raw(fname, raw, physical_range, add_ch_type)
140 pmax = ch_types_phys_max[ch_type]
141 prange = pmin, pmax
143 signals.append(
--> 144 EdfSignal(
145 data[idx],
146 out_sfreq,
147 label=signal_label,
148 transducer_type=\"\",
149 physical_dimension=\"\" if ch_type == \"stim\" else \"uV\",
150 physical_range=prange,
151 digital_range=(digital_min, digital_max),
152 prefiltering=filter_str_info,
153 )
154 )
156 # set patient info
157 subj_info = raw.info.get(\"subject_info\")
File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/edfio/edf_signal.py:124, in EdfSignal.__init__(self, data, sampling_frequency, label, transducer_type, physical_dimension, physical_range, digital_range, prefiltering)
122 if not np.all(np.isfinite(data)):
123 raise ValueError(\"Signal data must contain only finite values\")
--> 124 self._set_physical_range(physical_range, data)
125 self._set_digital_range(digital_range)
126 self._set_data(data)
File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/edfio/edf_signal.py:430, in EdfSignal._set_physical_range(self, physical_range, data)
426 if data_min < physical_range.min or data_max > physical_range.max:
427 raise ValueError(
428 f\"Signal range [{data_min}, {data_max}] out of physical range: [{physical_range.min}, {physical_range.max}]\"
429 )
--> 430 self._physical_min = encode_float(
431 _round_float_to_8_characters(physical_range.min, math.floor)
432 )
433 self._physical_max = encode_float(
434 _round_float_to_8_characters(physical_range.max, math.ceil)
435 )
File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/edfio/_header_field.py:42, in encode_float(value)
40 if float(value).is_integer():
41 value = int(value)
---> 42 return encode_str(str(value), 8)
File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/edfio/_header_field.py:23, in encode_str(value, length)
21 def encode_str(value: str, length: int) -> bytes:
22 if len(value) > length:
---> 23 raise ValueError(
24 f\"{value!r} exceeds maximum field length: {len(value)} > {length}\"
25 )
26 if not value.isprintable():
27 raise ValueError(f\"{value} contains non-printable characters\")
ValueError: '-1714830000' exceeds maximum field length: 11 > 8"
}