mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License

Add warning if data have incorrect or unknown units #12964

Open bootstrapbill opened 3 days ago

bootstrapbill commented 3 days ago

Describe the new feature or enhancement

I'm wondering if it might be possible to issue a warning when data are loaded if they seem to have incorrect or unknown units? I'm asking because I've just run into a problem caused by loading data which were in uV, but MNE defaulted to assuming volts due to a lack of unit info in the edf file.

Specifically, when loading and then re-exporting some edf files (as part of a BIDS re-structuring) I got the following error:

ValueError: '-1714830000' exceeds maximum field length: 11 > 8

This is due to the incorrect scaling causing edfio to throw an error when trying to write the file (when the min and max of the data were calculated the resulting values were longer than the 8 character limit).
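To make the failure concrete, here is a minimal sketch of the field-length check, modeled on the `encode_float` and `encode_str` frames in the traceback further down. It is not edfio's actual code: the helper names `edf_field_text` and `fits_edf_field` are hypothetical, the rounding step that edfio applies first (`_round_float_to_8_characters`) is left out, and the sample value of -1714.83 uV is an assumption back-computed from the error message.

```python
EDF_FIELD_LENGTH = 8  # width of the physical min/max header fields, per the error message


def edf_field_text(value: float) -> str:
    """Render a number roughly the way encode_float does in the traceback."""
    if float(value).is_integer():
        value = int(value)
    return str(value)


def fits_edf_field(value: float, length: int = EDF_FIELD_LENGTH) -> bool:
    """Hypothetical helper: True if the rendered value fits the header field."""
    return len(edf_field_text(value)) <= length


# Assumed illustration: a sample that was really -1714.83 uV in the file.
true_value_uv = -1714.83

# With no unit in the header the number is read as volts, and the export
# then converts volts to uV, multiplying by 1e6.
exported_value = round(true_value_uv * 1e6)

print(edf_field_text(true_value_uv), fits_edf_field(true_value_uv))
print(edf_field_text(exported_value), fits_edf_field(exported_value))
```

The correctly scaled value needs exactly 8 characters, while the mis-scaled one needs 11, which matches the `11 > 8` in the error.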

Describe your proposed implementation

I guess there are a couple of options:

1) In relation to my specific case, just before handing things off to edfio the min and max of the data could be calculated and an error could be thrown if these exceed 8 characters, with an explicit recommendation to double check the scaling. Obviously edfio goes on to throw an error but it's not particularly informative, so it took me a while to figure out the underlying cause.

2) A more general option would be to try to infer when loading an edf file if the data seem to be off by a relevant factor (i.e. 1e6) and throw a warning?

3) Perhaps the simplest option would be just to give a warning if the edf file does not contain unit information and MNE has had to default to volts, then people can just go double-check whether this was appropriate. Or more strictly, you could make it impossible to load an edf without supplying unit information if none is present in the edf file.
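The three options above could look roughly like the following. This is only a sketch to make the discussion concrete, not MNE's or edfio's code: every function name, the unit table, and the 10 mV plausibility threshold are assumptions chosen for illustration.

```python
import math
import statistics
import warnings

# Hypothetical unit table; real EDF headers may use other spellings.
UNIT_TO_VOLTS = {"V": 1.0, "mV": 1e-3, "uV": 1e-6, "µV": 1e-6}

# Assumed threshold: a typical EEG amplitude above 10 mV is suspicious.
SUSPICIOUS_VOLTS = 1e-2


def check_export_range(data_min, data_max, length=8):
    """Option 1: fail early, with a hint, if a range value cannot fit the field."""
    for value in (data_min, data_max):
        # Rounding cannot help once the integer part alone is too long.
        if len(str(math.trunc(value))) > length:
            raise ValueError(
                f"Physical range value {value!r} needs more than {length} "
                "characters; double check the channel scaling (uV vs. V)."
            )


def warn_if_implausible(samples_volts, threshold=SUSPICIOUS_VOLTS):
    """Option 2: warn when typical amplitudes look far too large for volts."""
    typical = statistics.median(abs(x) for x in samples_volts)
    if typical > threshold:
        warnings.warn(
            f"Median amplitude is {typical:g} V; the data may be off by a "
            "factor such as 1e6.",
            RuntimeWarning,
        )
        return True
    return False


def scale_for_unit(unit):
    """Option 3: warn when the unit is missing or unknown, then assume volts."""
    unit = unit.strip()
    if unit not in UNIT_TO_VOLTS:
        warnings.warn(
            f"Unit {unit!r} is missing or unknown; assuming volts. "
            "Check the scaling or specify the unit explicitly.",
            RuntimeWarning,
        )
        return 1.0
    return UNIT_TO_VOLTS[unit]
```

The stricter variant of option 3 would raise an error in `scale_for_unit` instead of warning. The heuristic in option 2 is the most fragile of the three, since a sensible threshold depends on the channel type.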

Describe possible alternatives

N/A

Additional context

Full traceback in my specific case:

{
    "name": "ValueError",
    "message": "'-1714830000' exceeds maximum field length: 11 > 8",
    "stack": "---------------------------------------------------------------------------

File <decorator-gen-250>:12, in export(self, fname, fmt, physical_range, add_ch_type, overwrite, verbose)

File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/mne/io/base.py:1784, in BaseRaw.export(self, fname, fmt, physical_range, add_ch_type, overwrite, verbose)
   1757 \"\"\"Export Raw to external formats.
   1758 
   1759 %(export_fmt_support_raw)s
   (...)
   1780 %(export_edf_note)s
   1781 \"\"\"
   1782 from ..export import export_raw
-> 1784 export_raw(
   1785     fname,
   1786     self,
   1787     fmt,
   1788     physical_range=physical_range,
   1789     add_ch_type=add_ch_type,
   1790     overwrite=overwrite,
   1791     verbose=verbose,
   1792 )

File <decorator-gen-458>:12, in export_raw(fname, raw, fmt, physical_range, add_ch_type, overwrite, verbose)

File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/mne/export/_export.py:75, in export_raw(fname, raw, fmt, physical_range, add_ch_type, overwrite, verbose)
     72 elif fmt == \"edf\":
     73     from ._edf import _export_raw
---> 75     _export_raw(fname, raw, physical_range, add_ch_type)
     76 elif fmt == \"brainvision\":
     77     from ._brainvision import _export_raw

File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/mne/export/_edf.py:144, in _export_raw(fname, raw, physical_range, add_ch_type)
    140         pmax = ch_types_phys_max[ch_type]
    141         prange = pmin, pmax
    143     signals.append(
--> 144         EdfSignal(
    145             data[idx],
    146             out_sfreq,
    147             label=signal_label,
    148             transducer_type=\"\",
    149             physical_dimension=\"\" if ch_type == \"stim\" else \"uV\",
    150             physical_range=prange,
    151             digital_range=(digital_min, digital_max),
    152             prefiltering=filter_str_info,
    153         )
    154     )
    156 # set patient info
    157 subj_info = raw.info.get(\"subject_info\")

File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/edfio/edf_signal.py:124, in EdfSignal.__init__(self, data, sampling_frequency, label, transducer_type, physical_dimension, physical_range, digital_range, prefiltering)
    122 if not np.all(np.isfinite(data)):
    123     raise ValueError(\"Signal data must contain only finite values\")
--> 124 self._set_physical_range(physical_range, data)
    125 self._set_digital_range(digital_range)
    126 self._set_data(data)

File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/edfio/edf_signal.py:430, in EdfSignal._set_physical_range(self, physical_range, data)
    426     if data_min < physical_range.min or data_max > physical_range.max:
    427         raise ValueError(
    428             f\"Signal range [{data_min}, {data_max}] out of physical range: [{physical_range.min}, {physical_range.max}]\"
    429         )
--> 430 self._physical_min = encode_float(
    431     _round_float_to_8_characters(physical_range.min, math.floor)
    432 )
    433 self._physical_max = encode_float(
    434     _round_float_to_8_characters(physical_range.max, math.ceil)
    435 )

File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/edfio/_header_field.py:42, in encode_float(value)
     40 if float(value).is_integer():
     41     value = int(value)
---> 42 return encode_str(str(value), 8)

File ~/opt/anaconda3/envs/mne/lib/python3.11/site-packages/edfio/_header_field.py:23, in encode_str(value, length)
     21 def encode_str(value: str, length: int) -> bytes:
     22     if len(value) > length:
---> 23         raise ValueError(
     24             f\"{value!r} exceeds maximum field length: {len(value)} > {length}\"
     25         )
     26     if not value.isprintable():
     27         raise ValueError(f\"{value} contains non-printable characters\")

ValueError: '-1714830000' exceeds maximum field length: 11 > 8"
}

agramfort commented 3 days ago

we could add warnings for 2 and 3 and strongly suggest specifying the unit to avoid guessing.
