mne-tools / mne-bids

MNE-BIDS is a Python package that allows you to read and write BIDS-compatible datasets with the help of MNE-Python.
https://mne.tools/mne-bids/
BSD 3-Clause "New" or "Revised" License
132 stars 87 forks source link

write_raw_bids() accepts (and silently ignores / alters) unsuitable extension #1041

Open hoechenberger opened 2 years ago

hoechenberger commented 2 years ago

When trying to write EEG data to a FIFF file, write_raw_bids() happily accepts a respective BIDSPath, but actually writes the data to a BV file.

It should at least warn, or even better, raise an exception.

MWE:

# %%
from pathlib import Path
import mne
import mne_bids

# Load dat
ssvep_folder = mne.datasets.ssvep.data_path()
ssvep_data_raw_path = (ssvep_folder / 'sub-02' / 'ses-01' / 'eeg' /
                       'sub-02_ses-01_task-ssvep_eeg.vhdr')
ssvep_raw = mne.io.read_raw_brainvision(ssvep_data_raw_path, verbose=False)

# Create a BIDS dataset
root = Path('/tmp/bids-test')
bp = mne_bids.BIDSPath(
    subject='02',
    session='01',
    task='ssvep',
    suffix='meg',
    extension='.fif',
    datatype='eeg',
    root=root
)
bp_written = mne_bids.write_raw_bids(
    raw=ssvep_raw,
    bids_path=bp,
    overwrite=True,
)
print(repr(bp_written))

produces:

BIDSPath(
root: /tmp/bids-test
datatype: eeg
basename: sub-02_ses-01_task-ssvep_eeg.vhdr)
sappelhoff commented 2 years ago

agreed, this looks like a bug to me. I think it should "autoconvert" to FIF ... or raise an exception demanding format to be explicitly set to FIF.

agramfort commented 2 years ago

what behavior would you expect? I would say it should just write to .fif and not warn.

Message ID: @.***>

hoechenberger commented 2 years ago

I maybe didn't explain it well.

The input I have is FIFF.

The data type is eeg.

BIDS mandates that EEG data be stored in BrainVision format; FIFF is not allowed.

MNE-BIDS silently converts my data from FIFF to BV, even though the target BIDSPath I supply explicitly states an extension of .fif

So in my perception, MNE-BIDS simply overrules what I explicitly asked it to do (write the data to FIFF), in order to conform to BIDS.

Instead, it should raise an exception, stating that EEG data cannot be written to FIFF as it violates the standard.

agramfort commented 2 years ago

let me give you the following use case. I have combined MEG/EEG data and I just want to analyse the EEG data. Take ds117 for example. Original data is in FIF and I don't see why derivatives eg filtered raw data after picking the EEG channels shouldn't stay in FIF. Does it make sense?

Message ID: @.***>

hoechenberger commented 2 years ago

Yes I totally agree, and I would say we should amend the BIDS standard to allow for FIFF for EEG data as well. But it's currently not the case.

As for derivatives, that further complicates things, yes!!

But currently it is how it is: BIDS doesn't allow FIFF for EEG, at least not for raw data. Unless I'm missing something in the spec -- @sappelhoff?

sappelhoff commented 2 years ago

Original data is in FIF and I don't see why derivatives eg filtered raw data after picking the EEG channels shouldn't stay in FIF. Does it make sense?

agreed. And I think this is in line with the spec --> EEG data that was recorded together with MEG data can be stored under an MEG datatype.

If you want to completely rip out the EEG data from the MEG format and only deal with EEG, then I think it's fair to require that you use BIDS conventions for EEG and convert your data and then continue.

agramfort commented 2 years ago

I would say we keep the file as MEG datatype and store things in FIF for now.

But I see the problem we are facing with the pipeline that stores all derivatives in FIF...

Message ID: @.***>

hoechenberger commented 2 years ago

I've been trying to address this in #1053 but I ran into an issue:

It appears the format kwarg controls the output format. But this then gets confusing as the user can also specify a filename extension in the provided BIDSPath. In this case, write_raw_bids() will, by default (format='auto') simply replace any provided extension by the one it sees fit for the data format to be written. I think this is very confusing behavior.

My proposal would be to get rid of the format parameter of write_raw_bids() altogether and only rely on the provided extension of the BIDSPath.

agramfort commented 2 years ago

can we make format auto default to use the extension in bidspath and if not provided default to some default of datatype?

Message ID: @.***>

hoechenberger commented 2 years ago

Maybe that would be a good idea. Currently it infers the format based in the data, for example if there are MEG channels it will pick meg