mne-tools / mne-bids

MNE-BIDS is a Python package that allows you to read and write BIDS-compatible datasets with the help of MNE-Python.
https://mne.tools/mne-bids/
BSD 3-Clause "New" or "Revised" License

mne_bids.path.BIDSPath.fpath ignores value of 'check' #1123

Open kaare-mikkelsen opened 1 year ago

kaare-mikkelsen commented 1 year ago

Description of the problem

I am structuring my derivative folders according to the BIDS layout and using mne-bids to locate files in them, even though the data types are no longer standard BIDS types.

Steps to reproduce

For example, this correctly identifies 4 files for me:

hypnoFiles = mb.find_matching_paths(root=hypnoDerivPath, extensions='.npz', subjects=subject)

However, accessing fpath on any of these files raises an error, even when check=False:

hypnoData = []
for hypnoFile in hypnoFiles:
    hypnoFile.update(check=False)
    print(hypnoFile.fpath)

Expected results

I expect hypnoFile.fpath to print the location of the file when check=False.

Actual results

This is the error:

ValueError                                Traceback (most recent call last)
c:\Users\au207178\OneDrive - Aarhus universitet\forskning\sleepInOrbit\pipeline1.py in line 15
     162 for hypnoFile in hypnoFiles:
     163     hypnoFile.update(check=False)
---> 164     print(hypnoFile.fpath)
     166 #extract sleep features
     167 sleepFeatures=pd.DataFrame()

File c:\Users\au207178\Anaconda3\envs\cuda11\lib\site-packages\mne_bids\path.py:600, in BIDSPath.fpath(self)
    592 else:
    593     # if suffix and/or extension is missing, and root is
    594     # not None, then BIDSPath will infer the dataset
    595     # else, return the relative path with the basename
    596     if (self.suffix is None or self.extension is None) and \
    597             self.root is not None:
    598         # get matching BIDSPaths inside the bids root
    599         matching_paths = \
--> 600             _get_matching_bidspaths_from_filesystem(self)
    602         # FIXME This will break
    603         # FIXME e.g. with FIFF data split across multiple files.
    604         # if extension is not specified and no unique file path
    605         # return filepath of the actual dataset for MEG/EEG/iEEG data
    606         if self.suffix is None or self.suffix in ALLOWED_DATATYPES:
    607             # now only use valid datatype extension
...
   1885     msg = (f'Found data of more than one recording datatype. Please '
   1886            f'pass the `suffix` parameter to specify which data to load. '
   1887            f'Found the following modalitiess: {modalities}')

ValueError: No electrophysiological data found.

Additional information

However, all information for locating the file is clearly available to fpath:

print(os.path.join(hypnoFile.directory,hypnoFile.basename+hypnoFile.extension))

This works fine as a workaround.
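The workaround above can be wrapped in a small helper. This is only a minimal sketch: `build_file_path` is a hypothetical name and not part of mne-bids, and it assumes that the directory, basename and extension of the BIDSPath are already populated. The guard against appending the extension twice is a precaution on my side, since I have not checked whether basename can already include it.

```python
from pathlib import Path


def build_file_path(directory, basename, extension=None):
    """Join directory, basename and extension without querying the filesystem.

    Hypothetical helper for illustration; it is not an mne-bids function.
    """
    filename = str(basename)
    # Append the extension only if the basename does not already end with it.
    if extension and not filename.endswith(extension):
        filename += extension
    return Path(directory) / filename


# With a BIDSPath this would be called as (assumption, untested here):
#   build_file_path(hypnoFile.directory, hypnoFile.basename, hypnoFile.extension)
example = build_file_path(
    "dummyBidsSet/sub-001/ses-001",
    "sub-001_ses-001_task-sleep_desc-hypnogram",
    ".npz",
)
print(example.name)
```

Because the helper only joins strings, it never performs the filesystem lookup that raises the ValueError in fpath.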

agramfort commented 1 year ago

how can we replicate this?

kaare-mikkelsen commented 1 year ago

Make a dummy dataset with a single file of a non-BIDS compliant type:

root='dummyBidsSet'

Filetree: \sub-001\ses-001\sub-001_ses-001_task-sleep_desc-hypnogram.npz

Run:

import mne_bids as mb
oneFile=mb.find_matching_paths(root='pathToDummyBidsSet')
oneFile[0].update(check=False)
print(oneFile)
oneFile[0].fpath

This returns the following (for me):

[BIDSPath(
root: C:/Users/au207178/OneDrive - Aarhus universitet/forskning/sleepInOrbit/dummyBidsSet
datatype: None
basename: sub-001_ses-001_task-sleep_desc-hypnogram)]

ValueError                                Traceback (most recent call last)
c:\Users\au207178\OneDrive - Aarhus universitet\forskning\sleepInOrbit\pipeline1.py in line 7
     287 oneFile[0].update(check=False)
     288 print(oneFile)
---> 289 oneFile[0].fpath

File c:\Users\au207178\Anaconda3\envs\cuda11\lib\site-packages\mne_bids\path.py:600, in BIDSPath.fpath(self)
    592 else:
    593     # if suffix and/or extension is missing, and root is
    594     # not None, then BIDSPath will infer the dataset
    595     # else, return the relative path with the basename
    596     if (self.suffix is None or self.extension is None) and \
    597             self.root is not None:
    598         # get matching BIDSPaths inside the bids root
    599         matching_paths = \
--> 600             _get_matching_bidspaths_from_filesystem(self)
    602         # FIXME This will break
    603         # FIXME e.g. with FIFF data split across multiple files.
    604         # if extension is not specified and no unique file path
    605         # return filepath of the actual dataset for MEG/EEG/iEEG data
    606         if self.suffix is None or self.suffix in ALLOWED_DATATYPES:
    607             # now only use valid datatype extension

File c:\Users\au207178\Anaconda3\envs\cuda11\lib\site-packages\mne_bids\path.py:1059, in _get_matching_bidspaths_from_filesystem(bids_path)
...
   1885     msg = (f'Found data of more than one recording datatype. Please '
   1886            f'pass the `suffix` parameter to specify which data to load. '
   1887            f'Found the following modalitiess: {modalities}')

ValueError: No electrophysiological data found.

However,

print(os.path.join(oneFile[0].directory,oneFile[0].basename+oneFile[0].extension))

describes the full path to the file in question: (windowsbit)\dummyBidsSet\sub-001\ses-001\sub-001_ses-001_task-sleep_desc-hypnogram.npz

agramfort commented 1 year ago

I can replicate with this self-contained code snippet:

from pathlib import Path
import numpy as np
import mne_bids

A = np.random.rand(5,5)
fname = Path("./bids_data/sub-001/ses-001/sub-001_ses-001_task-sleep_desc-hypnogram.npz")
fname.parent.mkdir(parents=True, exist_ok=True)
np.savez_compressed(fname, A=A)

root = './bids_data'
bp = mne_bids.find_matching_paths(root=root)[0]
bp.update(check=False)
print(bp.basename + bp.extension)

bp.fpath  # Crash for no good reason

I agree this should work