fieldtrip / fieldtrip

The MATLAB toolbox for MEG, EEG and iEEG analysis
http://www.fieldtriptoolbox.org
GNU General Public License v3.0

Problem with importing MEG dataset ds002001 with File-IO #2354

Closed by arnodelorme 9 months ago

arnodelorme commented 10 months ago

Datasets imported from ds002001 (https://nemar.org/dataexplorer/detail?dataset_id=ds002001) come back empty. These are the messages that are returned:

found matching BIDS sidecar '/expanse/projects/nemar/openneuro/ds002001/sub-0001/ses-20140502/meg/sub-0001_ses-20140502_task-rivalry_run-03_meg.json'
reading '/expanse/projects/nemar/openneuro/ds002001/sub-0001/ses-20140502/meg/sub-0001_ses-20140502_task-rivalry_run-03_meg.json'

readCTFds: Data set error : size of meg4 file(s)
        1362551304 bytes (from dir command)
        1168128008 bytes (from res4 file)

undoing the G3BR balancing for the gradiometer definition
Warning: copying input chantype to montage 

Warning: copying input chanunit to montage 

getCTFdata: File sub-0001_ses-20140502_task-rivalry_run-03_meg.meg4 does not contain a whole number of trials.
            File size (bytes)=1362551304    header=8 bytes   trial size (bytes)=194688000

Can you confirm that the datasets are corrupted? The ctfimport plugin of EEGLAB can import data from some of these datasets.

robertoostenveld commented 10 months ago

The meg4 file holds the actual channel-level data and consists of a small header followed by blocks of data (each block being nchan*nsamples, stored channel-wise rather than multiplexed). The warning message suggests that the last block is not complete, which can happen if acquisition is aborted by a software crash or if a data transfer is incomplete.
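To make the channel-wise layout concrete, here is a minimal sketch of where a given channel would start inside a given block. It assumes the 8-byte header reported in the log and 4-byte samples. The function name `channel_offset` and the toy dimensions in the usage note are made up for illustration and are not part of FieldTrip or the CTF code.

```python
def channel_offset(block, channel, n_channels, n_samples,
                   header_bytes=8, bytes_per_sample=4):
    """Byte offset of the first sample of `channel` in `block` (both 0-based).

    Assumes each block stores all samples of channel 0, then all samples of
    channel 1, and so on (channel-wise, not multiplexed).
    """
    block_bytes = n_channels * n_samples * bytes_per_sample
    channel_bytes = n_samples * bytes_per_sample
    return header_bytes + block * block_bytes + channel * channel_bytes
```

With a toy file of 2 channels and 3 samples per block, `channel_offset(1, 0, 2, 3)` points just past the header and one full 24-byte block.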

In any case, it should technically be possible to load all blocks except the last (incomplete) one, or to fix the file by cutting off the last incomplete block. But it might be that the software does not deal with it properly. I am downloading the data now to give it a try.

Please note that sub-0001_ses-20140502_scans.tsv is incomplete; you may want to fix that.

robertoostenveld commented 10 months ago

Looking at what the code reports in your initial issue: the file is 1362551304 bytes, 8 of which are for the header. Each block is 194688000 bytes. The data has 338 channels, a 1200 Hz sampling rate, and each sample is 4 bytes. One block is therefore 194688000/(4*338*1200)=120 seconds. The file contains (1362551304-8)/194688000=6.9986 blocks, so the last block is indeed not complete.
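The same arithmetic can be checked in a few lines. This is only a sketch that restates the numbers from the log messages; the variable names are chosen here for readability.

```python
# Values taken from the readCTFds/getCTFdata messages in this issue.
FILE_BYTES = 1362551304
HEADER_BYTES = 8
BLOCK_BYTES = 194688000
N_CHANNELS = 338
SAMPLE_RATE_HZ = 1200
BYTES_PER_SAMPLE = 4

# Duration of one block: bytes per block divided by bytes per second.
block_seconds = BLOCK_BYTES / (BYTES_PER_SAMPLE * N_CHANNELS * SAMPLE_RATE_HZ)

# Whole blocks that fit after the header, and the bytes left over.
complete_blocks, leftover_bytes = divmod(FILE_BYTES - HEADER_BYTES, BLOCK_BYTES)

print(block_seconds, complete_blocks, leftover_bytes)
```

A non-zero `leftover_bytes` is what triggers the "does not contain a whole number of trials" message.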

You could trim the file to 8+6*194688000=1168128008 bytes, for example using dd on Linux or macOS. This matches the meg4 file size recorded in the res4 binary header file.
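If dd is not at hand, the trimming could also be sketched in Python. This is a hypothetical helper, not something FieldTrip or the CTF tools provide: `trimmed_size` and `trim_copy` are names invented here, and the header and block sizes must be supplied by the caller (8 and 194688000 bytes for this file). It writes a trimmed copy and leaves the original file untouched.

```python
import os


def trimmed_size(file_bytes, header_bytes, block_bytes):
    """Size of the file once any trailing incomplete block is dropped."""
    if file_bytes < header_bytes:
        raise ValueError("file is smaller than its header")
    complete_blocks = (file_bytes - header_bytes) // block_bytes
    return header_bytes + complete_blocks * block_bytes


def trim_copy(src, dst, header_bytes, block_bytes, chunk_bytes=1 << 20):
    """Copy the header and all complete blocks of `src` into `dst`."""
    target = trimmed_size(os.path.getsize(src), header_bytes, block_bytes)
    remaining = target
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while remaining > 0:
            data = fin.read(min(chunk_bytes, remaining))
            if not data:  # unexpected end of file
                break
            fout.write(data)
            remaining -= len(data)
    return target
```

For the file in this issue, `trimmed_size(1362551304, 8, 194688000)` gives 1168128008, the size the res4 file reports.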

robertoostenveld commented 10 months ago

I can confirm that the reading fails.

dat = ft_read_data('sub-0001_ses-20140502_task-rivalry_run-03_meg.ds')
getCTFdata: File sub-0001_ses-20140502_task-rivalry_run-03_meg.meg4 does not contain a whole number of trials.
            File size (bytes)=1362551304    header=8 bytes   trial size (bytes)=194688000

dat =
     []

Note that the lack of support is not in the FieldTrip code, but in the underlying CTF code.

After I do the following:

cd sub-0001_ses-20140502_task-rivalry_run-03_meg.ds/
rm sub-0001_ses-20140502_task-rivalry_run-03_meg.meg4
dd if=../../../../.git/annex/objects/Q4/13/MD5E-s1362551304--a428cab08f000fe14e8cbcb1e0d759ff.meg4/MD5E-s1362551304--a428cab08f000fe14e8cbcb1e0d759ff.meg4 of=sub-0001_ses-20140502_task-rivalry_run-03_meg.meg4 bs=1 count=1168128008

the file reads fine. So trimming it to the expected size according to the res4 file solves it.

I hope this helps, Robert

arnodelorme commented 9 months ago

Thank you Robert, it does help. I will reference this bug with the dataset.