file formats for digitization points in BIDS-MEG specification #20

jasmainak commented 5 years ago

There are two issues regarding file formats in the BIDS-MEG specification:

Currently, BIDS-MEG only "recommends" certain file formats in Appendix VI. This does not actually restrict the file formats, because it is neither a "must" nor a "should". Should this be turned into a "must" to be consistent with the validator?
The headshape file formats are supposed to be in the "specific format of the 3-D digitizer’s manufacturer", but this information is completely missing from Appendix VI. What are the file formats that we should support here? One suggestion is to start from the list here

Relevant discussion thread: https://github.com/bids-standard/bids-validator/pull/585

cc @chrisfilo @teonbrooks @sappelhoff @robertoostenveld @monkeyman192

Thoughts welcome!

chrisgorgo commented 5 years ago

+1 for specifying a limited number of well-described file formats. Allowing any file format makes it impossible to write software compatible with BIDS inputs.

sappelhoff commented 5 years ago

+1 for specifying a limited number of well-described file formats.

+1 but we need input from many people to get a representative list of well-described formats. I have questions about such formats out there on several google doc comments but so far - no responses :-)

robertoostenveld commented 5 years ago

[no solution here, just background info]

Many labs are combining the primary measurement device (the MEG system) with secondary measurement devices (such as the 3D digitizer) from another company. The number of hardware companies is still relatively limited, but the api of the (OEM) hardware is (reasonably) open, resulting in other companies making products that include hard+software, and hence a file format. Most formats I am aware of are ascii with some form of tabular content, but that is not sufficiently restrictive to implement a parser.

The issue not only applies to MEG headshapes, but also to EEG electrode positions.

chrisgorgo commented 5 years ago

Maybe it will be easier to come to a conclusion if we split the issue into two: raw data and headshapes.


teonbrooks commented 5 years ago

I agree. I think we can have a stronger assertion for the data format, and we should likely recommend the headshape.

jasmainak commented 5 years ago

So, is the consensus to have one format or many but limited? In either case, this would entail conversion on the end of the user -- potentially harmonizing the coordinate system and the units. I do not know of any tools that can currently convert from one headshape format to the other (unless @teonbrooks already knows something). That is not to say it cannot be created :)

robertoostenveld commented 5 years ago

I would say that headshapes recorded with the original MEG system manufacturer's software (i.e. limited set) should be allowed, whereas others (i.e. potentially unlimited set) should not be allowed. I only know the case for Elekta (stored in the fif file) and CTF (stored in a *.pos file).

I checked the present bids-examples and all seem to comply, except for https://github.com/bids-standard/bids-examples/blob/master/ds000117/sub-01/ses-meg/meg/sub-01_ses-meg_headshape.pos which is a pos file that accompanies an elekta fif dataset. It is not in CTF pos format, so I don't know which software created the file.

sappelhoff commented 5 years ago

I would say that headshapes recorded with the original MEG system manufacturer's software (i.e. limited set) should be allowed, whereas others (i.e. potentially unlimited set) should not be allowed.

+1 ... are the MEG system manufacturers' file formats documented well enough that other file formats can be reformatted to comply with these new BIDS requirements? If not, can we somehow put documentation out there?

On the same note but regarding EEG:

And then we do not accept headshape files for EEG? Currently, they are still in our BEP, see here.

for iEEG, headshape files will not be included, as @DoraHermes told me.

jasmainak commented 5 years ago

okay, so here is what I found on carefully reading the spec + associated links. Does this align with what people expect (feel free to edit or comment)?

To recall, the original reason we had this issue was that @monkeyman192 found .elp and .hsp files for KIT data. Is this only an issue with KIT systems or do we have such mismatches for other systems too? @teonbrooks you have some experience working with KIT systems. Could you comment?

teonbrooks commented 5 years ago

For this issue, I think we should create a separate issue and pr for hardware data format and lead the discussion there. @jasmainak, I would suggest that this issue be renamed to digitization data.

for .elp and .hsp, these are not native KIT format files as @robertoostenveld mentioned here. It is a format that has been supported in tandem with KIT in mne-c. KIT does not have a native format for headshape data; it requires third-party hardware vendors.

Oftentimes it relies on the Polhemus hardware for head scanning. For the KIT systems at NYU, the protocol is to export the data from Polhemus as txt files: one with the headshape points in the native device space (in MNE, it's referred to as headshape/hsp), and one with the coregistration electrode positions in native device space (in MNE, electrode/elp).

In the MNE pipeline, we have a head-centric analysis flow: the head space is the common space that everything is transformed into. That is, we generate a dev_head_t to take our MEG sensor space from its native device space to the participant's head space. This requires the head position indicator (hpi) points generated from the MEG device in the device space and the corresponding electrodes (elp) from the scanner in the native head space. The transformation is generated from there. For source transformation, we do a head_mri coregistration transformation. This requires the fiducials (fid) that are marked when scanning the head in the native head space and the fiducials marked on the MRI. We sometimes use the headshape points here as well to adjust the fit of the coregistration transformation.
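To make the dev_head_t idea concrete, here is a minimal sketch of applying a 4x4 homogeneous transform to device-space points. This is not MNE's implementation: the helper name `apply_trans` and the matrix values are made up for illustration, and the units are assumed to be meters.

```python
def apply_trans(trans, points):
    """Apply a 4x4 homogeneous transform to a list of (x, y, z) points."""
    out = []
    for x, y, z in points:
        vec = (x, y, z, 1.0)  # homogeneous coordinates
        out.append(tuple(
            sum(trans[row][col] * vec[col] for col in range(4))
            for row in range(3)  # the last row only carries the constant 1
        ))
    return out


# Hypothetical dev_head_t: 90-degree rotation about z plus a 4 cm shift along z.
dev_head_t = [
    [0.0, -1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.04],
    [0.0, 0.0, 0.0, 1.0],
]

# Made-up hpi points in device space, mapped into head space.
hpi_device = [(0.1, 0.0, 0.0), (0.0, 0.1, 0.0)]
hpi_head = apply_trans(dev_head_t, hpi_device)
```

Estimating the transform itself from matched hpi/elp points is a separate fitting step that this sketch does not cover.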

For the MNE pipeline to work from sensor to source space on non-Neuromag (FIF) systems, we need to include external files for the following:
- native head space points: the headshape points, the electrode points, the fiducial points
- native device points: the hpi points
- native MRI points: the fiducials

We need to decide how best to support pipelines where the data files are not bundled as a monolithic file. This is tough because it requires knowing 1) whether the point file is in its native space or another space and 2) what unit it is in (mm vs m). I think this could be handled in a sidecar file, but we need some limit on the file formats themselves. Are they all tsv or csv? We can't support every niche format.

jasmainak commented 5 years ago

For this issue, I think we should create a separate issue and pr for hardware data format and lead the discussion there. @jasmainak, I would suggest that this issue be renamed to digitization data.

okay done

We need to decide how to best support pipelines where the data files are not bundled as a monolithic file. this is tough because this requires knowing 1) if the point files is in its native space or another space 2) what unit is in (mm vs m). I think this could be handled in a sidecar file but we need some limit on the file formats themselves.

@teonbrooks this seems to be already specified by _coordsystems.json in the current specification.
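For illustration, a reader could use such a sidecar to harmonize units roughly as follows. This is only a sketch: the field names are modeled on the digitized-head-points entries of the coordinate system sidecar and should be checked against the current specification, and the helper name and sample values are hypothetical.

```python
import json

# Conversion factors to meters for the units assumed to be allowed.
UNIT_TO_METERS = {"m": 1.0, "cm": 0.01, "mm": 0.001}


def points_in_meters(points, sidecar):
    """Scale (x, y, z) points to meters using the units named in the sidecar."""
    units = sidecar["DigitizedHeadPointsCoordinateUnits"]
    if units not in UNIT_TO_METERS:
        raise ValueError(f"unsupported units: {units!r}")
    scale = UNIT_TO_METERS[units]
    return [tuple(coord * scale for coord in point) for point in points]


# Hypothetical sidecar content; a real one would be read from the JSON file.
sidecar = json.loads(
    '{"DigitizedHeadPointsCoordinateSystem": "CTF",'
    ' "DigitizedHeadPointsCoordinateUnits": "mm"}'
)
points_m = points_in_meters([(10.0, 0.0, 85.0)], sidecar)
```

The coordinate system entry is carried along but not used here; converting between coordinate systems would need the landmark positions as well.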

ftadel commented 5 years ago

@robertoostenveld I don't know what generated the .pos from Wakeman/Henson Elekta dataset. I asked them.

On the Brainstorm side, we use the format that the Polhemus software used to generate, with the same .pos extension. Example: https://github.com/bids-standard/bids-examples/blob/master/ds000246/sub-0001/meg/sub-0001_headshape.pos

This file format allows the distinction between named digitized points (electrodes), unnamed digitized points (head shape) and other typical landmarks (anatomical landmarks, HPI coils, with multiple repetitions for higher precision).

This is the file format used by the Brainstorm digitizer (used by some centers as a standalone Polhemus driver - no need to use Brainstorm for the analysis):
https://neuroimage.usc.edu/brainstorm/Tutorials/TutDigitize

On a standard CTF study (at the MNI and possibly in some other places), these .pos files are placed in the .ds folder and read as part of the CTF dataset. They include both the anatomical landmarks (Nasion,Left,Right) and the head positioning coils (NAS,LPA,RPA) so that we can mark the anatomical landmarks instead of the coils in the MRI for the registration.

ftadel commented 5 years ago

About the Wakeman/Henson dataset (https://github.com/bids-standard/bids-examples/blob/master/ds000117/sub-01/ses-meg/meg/sub-01_ses-meg_headshape.pos), here is Rik's answer:

It's a long time ago, but I'm fairly sure I read the Neuromag .fif file into Matlab (using SPM/fieldtrip) and the head-position points were automatically read from the header - so I just saved them as text files from Matlab. (They were originally created by a Polhemus digitizer, which is linked to the Neuromag acquisition software, so they end up in the .fif file of each recording).

The .pos files from ds000117 come from in-house code, not from proper academic or commercial software. Therefore, you can ignore the file extension.

jasmainak commented 5 years ago

@robertoostenveld @ftadel what would be great is to have a table which specifies for each manufacturer what is the preferred file format for storing the digitized points (named and unnamed) along with a link to the specification of this file format.

ftadel commented 5 years ago

MEG-specific:

Brainstorm digitizer / CTF-Polhemus digitizer (can be used independently from a CTF MEG system):
- .pos extension, possibly within the CTF .ds folder
- tab-separated file, 1 line header with a number I never knew what it was used for + N lines with the following format: [index\t] [name] \t X \t Y \t Z
- Example: CTF-Brainstorm.pos.txt
KRISS MEG Digitizer.txt: Digitizer.txt
Elekta/Neuromag: The positions are embedded in the binary .fif file, no standard format for ascii exports
4D: The positions are embedded in binary files, no standard format for ascii exports
Yokogawa/Ricoh: The positions are embedded in the binary .con/.ave/.sqd files, no standard format for ascii exports
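As a rough illustration of the .pos layout described above (one header line, then lines with an optional index, an optional name, and X, Y, Z), a lenient reader could look like the sketch below. It is an assumption-laden example, not the Brainstorm or CTF reader: `parse_pos` is a hypothetical helper, it splits on any whitespace rather than strictly on tabs, it assumes names contain no spaces, and the sample values are made up.

```python
def parse_pos(text):
    """Return (header, points) where points are (label, x, y, z) tuples."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty .pos content")
    header = lines[0].strip()  # a single number whose meaning is not documented here
    points = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"expected at least X Y Z, got: {line!r}")
        # The last three fields are the coordinates; anything before them
        # (index and/or name) is kept together as the label.
        *label, x, y, z = fields
        points.append((" ".join(label), float(x), float(y), float(z)))
    return header, points


# Made-up sample: an indexed point, a named landmark, and an indexed named point.
sample = "3\n1\t0.1\t0.2\t0.3\nNasion\t9.5\t0.0\t0.0\n2\tLPA\t0.0\t7.1\t0.0\n"
header, points = parse_pos(sample)
```

The sketch deliberately does not interpret units or coordinate systems, since the format description above does not state them.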

General purpose digitizers that may be used with MEG:

ANT Xensor .elc: XensorTest.elc.txt
ANT EETrak .elc: eetrak124.elc.txt
EGI/BESA .sfp: GSN256.sfp.txt
Polhemus .elp: electrocap72.elp.txt
Zebris .sfp: Zebris.sfp.txt

There are other formats used by other popular EEG manufacturers, but maybe we don't want to discuss this here.

yarikoptic commented 4 years ago

FWIW, I dislike seeing a loose ".txt" in the filenames with data. And it feels from the previous description that .txt was added to the examples solely to conform to the IMHO "hole" in BIDS allowing for an arbitrary _digitizer.txt file? Or is that how the original software stores it? I would have sided with @chrisgorgo from 2 years ago (above) and restricted this to a set of formats/extensions, or even better, gone for just one. What is the benefit of a standard if there is no standard form(at)? IMHO ideally one format, the most expressive/supported in tools, should be chosen and then raw files converted into it (as is done e.g. for neuroimaging data itself: from .par/.rec or DICOMs or whatever -> .nii.gz). Support tools could convert into other formats if so desired, or better, tools could just start supporting it directly.
