A BIDS-compliant, scalable (i.e., HPC-ready), Python-based pipeline for processing EEG data in a computationally reproducible framework (leveraging containerized computing using Docker or Singularity).
The PEPPER-Pipeline tools build on MNE-Python and the SciPy stack. Some tools are convenient wrappers for existing code, whereas others implement novel data processing steps. Note that the purpose of the PEPPER-Pipeline is not to reinvent or reimplement the algorithms already provided by MNE-Python. Instead, the "added value" of the PEPPER-Pipeline is in providing a user-friendly pipeline for EEG preprocessing, geared towards developmental EEG researchers and compatible with BIDS, containerization (Docker and Singularity are both supported), and HPC usage. Three methods for working with the pipeline are provided: 1) a Singularity image for running on HPCs, 2) a Docker image for running locally, and 3) a Conda environment for the dev toolkit.
To facilitate community development and distributed contributions to the PEPPER-Pipeline, development leverages automatic linting of all code (enforcing the PEP8 standard). Moreover, a growing test suite is available for performing unit tests on all features, and the pipeline is structured in a modular way to allow independent modification of specific pipeline steps/features without needing to modify the main run.py script or other functions.
The PEPPER-Pipeline project is a fully-open, community-driven project. We welcome contributions from any/all researchers and data/computer scientists, at all levels. We strive to make all decisions "out in the open" and track all contributions rigorously via git, to facilitate proper recognition and authorship. We hold a weekly meeting that all are welcome to attend, and recordings of prior meetings are all available for others to view. Please join us in moving this project forward, creating a fully-open, scalable, and reproducible EEG pipeline that all can use.
Development guidelines and details are listed in CONTRIBUTING.md.
This project comes with a default user_params.json file that controls data selection, the order of pipeline steps, and their respective parameters. To select data and edit parameters, directly edit the fields of user_params.json.
{
  "load_data": {
    "root": "CMI/rawdata",
    "subjects": ["*"],
    "tasks": ["*"],
    "exceptions": {
      "subjects": "",
      "tasks": "",
      "runs": ""
    },
    "channel-type": "eeg"
  },
  "preprocess": {
    "filter_data": {
      "l_freq": 0.3,
      "h_freq": 40
    },
    "identify_badchans_raw": {},
    "ica_raw": {
      "montage": "GSN-HydroCel-129"
    },
    "segment_data": {
      "tmin": -0.2,
      "tmax": 0.5,
      "baseline": null,
      "picks": null,
      "reject_tmin": null,
      "reject_tmax": null,
      "decim": 1,
      "verbose": false,
      "preload": true
    },
    "final_reject_epoch": {},
    "interpolate_data": {
      "mode": "accurate",
      "method": null,
      "reset_bads": true
    },
    "reref_raw": {}
  },
  "output_data": {
    "root": "CMI"
  }
}
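As a sketch of how a script might consume this file (illustrative only; the parsing code below is an assumption, not the pipeline's actual run.py), the "preprocess" section can be parsed and its steps iterated in the order they appear:

```python
import json

# Illustrative only: parse a user_params.json-style document and walk
# the "preprocess" steps in file order (json.loads preserves key order
# in Python 3.7+, so step order follows the file).
raw_json = """
{
  "preprocess": {
    "filter_data": {"l_freq": 0.3, "h_freq": 40},
    "segment_data": {"tmin": -0.2, "tmax": 0.5}
  }
}
"""
params = json.loads(raw_json)
steps = list(params["preprocess"].items())
for name, kwargs in steps:
    print(name, kwargs)
```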
Load Data
This section directly controls the selection of data to be preprocessed. Note: all data must be in BIDS format before any preprocessing can be done!
In this section, you input the path to your data (root) and the channel type (channel-type).
You may optionally use this section to select a subset of data by specifying desired subjects, tasks, and any exceptions to omit from the output. For any field where you would like to select all available data, specify ["*"] in the respective field.
The exceptions field works by taking the Cartesian product of all exception fields.
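A minimal sketch of that Cartesian-product semantics (as we understand it from the description above; the variable names are illustrative):

```python
from itertools import product

# Every (subject, task, run) combination drawn from the exception
# lists is omitted from preprocessing.
exceptions = {
    "subjects": ["NDARAB793GL3"],
    "tasks": ["Video1"],
    "runs": ["1", "2"],
}
omitted = list(product(exceptions["subjects"],
                       exceptions["tasks"],
                       exceptions["runs"]))
# omitted == [("NDARAB793GL3", "Video1", "1"),
#             ("NDARAB793GL3", "Video1", "2")]
```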
EXAMPLES
The following examples show how to select data using the load_data section, from least granular to most granular.
"load_data": {
"root": "~/PATH_TO_DATA/",
"subjects": ["*"],
"tasks": ["*"],
"exceptions": {
"subjects": "",
"tasks": "",
"runs": ""
},
"channel-type": "eeg"
},
"load_data": {
"root": "~/PATH_TO_DATA/",
"subjects": ["*"],
"tasks": ["*"],
"exceptions": {
"subjects": ["*"],
"tasks": ["*"],
"runs": ["2"]
},
"channel-type": "eeg"
},
In this example, every data file containing "run-2" will be omitted from preprocessing.
"load_data": {
"root": "~/PATH_TO_DATA/",
"subjects": ["NDARAB793GL3"],
"tasks": ["*"],
"exceptions": {
"subjects": "NDARAB793GL3",
"tasks": "Video1",
"runs": ["*"]
},
"channel-type": "eeg"
},
In this example, only subject "NDARAB793GL3" is selected for processing. Every data file containing both "sub-NDARAB793GL3" and "Video1" will be omitted from preprocessing.
Preprocess
Use this section to customize preprocessing pipeline steps and their respective parameters. The user_params.json file includes default values for each of the pipeline steps described below.
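This structure lends itself to a simple dispatch pattern. The sketch below is hypothetical (the step functions are stand-ins, not the pipeline's actual API; in the real pipeline each step wraps MNE-Python calls), but it shows how each key in the "preprocess" section can name a function whose value supplies its keyword arguments:

```python
# Hypothetical dispatch sketch: step names map to functions, and the
# JSON values become keyword arguments. These bodies are placeholders
# for what would be MNE-Python operations (e.g. raw.filter).
def filter_data(data, l_freq=None, h_freq=None):
    return {**data, "filtered": (l_freq, h_freq)}

def segment_data(data, tmin=-0.2, tmax=0.5, **kwargs):
    return {**data, "epochs": (tmin, tmax)}

STEPS = {"filter_data": filter_data, "segment_data": segment_data}

params = {
    "filter_data": {"l_freq": 0.3, "h_freq": 40},
    "segment_data": {"tmin": -0.2, "tmax": 0.5},
}

data = {"raw": "..."}
for name, kwargs in params.items():
    data = STEPS[name](data, **kwargs)
```

Because steps run in file order, reordering entries in user_params.json reorders the pipeline without touching any code.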
One output file per subject is created, containing all research-relevant outputs of the pre-processing (e.g., the number of bad channels rejected, the number of ICA artifacts rejected, etc.). This file is built iteratively as the pipeline progresses.
Each generated file follows BIDS naming conventions: output_preproc_XXX_task_YYY_run_ZZZ.json
Here is an example of file contents:
{
  "globalBad_Chans": [1, 23, 119],
  "icArtifacts": [1, 3, 9]
}
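A minimal sketch of how that naming pattern could be filled in from BIDS entities (the helper function is hypothetical, not part of the pipeline):

```python
# Hypothetical helper: fill the output_preproc_XXX_task_YYY_run_ZZZ.json
# pattern from a subject label, task name, and run index.
def output_name(subject, task, run):
    return f"output_preproc_{subject}_task_{task}_run_{run}.json"

name = output_name("NDARAB793GL3", "Video1", "01")
# name == "output_preproc_NDARAB793GL3_task_Video1_run_01.json"
```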
For every pipeline step that executes, an intermediate dataset is written to the specified output path under the intermediate folder 'PEPPER_intermediate'.
The final preprocessed data file is written to a final 'PEPPER_preprocessed' folder.
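The resulting layout can be sketched with pathlib (folder names come from the text above; "CMI" is the example output root from user_params.json, and the file name is the placeholder pattern, not a real file):

```python
from pathlib import Path

# Sketch of the output layout: intermediate datasets per step, plus
# one final preprocessed file per subject/task/run.
root = Path("CMI")
intermediate_dir = root / "PEPPER_intermediate"
final_dir = root / "PEPPER_preprocessed"
final_file = final_dir / "output_preproc_XXX_task_YYY_run_ZZZ.json"
```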
Use the user_params.json file to define filter parameters.
Overview: ICA requires a decent amount of stationarity in the data, which raw EEG often violates. One way around this is to first make a copy of the EEG data, use automated methods to detect noisy portions of the copy, and remove those sections. ICA is then run on the cleaned copy, and the ICA weights produced from the copy are applied back to the original recording. In this way, we do not have to "throw out" sections of noisy data, while, at the same time, we are able to derive an improved ICA decomposition.
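The copy/clean/fit/apply pattern can be illustrated in toy form. In the sketch below, PCA (via SVD) stands in for ICA purely to show the flow of weights from the cleaned copy back to the full recording; the pipeline itself uses MNE-Python's ICA on real EEG data:

```python
import numpy as np

# Toy stand-in for the ICA workflow described above: fit a linear
# decomposition on a cleaned copy, then apply its weights to the
# original, uncut data.
rng = np.random.default_rng(0)
data = rng.standard_normal((4, 1000))   # channels x samples
data[:, 400:500] += 50.0                # simulate a noisy segment

# 1) Copy the data and drop the detected-noisy span from the copy.
clean = np.delete(data, np.s_[400:500], axis=1)

# 2) Fit the decomposition on the cleaned copy only.
clean_centered = clean - clean.mean(axis=1, keepdims=True)
u, s, vt = np.linalg.svd(clean_centered, full_matrices=False)
unmixing = u.T  # weights learned from the cleaned copy

# 3) Apply those weights to the ORIGINAL recording, so no data
#    has to be thrown out.
sources = unmixing @ (data - data.mean(axis=1, keepdims=True))
```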
This main branch contains completed releases for this project. For all work in progress, please switch over to the dev branches.
If you are interested in contributing, please read our CONTRIBUTING.md file.
Thanks goes to these wonderful people (emoji key):
DMRoberts 📖 💻 🤔 🚇 👀 📆 |
Farukh 💻 🐛 📖 🚇 🤔 👀 |
George Buzzell 📖 💻 🤔 🚇 👀 📆 🧑🏫 |
Jonhas 💻 🚇 ⚠️ 🤔 👀 |
Osmany 💻 ⚠️ 🤔 👀 |
Steven William Tolbert 💻 🚇 🤔 👀 |
yanbin-niu 🔣💻 🤔 👀 🐛 |
This project follows the all-contributors specification. Contributions of any kind welcome!