Fairseq-signals is a collection of deep learning models for ECG data processing based on fairseq.
We provide implementations of various deep learning methods for ECG data, including official implementations of our own work.
* denotes an official implementation
We will keep implementing new methods in this repo. If you have any recommendations, please contact us via an issue or an e-mail.
git clone https://github.com/Jwoo5/fairseq-signals
cd fairseq-signals
pip install --editable ./
pip install pandas scipy wfdb
python setup.py build_ext --inplace
pip install pyarrow
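After the steps above, a quick import check can confirm that the environment is usable. This is a minimal sketch and not part of the repository: the helper name `check_modules` is hypothetical, and the import name `fairseq_signals` is an assumption based on the script paths used in this README.

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Packages installed in the steps above; "fairseq_signals" is assumed to be
    # the import name provided by the editable install.
    required = ["fairseq_signals", "pandas", "scipy", "wfdb", "pyarrow"]
    for name, found in check_modules(required).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as MISSING points to the installation step that should be re-run.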
We provide pre-processing scripts for various ECG datasets.
Given a directory that contains WFDB directories to be pre-processed for PhysioNet2021:
$ python fairseq_signals/data/ecg/preprocess/preprocess_physionet2021.py \
/path/to/physionet2021/ \
--dest /path/to/output \
--workers $N
Given a directory that contains .dat files from PTB-XL:
$ python fairseq_signals/data/ecg/preprocess/preprocess_ptbxl.py \
/path/to/ptbxl/records500/ \
--dest /path/to/output
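WFDB records pair each .dat signal file with a .hea header file, so it can be worth checking the input directory for incomplete records before pre-processing. The following is a minimal sketch using only the standard library; the function name `find_unpaired_records` is hypothetical and is not provided by this repository.

```python
from pathlib import Path


def find_unpaired_records(root):
    """Return sorted paths of .dat files under root that have no matching .hea file."""
    root = Path(root)
    return sorted(
        dat for dat in root.rglob("*.dat")
        if not dat.with_suffix(".hea").exists()
    )


if __name__ == "__main__":
    # Placeholder path, matching the one used in the command above.
    for path in find_unpaired_records("/path/to/ptbxl/records500/"):
        print(f"missing header for {path}")
```

An empty result means every .dat file found has a header next to it.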
Given a directory that contains pre-processed data:
$ python fairseq_signals/data/ecg/preprocess/manifest.py \
/path/to/data/ \
--dest /path/to/manifest \
--valid-percent $valid
For patient identification:
$ python fairseq_signals/data/ecg/preprocess/manifest_identification.py \
/path/to/data \
--dest /path/to/manifest \
--valid-percent $valid
Please find more details about pre-processing and data manifest here.
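To illustrate what a --valid-percent split amounts to, here is a minimal sketch of a seeded train/validation split over a list of sample names. It assumes the value is a fraction between 0 and 1; the function `split_train_valid` is hypothetical, and the manifest scripts themselves may select validation samples differently.

```python
import random


def split_train_valid(paths, valid_percent, seed=42):
    """Split paths into (train, valid), with about valid_percent of them in valid."""
    if not 0.0 <= valid_percent <= 1.0:
        raise ValueError("valid_percent must be between 0 and 1")
    # Sort first so the result depends only on the seed, not on input order.
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_percent)
    return shuffled[n_valid:], shuffled[:n_valid]


if __name__ == "__main__":
    samples = [f"sample_{i}" for i in range(10)]
    train, valid = split_train_valid(samples, 0.2)
    print(len(train), "train /", len(valid), "valid")
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing models trained on the same manifest.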
We provide pre-processing scripts for the following datasets.
For multi-modal pre-training of ECGs with reports using the PTB-XL dataset:
$ python fairseq_signals/data/ecg_text/preprocess/preprocess_ptbxl.py \
/path/to/ptbxl \
--dest /path/to/output
For multi-modal pre-training of ECGs with reports using the MIMIC-IV-ECG dataset:
$ python fairseq_signals/data/ecg_text/preprocess/preprocess_mimic_iv_ecg.py \
/path/to/mimic-iv-ecg \
--dest /path/to/output
For the ECG Question Answering task with the ECG-QA dataset, first map each ecg_id to the corresponding ECG file path (you can find these scripts in the ECG-QA repository):
$ python mapping_ptbxl_samples.py ecgqa/ptbxl \
--ptbxl-data-dir $ptbxl_dir \
--dest $dest_dir
$ python mapping_mimic_iv_ecg_samples.py ecgqa/mimic-iv-ecg \
--mimic-iv-ecg-data-dir $mimic_iv_ecg_dir \
--dest $dest_dir
$ python fairseq_signals/data/ecg_text/preprocess/preprocess_ecgqa.py /path/to/ecgqa \
--dest /path/to/output \
--apply_paraphrase
You don't need to run additional scripts to prepare manifest files for the ECG-QA dataset, since they are generated automatically during pre-processing.
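As an illustration of the ecg_id mapping step above, the sketch below builds the record path for a PTB-XL ecg_id. It assumes the directory layout of the public PTB-XL release (records grouped in folders of 1000, with an `_hr` suffix at 500 Hz and `_lr` at 100 Hz); the function `ptbxl_record_path` is hypothetical, and the mapping scripts in the ECG-QA repository may work differently.

```python
from pathlib import Path


def ptbxl_record_path(ptbxl_dir, ecg_id, sampling_rate=500):
    """Build the WFDB record path (without extension) for a PTB-XL ecg_id."""
    if sampling_rate == 500:
        subdir, suffix = "records500", "hr"
    elif sampling_rate == 100:
        subdir, suffix = "records100", "lr"
    else:
        raise ValueError("PTB-XL provides records at 100 Hz and 500 Hz only")
    # Records are grouped by thousands: ecg_id 21837 lives in folder "21000".
    folder = f"{(ecg_id // 1000) * 1000:05d}"
    return Path(ptbxl_dir) / subdir / folder / f"{ecg_id:05d}_{suffix}"


if __name__ == "__main__":
    print(ptbxl_record_path("/path/to/ptbxl", 21837))
```

The returned path omits the file extension, which is how WFDB record names are usually passed to readers.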
Given a directory that contains pre-processed PTB-XL data:
$ python fairseq_signals/data/ecg_text/preprocess/manifest.py \
/path/to/data \
--dest /path/to/manifest \
--valid-percent $valid
Please find more details about pre-processing and data manifest here.
We provide detailed READMEs for each model implementation:
* denotes an official implementation
If you have any questions or recommendations, please contact us via an issue or an e-mail.