BIOT: Cross-data Biosignal Learning in the Wild

A generic biosignal learning framework with large pre-trained EEG models (NeurIPS 2023; MIT License).

[Model] We first resample biosignals to a shared frequency (e.g., 200 Hz). We then transform biosignals with different channel sets, variable lengths, and missing values into a consistent "sentence" structure, and finally apply a transformer model to encode these sentences into embeddings.
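
To make the "sentence" construction concrete, here is a minimal tokenization sketch (our illustration, not the repository's actual preprocessing code); the 200 Hz target rate and the 200-point token with 100-point hop mirror the --sampling_rate, --token_size, and --hop_length flags in the reference runs below.

```python
import numpy as np
from scipy.signal import resample

def tokenize(signal, fs, target_fs=200, token_size=200, hop_length=100):
    """Resample a (channels, samples) recording to target_fs and cut each
    channel into overlapping fixed-size tokens -- one 'sentence' per recording."""
    n_channels, n_samples = signal.shape
    # 1. Resample every channel to the shared target rate (e.g., 200 Hz).
    signal = resample(signal, int(round(n_samples * target_fs / fs)), axis=1)
    # 2. Slide a token_size window with stride hop_length along each channel;
    #    the channel id is kept so the transformer can add a channel embedding.
    tokens, channel_ids = [], []
    for ch in range(n_channels):
        for start in range(0, signal.shape[1] - token_size + 1, hop_length):
            tokens.append(signal[ch, start:start + token_size])
            channel_ids.append(ch)
    # Channels that are missing or unusable can simply be skipped, so recordings
    # with different channel sets still yield valid (shorter) sentences.
    return np.stack(tokens), np.array(channel_ids)

# Example: a 10 s, 16-channel recording at 256 Hz -> 16 channels x 19 tokens.
tokens, channel_ids = tokenize(np.random.randn(16, 2560), fs=256)
print(tokens.shape)  # (304, 200)
```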

[Applications] Our model supports standard supervised learning, supervised learning with missing values, and (supervised or unsupervised) pre-training on multiple data sources with different formats, followed by fine-tuning on similar new datasets.

0. Quick Start

1. Folder Structures

2. EEG Pre-trained Models (all re-sampled to 200 Hz)

How to use the pretrained models:
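
A minimal loading sketch in PyTorch (our illustration, not a confirmed API: the BIOTClassifier import, its constructor arguments, and the nested-checkpoint handling are assumptions; see model.py in this repository for the actual class names and signatures):

```python
import torch

from model import BIOTClassifier  # assumption: exposed by this repo's model.py

# Hypothetical constructor arguments; match them to the checkpoint you load
# (e.g., an 18-channel model for EEG-six-datasets-18-channels.ckpt).
model = BIOTClassifier(n_channels=18, n_classes=2)

ckpt = torch.load(
    "pretrained-models/EEG-six-datasets-18-channels.ckpt", map_location="cpu"
)
state_dict = ckpt.get("state_dict", ckpt)  # unwrap Lightning-style checkpoints
# strict=False tolerates task heads that are not part of the pre-trained encoder.
model.load_state_dict(state_dict, strict=False)
model.eval()
```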

To run your own pretrained models:
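
The training scripts accept --pretrain_model_path (see the reference runs below); pointing it at your own checkpoint should work the same way, as long as --in_channels and the token settings match how that checkpoint was trained. For example (the checkpoint path is a placeholder):

python run_binary_supervised.py --dataset TUAB --in_channels 18 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 10 --batch_size 512 --model BIOT --pretrain_model_path path/to/your-own-model.ckpt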

3. Performance on TUAB

| Models | Balanced Acc. | AUC-PR | AUROC |
| --- | --- | --- | --- |
| SPaRCNet | 0.7896 | 0.8414 | 0.8676 |
| ContraWR | 0.7746 | 0.8421 | 0.8456 |
| CNN-Transformer | 0.7777 | 0.8433 | 0.8461 |
| FFCL | 0.7848 | 0.8448 | 0.8569 |
| ST-Transformer | 0.7966 | 0.8521 | 0.8707 |
| BIOT (vanilla) | 0.7925 | 0.8707 | 0.8691 |
| BIOT (pre-trained on EEG-PREST-16-channels.ckpt) | 0.7907 | 0.8752 | 0.8730 |
| BIOT (pre-trained on EEG-SHHS+PREST-18-channels.ckpt) | 0.8019 | 0.8749 | 0.8739 |
| BIOT (pre-trained on EEG-six-datasets-18-channels.ckpt) | 0.7959 | 0.8792 | 0.8815 |
Reference Runs
python run_binary_supervised.py --dataset TUAB --in_channels 16 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 10 --batch_size 512 --model BIOT
python run_binary_supervised.py --dataset TUAB --in_channels 16 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 10 --batch_size 512 --model BIOT --pretrain_model_path pretrained-models/EEG-PREST-16-channels.ckpt
python run_binary_supervised.py --dataset TUAB --in_channels 18 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 10 --batch_size 512 --model BIOT --pretrain_model_path pretrained-models/EEG-SHHS+PREST-18-channels.ckpt
python run_binary_supervised.py --dataset TUAB --in_channels 18 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 10 --batch_size 512 --model BIOT --pretrain_model_path pretrained-models/EEG-six-datasets-18-channels.ckpt
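
As a sanity check on these flags: at --sampling_rate 200, a --sample_length of 10 s gives 2000 points, and a 200-point --token_size with a 100-point --hop_length yields (2000 - 200) / 100 + 1 = 19 tokens per channel. A tiny helper, assuming the same windowing convention:

```python
def tokens_per_channel(sample_length, sampling_rate=200, token_size=200, hop_length=100):
    return (sample_length * sampling_rate - token_size) // hop_length + 1

print(tokens_per_channel(10))  # TUAB runs above: 19 tokens per channel
print(tokens_per_channel(5))   # TUEV runs below:  9 tokens per channel
```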

4. Performance on TUEV

| Models | Balanced Acc. | Kappa | Weighted F1 |
| --- | --- | --- | --- |
| SPaRCNet | 0.4161 | 0.4233 | 0.7024 |
| ContraWR | 0.4384 | 0.3912 | 0.6893 |
| CNN-Transformer | 0.4087 | 0.3815 | 0.6854 |
| FFCL | 0.3979 | 0.3732 | 0.6783 |
| ST-Transformer | 0.3984 | 0.3765 | 0.6823 |
| BIOT (vanilla) | 0.4682 | 0.4482 | 0.7085 |
| BIOT (pre-trained on PREST) | 0.5207 | 0.4932 | 0.7381 |
| BIOT (pre-trained on PREST+SHHS) | 0.5149 | 0.4841 | 0.7322 |
| BIOT (pre-trained on CHB-MIT with 8 channels and 10 s) | 0.4123 | 0.4285 | 0.6989 |
| BIOT (pre-trained on CHB-MIT with 16 channels and 5 s) | 0.4218 | 0.4427 | 0.7147 |
| BIOT (pre-trained on CHB-MIT with 16 channels and 10 s) | 0.4344 | 0.4719 | 0.7280 |
| BIOT (pre-trained on IIIC seizure with 8 channels and 10 s) | 0.4956 | 0.4719 | 0.7214 |
| BIOT (pre-trained on IIIC seizure with 16 channels and 5 s) | 0.4894 | 0.4881 | 0.7348 |
| BIOT (pre-trained on IIIC seizure with 16 channels and 10 s) | 0.4935 | 0.5316 | 0.7555 |
| BIOT (pre-trained on TUAB with 8 channels and 10 s) | 0.4980 | 0.4487 | 0.7044 |
| BIOT (pre-trained on TUAB with 16 channels and 5 s) | 0.4954 | 0.5053 | 0.7447 |
| BIOT (pre-trained on TUAB with 16 channels and 10 s) | 0.5256 | 0.5187 | 0.7504 |
| BIOT (pre-trained on 6 EEG datasets) | 0.5281 | 0.5273 | 0.7492 |
Reference Runs
python run_multiclass_supervised.py --dataset TUEV --in_channels 16 --n_classes 6 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 5 --batch_size 128 --model BIOT
python run_multiclass_supervised.py --dataset TUEV --in_channels 16 --n_classes 6 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 5 --batch_size 128 --model BIOT --pretrain_model_path pretrained-models/EEG-PREST-16-channels.ckpt
python run_multiclass_supervised.py --dataset TUEV --in_channels 18 --n_classes 6 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 5 --batch_size 128 --model BIOT --pretrain_model_path pretrained-models/EEG-SHHS+PREST-18-channels.ckpt
python run_multiclass_supervised.py --dataset TUEV --in_channels 18 --n_classes 6 --sampling_rate 200 --token_size 200 --hop_length 100 --sample_length 5 --batch_size 128 --model BIOT --pretrain_model_path pretrained-models/EEG-six-datasets-18-channels.ckpt

5. Citations

@inproceedings{yang2023biot,
  title={BIOT: Biosignal Transformer for Cross-data Learning in the Wild},
  author={Yang, Chaoqi and Westover, M Brandon and Sun, Jimeng},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=c2LZyTyddi}
}

@article{yang2023biotarxiv,
  title={BIOT: Cross-data Biosignal Learning in the Wild},
  author={Yang, Chaoqi and Westover, M Brandon and Sun, Jimeng},
  journal={arXiv preprint arXiv:2305.10351},
  year={2023}
}