This repository contains code used in the paper "Semantic reconstruction of continuous language from non-invasive brain recordings" by Jerry Tang, Amanda LeBel, Shailee Jain, and Alexander G. Huth.
Download language model data and extract contents into a new data_lm/ directory.
Download training data and extract contents into a new data_train/ directory. Stimulus data for train_stimulus/ and response data for train_response/[SUBJECT_ID] can be downloaded from OpenNeuro.
Download test data and extract contents into a new data_test/ directory. Stimulus data for test_stimulus/[EXPERIMENT] and response data for test_response/[SUBJECT_ID] can be downloaded from OpenNeuro.
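Before running any of the scripts, it can help to confirm that the downloads were extracted into the expected places. The following is a minimal sketch, not part of the repository: the function name `missing_data_dirs` is hypothetical, and it assumes that train_stimulus/ and train_response/ sit inside data_train/ and that test_stimulus/ and test_response/ sit inside data_test/, which the text above implies but does not state outright.

```python
from pathlib import Path


def missing_data_dirs(root, subject_id, experiment):
    """Return the expected data directories that do not exist under root.

    The nesting below is an assumption drawn from the download steps;
    adjust it if your extracted archives are laid out differently.
    """
    root = Path(root)
    expected = [
        root / "data_lm",
        root / "data_train" / "train_stimulus",
        root / "data_train" / "train_response" / subject_id,
        root / "data_test" / "test_stimulus" / experiment,
        root / "data_test" / "test_response" / subject_id,
    ]
    # Report paths relative to root, with forward slashes on every platform.
    return [p.relative_to(root).as_posix() for p in expected if not p.is_dir()]
```

Calling `missing_data_dirs(".", subject_id, experiment)` from the repository root with your own subject and experiment identifiers returns an empty list when every directory is in place.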
Estimate the encoding model. The encoding model predicts brain responses from contextual features of the stimulus extracted using GPT. The --gpt parameter determines the GPT checkpoint used. Use --gpt imagined when estimating models for imagined speech data, as this will extract features using a GPT checkpoint that was not trained on the imagined speech stories. Use --gpt perceived when estimating models for other data. The encoding model will be saved in MODEL_DIR/[SUBJECT_ID]. Alternatively, download pre-fit encoding models.
python3 decoding/train_EM.py --subject [SUBJECT_ID] --gpt perceived
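To illustrate what "predicting brain responses from stimulus features" means, here is a small sketch on synthetic data. It uses closed-form ridge regression as a stand-in estimator; this README does not say which estimator train_EM.py uses, so treat the method, the function names `fit_ridge` and `predict`, and the array shapes as assumptions made for illustration only.

```python
import numpy as np


def fit_ridge(features, responses, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    X = np.asarray(features, dtype=float)
    Y = np.asarray(responses, dtype=float)
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


def predict(features, weights):
    """Predict responses as a linear function of the features."""
    return np.asarray(features, dtype=float) @ weights


# Synthetic stand-ins: 200 time points, 8 stimulus features, 5 voxels.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))
W_true = rng.standard_normal((8, 5))
Y = X @ W_true + 0.1 * rng.standard_normal((200, 5))

W = fit_ridge(X, Y, alpha=1.0)
# Per-voxel correlation between predicted and observed responses.
corr = [np.corrcoef(predict(X, W)[:, v], Y[:, v])[0, 1] for v in range(Y.shape[1])]
print(np.round(corr, 3))
```

In a real analysis the features would come from the GPT checkpoint selected with --gpt and the responses from the fMRI data, and model quality would be judged on held-out data rather than on the training set as done here.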
Estimate the word rate model. The word rate models will be saved in MODEL_DIR/[SUBJECT_ID]. The word_rate_model_speech model uses brain responses in speech regions, and should be used when decoding imagined speech and perceived movie data. The word_rate_model_auditory model uses brain responses in auditory cortex, and should be used when decoding perceived speech data. Alternatively, download pre-fit word rate models.
python3 decoding/train_WR.py --subject [SUBJECT_ID]
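The pairing of data type with GPT checkpoint and word rate model described above can be summarized as a lookup. This is a convenience sketch rather than repository code: the function `select_models` and the keys "imagined_speech", "perceived_movie", and "perceived_speech" are hypothetical labels chosen here, while the --gpt values and model names are the ones given in the text.

```python
def select_models(data_kind):
    """Return (--gpt value, word rate model name) for a kind of data.

    The data_kind labels are hypothetical; the values follow the README:
    imagined speech uses the "imagined" checkpoint, everything else uses
    "perceived"; perceived speech uses the auditory word rate model,
    imagined speech and perceived movie use the speech model.
    """
    table = {
        "imagined_speech": ("imagined", "word_rate_model_speech"),
        "perceived_movie": ("perceived", "word_rate_model_speech"),
        "perceived_speech": ("perceived", "word_rate_model_auditory"),
    }
    if data_kind not in table:
        raise ValueError(f"unknown data kind: {data_kind!r}")
    return table[data_kind]


gpt, word_rate_model = select_models("perceived_speech")
print(gpt, word_rate_model)
```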
Run the decoder. The decoder predictions will be saved in RESULTS_DIR/[SUBJECT_ID]/[EXPERIMENT_NAME].
python3 decoding/run_decoder.py --subject [SUBJECT_ID] --experiment [EXPERIMENT_NAME] --task [TASK_NAME]
Evaluate the decoder predictions. The evaluation results will be saved in SCORE_DIR/[SUBJECT_ID]/[EXPERIMENT_NAME].
python3 decoding/evaluate_predictions.py --subject [SUBJECT_ID] --experiment [EXPERIMENT_NAME] --task [TASK_NAME]
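The four scripts are meant to be run in order for a given subject. As a sketch of how that sequence could be scripted, the helper below assembles the command lines using only the script paths and flags shown in this README. The functions `pipeline_commands` and `run_pipeline` are hypothetical wrappers, and the subject, experiment, and task values in the example call are placeholders, not identifiers from the dataset.

```python
import shlex
import subprocess


def pipeline_commands(subject, experiment, task, gpt="perceived"):
    """Build the four pipeline commands in the order they should run."""
    return [
        ["python3", "decoding/train_EM.py", "--subject", subject, "--gpt", gpt],
        ["python3", "decoding/train_WR.py", "--subject", subject],
        ["python3", "decoding/run_decoder.py", "--subject", subject,
         "--experiment", experiment, "--task", task],
        ["python3", "decoding/evaluate_predictions.py", "--subject", subject,
         "--experiment", experiment, "--task", task],
    ]


def run_pipeline(subject, experiment, task, gpt="perceived"):
    """Run each step from the repository root, stopping at the first failure."""
    for cmd in pipeline_commands(subject, experiment, task, gpt):
        subprocess.run(cmd, check=True)


# Placeholder values; substitute your own identifiers.
for cmd in pipeline_commands("SUBJECT", "EXPERIMENT", "TASK"):
    print(shlex.join(cmd))
```

Passing each command as a list avoids shell quoting issues, and `check=True` ensures a failed training step stops the run before decoding starts. If you downloaded pre-fit models, the first two commands can be skipped.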