RESEPT is a deep-learning framework for characterizing and visualizing tissue architecture from spatially resolved transcriptomics. Given gene expression or RNA velocity as input, RESEPT learns a three-dimensional embedding of the spatial transcriptomics data with a spatial-retained graph neural network. The embedding is then visualized by mapping it to the color channels of an RGB image, which is segmented with a supervised convolutional neural network model to infer the tissue architecture accurately.
Documentation: https://resept.readthedocs.io/
RESEPT was trained on a workstation with a 64-core CPU, 20 GB of RAM, and a GPU with 11 GB of VRAM. Customizing the segmentation model currently requires a GPU; all other RESEPT functions need at minimum an 8-core CPU and 8 GB of RAM.
RESEPT runs on Linux. The package has been tested on the following systems:
RESEPT mainly depends on the Python (3.6+) scientific stack.
scipy==1.6.2
networkx==2.5.1
opencv_contrib_python==4.5.1.48
tqdm==4.60.0
scikit_image==0.18.1
numpy==1.19.2
umap_learn==0.5.1
six==1.15.0
matplotlib==3.3.4
terminaltables==3.1.0
torch==1.5.0
scanpy==1.7.2
statsmodels==0.12.2
requests==2.25.1
munkres==1.1.4
mmcv_full==1.3.0
rpy2==3.1.0
pandas==1.2.3
numba==0.53.1
seaborn==0.11.1
anndata==0.7.6
cityscapesscripts==2.2.0
leidenalg==0.8.7
Pillow==8.3.1
python_igraph==0.9.6
scikit_learn==0.24.2
umap==0.1.1
Install PyTorch 1.5.0 following the official guide.
Install mmcv-full 1.3.0 by running the following command:
pip install mmcv-full==1.3.0 -f https://download.openmmlab.com/mmcv/dist/${CUDA}/torch1.5.0/index.html
where ${CUDA} should be replaced with your CUDA version (cpu, cu92, cu101, or cu102).
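For example, on a machine with CUDA 10.1, the command becomes:
pip install mmcv-full==1.3.0 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.5.0/index.html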
Install other dependencies:
pip install -r requirements.txt
The above steps take about 20-25 minutes to install all dependencies.
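As a quick sanity check after installation, the following minimal Python sketch (using only packages pinned above) confirms the core dependencies import cleanly and reports whether a CUDA device is visible:

# Sanity check: confirm the pinned core dependencies import cleanly.
import torch
import mmcv
import scanpy

print("torch:", torch.__version__)    # expect 1.5.0
print("mmcv:", mmcv.__version__)      # expect 1.3.0
print("scanpy:", scanpy.__version__)  # expect 1.7.2
print("CUDA available:", torch.cuda.is_available())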
git clone https://github.com/OSU-BMBL/RESEPT
cd RESEPT
scalefactors_json file: a JSON file containing the scaling factors that convert spot positions to different image resolutions.
More details can be found here.
An annotation file should include spot barcodes and their corresponding annotations. It is used for evaluating predicted tissue architectures (e.g., by ARI) and for training the user's own segmentation models. The file should be named [sample_name]_annotation.csv. [example]
Segmentation model file: a pre-trained segmentation model in .pth format, which must be provided when predicting tissue architecture on the generated images.
The data schema to run our code is as follows:
[sample_name]/
|__spatial/
| |__tissue_positions_list file
| |__scalefactors_json file
|__gene expression file
|__annotation file: [sample_name]_annotation.csv (optional)
model/ (optional)
|__segmentation model file
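As a quick check that a sample folder matches this schema, here is a minimal Python sketch that loads the three required inputs; the sample name and file paths are illustrative (they follow the '151669' demo below), not part of RESEPT's API:

# Load the expected inputs for one sample to verify the schema.
import json
import pandas as pd
import scanpy as sc

sample = "151669"  # hypothetical sample name

# 10x Genomics filtered feature-barcode matrix (HDF5).
adata = sc.read_10x_h5(f"{sample}/{sample}_filtered_feature_bc_matrix.h5")

# Spot barcodes and coordinates (10x Visium convention: no header row).
meta = pd.read_csv(f"{sample}/spatial/tissue_positions_list.csv", header=None)

# Scaling factors converting spots to different image resolutions.
with open(f"{sample}/spatial/scalefactors_json.json") as f:
    scalefactors = json.load(f)

print(adata.shape, meta.shape, sorted(scalefactors))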
The data schema to customize our segmentation model is as follows:
[training_data_folder]
|__[sample_name_1]/
| |__spatial/
| | |__tissue_positions_list file
| | |__scalefactors_json file
| |__gene expression file
| |__annotation file: [sample_name_1]_annotation.csv
|__[sample_name_2]/
| |__spatial/
| | |__tissue_positions_list file
| | |__scalefactors_json file
| |__gene expression file
| |__annotation file: [sample_name_2]_annotation.csv
| ...
|__[sample_name_n]/
| |__spatial/
| | |__tissue_positions_list file
| | |__scalefactors_json file
| |__gene expression file
| |__annotation file: [sample_name_n]_annotation.csv
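Before training, it can help to verify that every sample folder contains the required files. A minimal sketch (the root path and file-name patterns are assumptions based on the schema above):

# Check each sample folder in the training data folder for required files.
import os

root = "training_data_folder"  # hypothetical path
for sample in sorted(os.listdir(root)):
    base = os.path.join(root, sample)
    if not os.path.isdir(base):
        continue
    required = [
        os.path.join(base, "spatial", "tissue_positions_list.csv"),
        os.path.join(base, "spatial", "scalefactors_json.json"),
        os.path.join(base, sample + "_annotation.csv"),
    ]
    missing = [p for p in required if not os.path.exists(p)]
    has_expression = any(f.endswith(".h5") for f in os.listdir(base))
    status = "OK" if not missing and has_expression else "INCOMPLETE"
    print(sample, status, missing)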
Run the following commands to construct RGB images based on gene expression under different embedding parameters. For demonstration, please download the example data from here and put the unzipped folder '151669' in the source code folder.
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/151669.zip
unzip 151669.zip
python RGB_images_pipeline.py -expression 151669/151669_filtered_feature_bc_matrix.h5 -meta 151669/spatial/tissue_positions_list.csv -scaler 151669/spatial/scalefactors_json.json -output Demo_result -embedding scGNN -transform logcpm
RESEPT stores the generated results in the following structure:
Demo_result/
|__RGB_images/
This demo takes 25-30 minutes to generate all results on a machine with a 64-core CPU.
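To preview the generated images, a minimal sketch; the file names inside RGB_images/ are not documented here, so it simply globs for PNGs:

# Preview up to four of the generated RGB images.
import glob
import matplotlib.pyplot as plt
from PIL import Image

paths = sorted(glob.glob("Demo_result/RGB_images/*.png"))[:4]
assert paths, "no PNG images found under Demo_result/RGB_images/"
fig, axes = plt.subplots(1, len(paths), figsize=(3 * len(paths), 3), squeeze=False)
for ax, path in zip(axes[0], paths):
    ax.imshow(Image.open(path))
    ax.set_title(path.rsplit("/", 1)[-1], fontsize=6)
    ax.axis("off")
plt.show()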
Run the following commands to construct RGB images based on gene expression under different embedding parameters, segment the constructed RGB images into tissue architectures with the top-5 Moran's I values, and evaluate the predicted tissue architectures (e.g., by ARI). For demonstration, please download the example data from here and the pre-trained model from here, then put the unzipped folders '151669' and 'model_151669' in the source code folder.
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/151669.zip
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/model_151669.zip
unzip 151669.zip
unzip model_151669.zip
python evaluation_pipeline.py -expression 151669/151669_filtered_feature_bc_matrix.h5 -meta 151669/spatial/tissue_positions_list.csv -scaler 151669/spatial/scalefactors_json.json -k 7 -label 151669/151669_annotation.csv -model model_151669/151669_scGNN.pth -output Demo_result_evaluation -embedding scGNN -transform logcpm -device cpu
RESEPT stores the generated results in the following structure:
Demo_result_evaluation/
|__RGB_images/
|__segmentation_evaluation/
|__segmentation_map/
|__top5_evaluation.csv
|__predicted_tissue_architecture.csv
This demo takes 30-35 minutes to generate all results on a machine with a 64-core CPU.
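The two CSV files can be inspected directly; a minimal sketch (the column layout is not documented here, so nothing beyond reading and printing is assumed):

# Inspect the evaluation outputs.
import pandas as pd

out = "Demo_result_evaluation/segmentation_evaluation"
top5 = pd.read_csv(out + "/top5_evaluation.csv")                # top-5 candidates by Moran's I, with evaluation metrics
pred = pd.read_csv(out + "/predicted_tissue_architecture.csv")  # per-spot predicted labels
print(top5)
print(pred.head())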
Run the following commands to generate RGB images based on gene expression under different embedding parameters and predict tissue architectures with the top-5 Moran's I values. For demonstration, please download the example data from here and the pre-trained model from here, then put the unzipped folders '151669' and 'model_151669' in the source code folder.
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/151669.zip
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/model_151669.zip
unzip model_151669.zip
unzip 151669.zip
python test_pipeline.py -expression 151669/151669_filtered_feature_bc_matrix.h5 -meta 151669/spatial/tissue_positions_list.csv -scaler 151669/spatial/scalefactors_json.json -k 7 -model model_151669/151669_scGNN.pth -output Demo_result_tissue_architecture -embedding scGNN -transform logcpm -device cpu
RESEPT stores the generated results in the following structure:
Demo_result_tissue_architecture/
|__RGB_images/
|__segmentation_test/
|__segmentation_map/
|__top5_MI_value.csv
|__predicted_tissue_architecture.csv
This demo takes 30-35 minutes to generate all results on a machine with a 64-core CPU.
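To visualize the prediction spatially, a minimal sketch that joins the predicted labels to the spot coordinates; the column names below are assumptions (tissue_positions_list.csv is read with the 10x Visium column order, and the prediction CSV is assumed to hold a barcode column first and a label column last):

# Plot predicted tissue-architecture labels at their spot coordinates.
import pandas as pd
import matplotlib.pyplot as plt

pred = pd.read_csv("Demo_result_tissue_architecture/segmentation_test/predicted_tissue_architecture.csv")
meta = pd.read_csv("151669/spatial/tissue_positions_list.csv", header=None,
                   names=["barcode", "in_tissue", "row", "col", "pixel_y", "pixel_x"])

df = meta.merge(pred, left_on="barcode", right_on=pred.columns[0])
labels = df[pred.columns[-1]].astype("category").cat.codes
plt.scatter(df["pixel_x"], df["pixel_y"], c=labels, s=4, cmap="tab10")
plt.gca().invert_yaxis()  # image coordinates: origin at top-left
plt.axis("equal")
plt.show()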
RESEPT can segment a histological image according to the predicted tissue architecture, which may help pathologists focus on specific functional zonation. Run the following commands to predict tissue architectures with the top-5 Moran's I values and segment the histological image accordingly. For demonstration, please download the example data from here and the pre-trained model from here, then put the unzipped folders 'cancer' and 'model_cancer' in the source code folder.
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/cancer.zip
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/model_cancer.zip
unzip cancer.zip
unzip model_cancer.zip
python histological_segmentation_pipeline.py -expression ./cancer/Parent_Visium_Human_Glioblas_filtered_feature_bc_matrix.h5 -meta ./cancer/spatial/tissue_positions_list.csv -scaler ./cancer/spatial/scalefactors_json.json -k 7 -model ./model_cancer/cancer_model.pth -histological ./cancer/Parent_Visium_Human_Glioblast.tif -output Demo_result_HistoImage -embedding spaGCN -transform logcpm -device cpu
RESEPT stores the generated results in the following structure:
Demo_result_HistoImage/
|__RGB_images/
|__segmentation_test/
| |__segmentation_map/
| |__top5_MI_value.csv
| |__predicted_tissue_architecture.csv
|__histological_segmentation/
|__category_1.png
|__category_2.png
…
|__category_n.png
Each 'category_n.png' is a histological image segmentation result, where n denotes the category number. This demo takes 30-35 minutes to generate all results on a machine with a multi-core CPU.
RESEPT supports fine-tuning our segmentation model on users' own 10x Visium data. Organize all samples and their annotations according to the pre-defined data schema and download our pre-trained model from here as a starting point for training. Each training sample should be placed in an individual folder with the required format (the folder structure can be found here); then gather all the individual folders into one main folder (e.g., named 'training_data_folder'). For demonstration, download the example training data from here, and then run the following commands to generate the RGB images of your own data and the customized model.
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/model_151669.zip
wget https://bmblx.bmi.osumc.edu/downloadFiles/GitHub_files/training_data_folder.zip
unzip model_151669.zip
unzip training_data_folder.zip
python training_pipeline.py -data_folder training_data_folder -output Demo_result_model -embedding scGNN -transform logcpm -model model_151669/151669_scGNN.pth
RESEPT stores the generated results in the following structure:
Demo_result_model/
|__RGB_images/
work_dirs/
|__config/
|__fine_tune_model.pth
This demo takes about 3-5 hours to generate the model on a machine with an 11 GB VRAM GPU.
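To confirm the fine-tuned weights load, a minimal sketch; RESEPT's segmentation stack builds on mmcv, whose checkpoints are typically plain dicts, but the exact keys here are an assumption:

# Inspect the fine-tuned checkpoint on CPU.
import torch

ckpt = torch.load("work_dirs/fine_tune_model.pth", map_location="cpu")
print("top-level keys:", list(ckpt)[:5])      # e.g., 'state_dict', 'meta' (assumed)
state = ckpt.get("state_dict", ckpt)          # fall back to a raw state dict
print("parameter tensors:", len(state))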
This project is licensed under the MIT License; see the LICENSE.md file for details.
If you use RESEPT, please cite our paper:
@article{Chang2021.07.08.451210,
author = {Chang, Yuzhou and He, Fei and Wang, Juexin and Chen, Shuo and Li, Jingyi and Liu, Jixin and Yu, Yang and Su, Li and Ma, Anjun and Allen, Carter and Lin, Yu and Sun, Shaoli and Liu, Bingqiang and Otero, Jose and Chung, Dongjun and Fu, Hongjun and Li, Zihai and Xu, Dong and Ma, Qin},
title = {Define and visualize pathological architectures of human tissues from spatially resolved transcriptomics using deep learning},
elocation-id = {2021.07.08.451210},
year = {2021},
doi = {10.1101/2021.07.08.451210},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2021/07/16/2021.07.08.451210},
eprint = {https://www.biorxiv.org/content/early/2021/07/16/2021.07.08.451210.full.pdf},
journal = {bioRxiv}
}