We are actively updating this repository! More features/examples/experiments coming soon!
Framework of conST. conST models ST data as a graph by treating gene expression and morphology as node attributes and constructing edges from spatial coordinates. Training is divided into two stages: a pretraining stage and a major training stage. The pretraining stage initializes the weights of the encoder E with a reconstruction loss. In the major training stage, data augmentation is applied, and contrastive learning at three levels, i.e., local-local, local-global, and local-context, is used to learn a low-dimensional embedding by minimizing or maximizing the mutual information (MI) between different embeddings. The learned embedding can be used for various downstream tasks, which, when analyzed together, can shed light on the widely studied tumour microenvironment and cell-to-cell interactions. GNNExplainer helps to provide more convincing predictions with interpretability.
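To make the graph construction concrete, below is a minimal sketch (not the repository's exact code) of how spots can be turned into a graph: node attributes concatenate gene expression and morphology features, and edges come from a k-nearest-neighbour search on the spatial coordinates. All array shapes and the neighbour count are illustrative placeholders.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

n_spots = 500
gene_expr = np.random.rand(n_spots, 300)   # dimensionality-reduced gene expression (placeholder)
morph_feat = np.random.rand(n_spots, 64)   # morphology embeddings, e.g. extracted by MAE (placeholder)
coords = np.random.rand(n_spots, 2)        # spatial coordinates of each spot

# Node attribute matrix: gene expression and morphology concatenated per spot.
node_attrs = np.concatenate([gene_expr, morph_feat], axis=1)

# Sparse adjacency matrix: connect each spot to its spatially nearest neighbours.
adj = kneighbors_graph(coords, n_neighbors=10, include_self=False)
```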
Run conST_cluster.ipynb for a clustering demo of slice 151673 of the spatialLIBD dataset.
You can change the argparse arguments in the notebook to explore different modes. We also release the trained weights conST_151673.pth.
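As a rough example of reusing the released checkpoint, the snippet below shows only the generic PyTorch loading step, assuming the file stores a state dict; the conST model itself must first be built in the notebook with the same argparse settings that produced the weights.

```python
import torch

# Load the released checkpoint on CPU; move it to GPU later if needed.
state_dict = torch.load("conST_151673.pth", map_location="cpu")

# `model` is the conST network constructed in the notebook with matching hyperparameters.
# model.load_state_dict(state_dict)
```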
The demo uses the spatialLIBD dataset. We have organized the file structure and put the data on Google Drive. Please download it and put it into the data folder.
If you want to experiment with other data, you can arrange the file structure in the same way.
Instructions for using MAE to extract morphology features can be found here.
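As a rough illustration only (the actual pipeline is described in the linked instructions), morphology features can be obtained by cropping an image patch around each spot and passing it through a pretrained MAE encoder; the image, coordinates, and encoder below are dummy placeholders.

```python
import torch
import torch.nn as nn

# Dummy stand-ins; replace with the real histology image, spot pixel coordinates,
# and a pretrained MAE encoder as described in the linked instructions.
histology_image = torch.rand(3, 2000, 2000)
spot_pixel_coords = [(500, 500), (600, 620), (700, 740)]
mae_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 64))

patch_size = 224
patches = []
for x, y in spot_pixel_coords:
    # Crop a square patch centred on the spot.
    patch = histology_image[:, y - patch_size // 2 : y + patch_size // 2,
                               x - patch_size // 2 : x + patch_size // 2]
    patches.append(patch)
patches = torch.stack(patches)              # (n_spots, 3, 224, 224)

with torch.no_grad():
    morph_features = mae_encoder(patches)   # (n_spots, 64) morphology feature per spot
```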
If you find this code useful, please consider citing:
@article{Zong2022.01.14.476408,
  author = {Zong, Yongshuo and Yu, Tingyang and Wang, Xuesong and Wang, Yixuan and Hu, Zhihang and Li, Yu},
  title = {conST: an interpretable multi-modal contrastive learning framework for spatial transcriptomics},
  elocation-id = {2022.01.14.476408},
  year = {2022},
  doi = {10.1101/2022.01.14.476408},
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2022/01/17/2022.01.14.476408},
  journal = {bioRxiv}
}