Caochris / SCOTIA

MIT License

Scotia

Spatially Constrained Optimal Transport Interaction Analysis (SCOTIA) is a Python package for inferring cell-cell interactions from imaging-based spatial omics data. The method uses an optimal transport model whose cost function includes both spatial distance and ligand–receptor gene expression. It has three key steps: (1) spatial clustering to identify adjacent source and target cluster pairs, (2) scoring of candidate cell-cell interactions by solving an optimal transport problem between spatially proximal source and target clusters, and (3) significance assessment of the resulting spatially constrained cell-cell interaction scores through permutation.

Installation

This package requires Python >=3.6. Before installing scotia, it is highly recommended to create a new environment using conda.

conda create -n scotia_env python=3.9
conda activate scotia_env

In this new conda environment, you can install scotia with

pip install git+https://github.com/Caochris/SCOTIA.git#egg=scotia

After installation, you can verify it by starting Python and importing the package:

python
import scotia
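
For a non-interactive check, for example in a setup script, a minimal sketch is shown below. It uses only the standard library. The helper name `is_installed` is made up for illustration and is not part of scotia.

```python
import importlib.util

def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None

# Reports whether the scotia package is importable in the current environment.
print("scotia installed:", is_installed("scotia"))
```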

Main functions

DBSCAN cell clustering

idx_l, eps = scotia.dbscan_ff_cell(X, X_index_arr)

This function dynamically determines the eps parameter of DBSCAN clustering by finding the most consistent clustering results between DBSCAN and Forest Fire clustering (FFC).
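
To see why eps needs to be chosen carefully, the sketch below runs a minimal pure-Python DBSCAN on toy coordinates with three eps values. It is an illustration only: it is not the implementation used by `dbscan_ff_cell`, it does not include the Forest Fire comparison, and the names `dbscan`, `n_clusters`, `min_samples` and the toy points are all assumptions.

```python
import math

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN for 2D points. Returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        px, py = points[i]
        return [j for j, (qx, qy) in enumerate(points)
                if math.hypot(px - qx, py - qy) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_samples:
                queue.extend(j_nbrs)  # core point: keep growing the cluster
    return labels

def n_clusters(labels):
    """Number of clusters, ignoring noise."""
    return len(set(labels) - {-1})

# Two toy groups of cells (hypothetical coordinates).
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
for eps in (0.5, 1.0, 20.0):
    print(eps, n_clusters(dbscan(points, eps, min_samples=3)))
```

With these toy points, an eps that is too small labels every cell as noise and one that is too large merges both groups into a single cluster, which is the sensitivity that a data-driven choice of eps is meant to address.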

Select adjacent cluster pairs

This function selects spatially proximal cluster pairs that could potentially communicate. Cell pairs that are filtered out are marked with Inf in the distance matrix.

dis_mtx_mod = scotia.sel_pot_inter_cluster_pairs(S_all_arr,cluster_cell_df)
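
The idea of marking filtered cell pairs with Inf can be sketched as follows. This is not the package's code: the function name `mask_distant_cluster_pairs`, the `max_gap` threshold, and the rule "two clusters are adjacent if their closest cells are within `max_gap`" are assumptions chosen for illustration.

```python
import math

def mask_distant_cluster_pairs(dist, src_clusters, tgt_clusters, max_gap):
    """Set distances to inf for cell pairs whose clusters are not adjacent.

    dist[i][j] is the distance between source cell i and target cell j.
    src_clusters[i] and tgt_clusters[j] give the cluster label of each cell.
    """
    # Smallest cell-to-cell distance for every (source cluster, target cluster) pair.
    min_gap = {}
    for i, row in enumerate(dist):
        for j, d in enumerate(row):
            key = (src_clusters[i], tgt_clusters[j])
            min_gap[key] = min(d, min_gap.get(key, math.inf))

    # Keep the original distance only when the two clusters are close enough.
    return [
        [d if min_gap[(src_clusters[i], tgt_clusters[j])] <= max_gap else math.inf
         for j, d in enumerate(row)]
        for i, row in enumerate(dist)
    ]

# Toy example: three source cells in two clusters, two target cells in two clusters.
dist = [[1.0, 9.0], [6.0, 7.0], [8.0, 4.0]]
masked = mask_distant_cluster_pairs(dist, ["s0", "s0", "s1"], ["t0", "t1"], max_gap=5.0)
print(masked)
```

Note that the decision is made per cluster pair, not per cell pair: in the toy example the cell pair at distance 6.0 is kept because its clusters are adjacent through another, closer cell pair.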

OT transport

This function infers the cell-by-cell interaction likelihood between source and target cells using an unbalanced optimal transport algorithm.

inter_likely_df = scotia.source_target_ot(dis_arr, exp_df, meta_df, known_lr_pairs)

1) dis_arr: cell-by-cell spatial distance array (obtained from the function sel_pot_inter_cluster_pairs).

2) exp_df: gene expression dataframe.

| cell_id    | fov | gene1 | gene2 | gene3|...|
| -------- | ------- | ------- | ------- | ------- |------- |
| 32 | 1 |0|0|2|...|
| 33 | 1 |1|1|0|...|
| ... | ... |...|...|...|...|

3) meta_df: metadata, including annotation information.

| cell_id    | fov | annotation | x_positions | y_positions|
| -------- | ------- | ------- | ------- | ------- |
| 32 | 1 |Erythroid|5.430265|78.970097|
| 33 | 1 |Erythroidpro|4.793709|53.400197|
| ... | ... |...|...|...|

4) known_lr_pairs: ligand-receptor pairs.

| l_gene   | r_gene |
| -------- | ------- |
| Angpt1 | Tek |
| Angpt2 | Tek |
| ... | ... |
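
For intuition about the unbalanced optimal transport step, here is a generic entropic unbalanced OT solver (Sinkhorn-style scaling iterations with KL-relaxed marginals) in pure Python. It is a sketch under stated assumptions, not the algorithm or cost function implemented in `source_target_ot`: using ligand expression as source mass, receptor expression as target mass, and spatial distance as the cost is an assumption, as are the names `unbalanced_sinkhorn`, `eps`, `rho` and the toy values.

```python
import math

def unbalanced_sinkhorn(a, b, cost, eps=1.0, rho=1.0, n_iter=200):
    """Entropic unbalanced OT between masses a (sources) and b (targets).

    eps is the entropic regularization, rho the strength of the KL penalty
    on the marginals. Returns the transport plan as a nested list.
    """
    n, m = len(a), len(b)
    # Gibbs kernel. An infinite cost gives exp(-inf) = 0, so filtered
    # cell pairs can never receive any mass.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    power = rho / (rho + eps)  # exponent < 1 relaxes the marginal constraints
    for _ in range(n_iter):
        for i in range(n):
            Kv = sum(K[i][j] * v[j] for j in range(m))
            u[i] = (a[i] / Kv) ** power if Kv > 0 else 0.0
        for j in range(m):
            Ktu = sum(K[i][j] * u[i] for i in range(n))
            v[j] = (b[j] / Ktu) ** power if Ktu > 0 else 0.0
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy data (hypothetical): ligand expression of two source cells, receptor
# expression of two target cells, and a distance matrix with one filtered pair.
ligand = [2.0, 0.0]
receptor = [1.0, 3.0]
distance = [[1.0, math.inf], [2.0, 0.5]]
plan = unbalanced_sinkhorn(ligand, receptor, distance)
print(plan)
```

In this toy setup, mass flows only between cells that both express the relevant gene and are not separated by an Inf distance, so a source cell with zero ligand expression contributes nothing regardless of how close it is.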

Summarize OT results

This function post-processes the OT results by averaging the interaction likelihoods of each ligand-receptor (LR) pair for each cell type pair.

scotia.post_ot(ot_data_df, label)
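
The averaging described above amounts to a group-by over cell type pair and LR pair. The sketch below shows that idea with plain Python records. The record keys (`source_type`, `target_type`, `l_gene`, `r_gene`, `likelihood`), the function name `average_likelihoods`, and the numbers are hypothetical and do not reflect the actual columns of `ot_data_df`.

```python
from collections import defaultdict
from statistics import mean

def average_likelihoods(records):
    """Average likelihood per (source type, target type, ligand, receptor)."""
    grouped = defaultdict(list)
    for r in records:
        key = (r["source_type"], r["target_type"], r["l_gene"], r["r_gene"])
        grouped[key].append(r["likelihood"])
    return {key: mean(values) for key, values in grouped.items()}

# Hypothetical per-cell-pair likelihoods.
records = [
    {"source_type": "Erythroid", "target_type": "Erythroidpro",
     "l_gene": "Angpt1", "r_gene": "Tek", "likelihood": 0.25},
    {"source_type": "Erythroid", "target_type": "Erythroidpro",
     "l_gene": "Angpt1", "r_gene": "Tek", "likelihood": 0.75},
    {"source_type": "Erythroid", "target_type": "Erythroidpro",
     "l_gene": "Angpt2", "r_gene": "Tek", "likelihood": 0.5},
]
summary = average_likelihoods(records)
for key, value in summary.items():
    print(key, value)
```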

Permutation test

This function supports the permutation test by shuffling expression and randomizing cell coordinates.

coordinates_df, exp_idx = scotia.permutation_test(X_all)
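
One round of such a permutation can be sketched with the standard library as follows. This is an assumption-laden illustration rather than the package's procedure: drawing coordinates uniformly within the bounding box of the observed cells, the function name `permute_cells`, the `seed` parameter, and the third toy coordinate are all hypothetical.

```python
import random

def permute_cells(coords, seed=0):
    """Return randomized coordinates and a shuffled expression index."""
    rng = random.Random(seed)
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    # Randomize positions: draw each cell uniformly inside the bounding box.
    random_coords = [
        (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        for _ in coords
    ]
    # Shuffle expression: cell k takes the expression profile of cell exp_idx[k].
    exp_idx = list(range(len(coords)))
    rng.shuffle(exp_idx)
    return random_coords, exp_idx

coords = [(5.430265, 78.970097), (4.793709, 53.400197), (6.1, 60.0)]
random_coords, exp_idx = permute_cells(coords, seed=42)
print(random_coords, exp_idx)
```

Repeating this many times and re-scoring each permuted dataset yields a null distribution against which the observed interaction scores can be compared.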

Usage/Example

See this notebook for a full tutorial. The example data used in the tutorial are included in the example folder.

Running tests

Tests are written as doctest examples. This package uses xdoctest to run them and to integrate with CI (i.e., pytest). To run the tests, install xdoctest:

pip install xdoctest pygments

Then navigate to the repository root and run:

python -m xdoctest scotia/
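
For readers unfamiliar with doctest-style tests, the sketch below shows what such a test looks like: the example lives in the docstring and the expected output follows the `>>>` line. The function `pairwise_distance` is a made-up example and is not taken from the scotia source.

```python
import doctest

def pairwise_distance(p, q):
    """Euclidean distance between two 2D points.

    >>> pairwise_distance((0.0, 0.0), (3.0, 4.0))
    5.0
    """
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

if __name__ == "__main__":
    # Runs every docstring example in this module and reports failures.
    doctest.testmod()
```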

Citation

Shiau, C., Cao, J., Gregory, M. T., Gong, D., Yin, X., Cho, J. W., Wang, P. L., Su, J., Wang, S., Reeves, J. W., Kim, T. K., Kim, Y., Guo, J. A., Lester, N. A., Schurman, N., Barth, J. L., Weissleder, R., Jacks, T., Qadan, M., Hong, T. S., … Hwang, W. L. (2023). Therapy-associated remodeling of pancreatic cancer revealed by single-cell spatial transcriptomics and optimal transport analysis. bioRxiv: the preprint server for biology, 2023.06.28.546848.