cindyfang70 / xenium-sandbox

playing with xenium data 🗺️

Label transfer papers for SRT #7

Open cindyfang70 opened 9 months ago

cindyfang70 commented 9 months ago

This issue will be used to keep track of existing methods that transfer labels from a reference dataset to a target SRT dataset.

  1. Spatial-ID: scRNA-seq --> MERFISH, Slide-seq, CosMx SMI, and Stereo-seq. Transfers cell-type labels, and can also detect novel classes in the query dataset.
  2. CellDART: scRNA-seq --> Visium/Slide-seq. Creates pseudospots from the scRNA-seq data and transfers the cell-type proportions within those pseudospots to the spots in the Visium/Slide-seq data.

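To make the pseudospot idea in item 2 concrete, here is a minimal sketch, not CellDART's actual implementation. It assumes a pseudospot is just the summed counts of `k` uniformly sampled reference cells, with the known cell-type proportions of that sample kept as the training target. The function name `make_pseudospot`, the toy count matrix, and the labels are all hypothetical.

```python
import random

def make_pseudospot(cells, labels, cell_types, k, rng):
    """Sum the counts of k randomly sampled cells and record their type proportions."""
    idx = rng.sample(range(len(cells)), k)
    n_genes = len(cells[0])
    counts = [sum(cells[i][g] for i in idx) for g in range(n_genes)]
    props = [sum(labels[i] == t for i in idx) / k for t in cell_types]
    return counts, props

# Toy scRNA-seq reference: 6 cells x 3 genes, with hypothetical type labels.
cells = [[5, 0, 1], [4, 1, 0], [0, 6, 1], [1, 5, 0], [0, 1, 7], [1, 0, 6]]
labels = ["A", "A", "B", "B", "C", "C"]
cell_types = ["A", "B", "C"]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
pseudospots = [make_pseudospot(cells, labels, cell_types, k=3, rng=rng)
               for _ in range(4)]

for counts, props in pseudospots:
    print(counts, props)
```

A model trained to predict `props` from `counts` on pseudospots like these could then be applied to real spots, which is the general shape of the proportion transfer described above.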
cindyfang70 commented 9 months ago

Automated methods for cell type annotation on scRNA-seq data: A review article on cell-type annotation in scRNA-seq data. It groups methods into three main categories: marker gene database-based, correlation-based, and supervised classification-based.

I thought their Fig. 1 was a nice illustration of the different methods. Maybe we can take some inspiration from it when we make our own figures :)

One interesting approach they discussed is clustifyr, a correlation-based cell-type annotation method. It first clusters the cells in the target dataset, then computes the correlation between each unlabelled query cluster and each labelled reference cluster. So even though it's single-cell --> single-cell label transfer, the transfer happens at the level of clusters rather than individual cells. An approach like this would let us leverage good data-driven clustering results (like Banksy) together with existing cell atlases.
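Here is a rough sketch of what cluster-level correlation transfer could look like. It is not clustifyr's code or API. It assumes each query cluster is summarised by its mean expression profile and scored against reference profiles with Pearson correlation; the helper names, the toy expression values, and the reference labels are all made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def centroid(cells):
    """Mean expression per gene across the cells of one cluster."""
    return [sum(gene) / len(gene) for gene in zip(*cells)]

def annotate_clusters(query_clusters, reference_profiles):
    """Give each query cluster the reference label it correlates with most."""
    result = {}
    for cluster_id, cells in query_clusters.items():
        profile = centroid(cells)
        scores = {label: pearson(profile, ref)
                  for label, ref in reference_profiles.items()}
        result[cluster_id] = max(scores, key=scores.get)
    return result

# Hypothetical reference atlas: one average profile (4 genes) per cell type.
reference_profiles = {
    "Excitatory": [9, 1, 0, 2],
    "Astrocyte": [0, 8, 1, 1],
    "Oligo": [1, 0, 9, 2],
}
# Hypothetical query clusters, e.g. from a spatial clustering method.
query_clusters = {
    "0": [[8, 0, 1, 2], [10, 2, 0, 1]],
    "1": [[1, 1, 7, 2], [0, 0, 10, 3]],
}

cluster_labels = annotate_clusters(query_clusters, reference_profiles)
print(cluster_labels)
```

Every cell then inherits the label of its cluster, so the quality of the transfer depends directly on the quality of the clustering.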

cindyfang70 commented 9 months ago

STEM enables mapping of single-cell and spatial transcriptomics data with transfer learning: A deep transfer learning method that transfers labels from single-cell --> spatial transcriptomics, and spatial information from spatial transcriptomics --> single-cell.

The spatial transcriptomics (ST) and single-cell (SC) data are first embedded into the same latent space using an encoder. The embeddings are then used to reconstruct a spatial adjacency matrix for each of the two modalities. Since there is a ground-truth adjacency matrix for the ST data, a cross-entropy loss is computed for each of the two reconstructed adjacency matrices during training. The ST-ST adjacency matrix is the correlation between spatial units, computed from their embeddings. The SC-ST adjacency matrix is the correlation between the embeddings of cells in SC and spots in ST, and can be thought of as a spatial alignment of the SC data to the ST data.
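The adjacency reconstruction can be sketched as follows. This is an illustration under several assumptions and not STEM's implementation: the encoder is skipped and the embeddings are fixed toy vectors, dot-product similarity followed by a row-wise softmax stands in for the correlation described above, and the ground-truth ST adjacency is a hand-written row-normalised matrix. Only the ST-ST loss term is shown, because the target for the second term is not spelled out in these notes.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of similarities."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def reconstruct_adjacency(emb_a, emb_b):
    """Row-normalised similarity of every row in emb_a to every row in emb_b."""
    return [softmax([sum(x * y for x, y in zip(a, b)) for b in emb_b])
            for a in emb_a]

def cross_entropy(target, predicted):
    """Mean row-wise cross-entropy between two row-stochastic matrices."""
    rows = [-sum(t * math.log(p) for t, p in zip(t_row, p_row))
            for t_row, p_row in zip(target, predicted)]
    return sum(rows) / len(rows)

# Toy embeddings (assumed encoder output): 3 ST spots and 4 SC cells in 2-D.
st_emb = [[1.0, 0.0], [0.9, 0.2], [0.0, 1.0]]
sc_emb = [[1.1, 0.1], [0.1, 0.9], [0.8, 0.1], [0.0, 1.2]]

# Assumed ground truth: spots 0 and 1 are neighbours, spot 2 is isolated.
st_truth = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]

st_st = reconstruct_adjacency(st_emb, st_emb)  # spots x spots
sc_st = reconstruct_adjacency(sc_emb, st_emb)  # cells x spots
loss = cross_entropy(st_truth, st_st)
print(loss)
```

In a real training loop the embeddings would come from the shared encoder, and minimising `loss` would push spots that are spatial neighbours toward similar embeddings.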

Because the SC-ST adjacency matrix spatially aligns the SC data to the ST data, it can be used to transfer labels from SC to ST. For each spot in the ST data, the authors used the adjacency matrix to identify which SC cells the spot overlapped, and assigned those cells' types to the spot.
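One way this transfer step could look in code is sketched below. The rule used here is an assumption for illustration, not necessarily what the STEM authors did: each cell is placed in its highest-weight spot, and each spot takes the majority type of the cells placed in it. The function name `transfer_labels`, the toy matrix, and the cell types are hypothetical.

```python
from collections import Counter, defaultdict

def transfer_labels(sc_st, cell_types, n_spots):
    """Label each spot with the majority type of the cells mapped to it."""
    types_per_spot = defaultdict(list)
    for cell, weights in enumerate(sc_st):
        # Place the cell in the spot where its mapping weight is largest.
        best_spot = max(range(n_spots), key=weights.__getitem__)
        types_per_spot[best_spot].append(cell_types[cell])

    spot_labels = []
    for spot in range(n_spots):
        types = types_per_spot.get(spot)
        # Spots that received no cells stay unlabelled (None).
        spot_labels.append(Counter(types).most_common(1)[0][0] if types else None)
    return spot_labels

# Toy SC-ST matrix: 5 cells (rows) x 3 spots (columns).
sc_st = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
]
cell_types = ["Neuron", "Neuron", "Astro", "Astro", "Astro"]

spot_labels = transfer_labels(sc_st, cell_types, n_spots=3)
print(spot_labels)
```

For multi-cell spots such as Visium, keeping the full type counts per spot instead of only the majority label would give cell-type proportions rather than a single label.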