theMILOlab / SPATA2

A Toolbox for Spatial Transcriptomics Analysis
https://themilolab.github.io/SPATA2/

number of features decreased when using transformSeuratToSpata #54

Open rhlgyb opened 1 year ago

rhlgyb commented 1 year ago

Dear developers of SPATA2,

I am impressed by the convenience of SPATA2 in panning spatial regions of interest and the functionalities it offers! 👍 I do have a question, though. I've just started to transform my Seurat objects into SPATA objects for drawing spatial trajectories and found that the transformed SPATA object doesn't have the same dimensions (spots X genes) as the original Seurat object. The number of spots remains unchanged; however, the number of genes always seems to have decreased. Are some of the genes filtered out in the transformation process and, if so, by what criteria?

For example :

original Seurat object -> 4519 spots X 17943 genes
transformed SPATA object -> 4519 spots X 17767 genes

Here is the code:

sp <- transformSeuratToSpata(ST_list[[1]], sample_name = 'FT1_FA', method = 'spatial', image_name = 'FT1_FA', assay_slot = 'data', coords_from = 'umap')

sp <- transformSeuratToSpata(ST_list[[1]], sample_name = 'FT1_FA', method = 'spatial', image_name = 'FT1_FA')

On both occasions, there wasn't much difference in the results.

Many thanks,

R

kueckelj commented 1 year ago

Hello rhlgyb, thank you for your kind words. Can you show me the code with which you extracted the genes from the Seurat object and the genes from the SPATA2 object? And can you post the first 20 genes of the vector of genes that are missing in the SPATA2 object, e.g. using head(missing_genes_in_spata2, 20)?
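A minimal sketch of how such a vector could be built, using hypothetical toy gene vectors in place of the real objects (the names `seurat_genes` and `spata2_genes` are assumptions for illustration; in practice they would be the row names of the respective count matrices):

```r
# Hypothetical gene names standing in for rownames() of the two objects.
seurat_genes <- c("A1BG", "ACTB", "GAPDH", "MT-CO1", "TP53")
spata2_genes <- c("ACTB", "GAPDH", "TP53")

# Genes present in the Seurat object but absent from the SPATA2 object.
missing_genes_in_spata2 <- setdiff(seurat_genes, spata2_genes)

# Show at most the first 20 of them.
head(missing_genes_in_spata2, 20)
```

`setdiff()` compares by name, so it does not depend on the two vectors being in the same order.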

rhlgyb commented 1 year ago

Thank you for responding!

Of course.

Code for extracting genes from my Seurat object: orig_genes <- rownames(ST_list[[8]])

Extract genes from the SPATA object: sp_genes <- rownames(sp@data$FT8_FC_WI$counts)

Obtain the list of excluded genes: excluded <- setdiff(orig_genes, sp_genes)

And finally, the first 20 excluded genes were the following: [image]

In this case, the number of genes decreased by 126, from 17943 -> 17817 genes.

kueckelj commented 1 year ago

Hello rhlgyb, can you confirm that the genes that are removed actually have at least one count e.g. with rowSums()? SPATA2 removes genes that were not found at all.
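A minimal sketch of that check, assuming a toy genes-by-spots count matrix and a hypothetical `excluded` vector (neither comes from the thread; they only illustrate the idea):

```r
# Toy count matrix standing in for the Seurat counts (genes x spots).
counts <- matrix(
  c(0, 0, 0,
    2, 0, 1,
    0, 0, 0,
    0, 5, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  paste0("spot", 1:3))
)

# Hypothetical vector of genes that went missing after the conversion.
excluded <- c("geneA", "geneC", "geneD")

# Total counts per excluded gene across all spots.
excluded_totals <- rowSums(counts[excluded, , drop = FALSE])

# A total of zero is consistent with the removal of genes that were never
# detected; anything above zero would need another explanation.
expected_drops    <- names(excluded_totals)[excluded_totals == 0]
unexplained_drops <- names(excluded_totals)[excluded_totals > 0]
```

On a real Seurat object the counts are usually stored as a sparse matrix, in which case `Matrix::rowSums()` is the appropriate function.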

mairocaille commented 9 months ago

Dear SPATA developers and maintainers,

I stumbled upon the same problem of missing genes when comparing Seurat and SPATA results.

In my case, I lose 120 genes that in the Seurat object DO have counts, see code below.

I would greatly appreciate it if you could help me understand where the difference originates.

Thanks in advance, Maialen

===========================================================================================

# Get SPATA count matrices and genes
exp_mat <- SPATA2::getExpressionMatrix(spata_down)
count_mat <- SPATA2::getCountMatrix(spata_down)
n_spata_genes <- nGenes(spata_down)
spata_genes <- rownames(exp_mat)

# SPATA removes genes with 0 counts
sum(rowSums(exp_mat) == 0)  # Returns 0

# Genes and number of genes in the Seurat object
# (defined first because the steps below depend on them)
seurat_genes <- rownames(spseu@assays$Spatial$counts)
n_seurat_genes <- nrow(spseu@assays$Spatial$counts)

# But Seurat does not! So the following genes are excluded by SPATA because
# they are genes with zero counts (these remain in Seurat)
seurat_genes_zero_counts <- seurat_genes[rowSums(spseu@assays$Spatial$counts) == 0]
n_seurat_genes_zero_counts <- length(seurat_genes_zero_counts)  # 11977 genes with zero counts

# Get Seurat genes that have counts
seurat_genes_with_counts <- seurat_genes[rowSums(spseu@assays$Spatial$counts) != 0]

# All SPATA genes are present in Seurat
# (sum() counts the matches; length() would only return the vector length)
n_spata_genes == sum(spata_genes %in% seurat_genes_with_counts) # Returns TRUE
identical(intersect(spata_genes, seurat_genes_with_counts), spata_genes) # Returns TRUE

# However, the sum of the genes with zero counts and the SPATA genes does not
# return the number of genes in the Seurat object
n_spata_genes + n_seurat_genes_zero_counts == n_seurat_genes  # Returns FALSE

# Define function to calculate elements in one vector not in the other and vice versa
# https://tonybreyal.wordpress.com/2011/11/29/outersect-the-opposite-of-rs-intersect-function/
outersect <- function(x, y) {
  sort(c(setdiff(x, y),
         setdiff(y, x)))
}

# Apply function to get genes that have counts in Seurat but are not present in SPATA
# (all SPATA genes are in Seurat, so the symmetric difference equals the one-sided one)
seurat_genes_with_counts_not_in_spata <- outersect(seurat_genes_with_counts, spata_genes)
n_seurat_genes_with_counts_not_in_spata <- length(seurat_genes_with_counts_not_in_spata) # 120 genes

# What are those 120 genes not in SPATA?
spseu@assays$Spatial$counts[seurat_genes_with_counts_not_in_spata, ]

[image]
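The bookkeeping behind this comparison can be sketched on toy data: every Seurat gene is either kept by SPATA, dropped with zero counts, or dropped for an unexplained reason, and the three groups must add up to the Seurat total. The vectors below are hypothetical stand-ins, not values from the objects above.

```r
# Hypothetical gene vectors standing in for the real objects.
seurat_genes     <- c("g1", "g2", "g3", "g4", "g5", "g6")
zero_count_genes <- c("g2", "g5")        # genes whose row sums are zero
spata_genes      <- c("g1", "g3", "g6")  # genes kept after conversion

# Assign every Seurat gene to exactly one category.
status <- ifelse(
  seurat_genes %in% spata_genes, "kept",
  ifelse(seurat_genes %in% zero_count_genes,
         "dropped_zero", "dropped_unexplained")
)

# Tally the categories; fixed levels keep empty groups visible as 0.
tally <- table(factor(
  status,
  levels = c("kept", "dropped_zero", "dropped_unexplained")
))

# Genes that have counts but still did not make it into SPATA.
unexplained <- seurat_genes[status == "dropped_unexplained"]
```

In the case reported above, the "dropped_unexplained" group would correspond to the 120 genes in question.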