YMa-lab / CARD

GNU General Public License v3.0

Spots were removed after deconvolution #17

Closed by leihouyeung 2 years ago

leihouyeung commented 2 years ago

Dear @YingMa0107, thanks for your work! When I used CARD for deconvolution, I found that some spots were removed from the deconvolution results. How can I keep these deleted spots? I also tried setting 'minCountGene' and 'minCountSpot' to 0, but several spots were still deleted.

YingMa0107 commented 2 years ago

Hi @leihouyeung,

Thanks for raising the question!

Even when you set minCountGene or minCountSpot to 0, a spot is still deleted if it has zero counts on all informative genes. CARD keeps only the genes that are common to the spatial and scRNA-seq data, and then selects the informative genes among them. Spots with zero counts across all of these informative genes are excluded, because their cell type compositions cannot be estimated.
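The exclusion rule described above can be illustrated on a toy matrix. This is only a minimal sketch in base R, not CARD's internal code: `informative_genes` is a hypothetical stand-in for the gene set CARD retains after intersecting with the scRNA-seq data and selecting informative genes, and the counts are made up for illustration.

```r
# Toy spatial count matrix: genes in rows, spots in columns.
spatial_count <- matrix(
  c(3, 0, 1,
    0, 0, 2,
    5, 0, 0,
    0, 4, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"),
                  c("spot1", "spot2", "spot3"))
)

# Hypothetical stand-in for the informative genes (an assumption for this
# sketch; the real set is determined inside CARD).
informative_genes <- c("GeneA", "GeneB", "GeneC")

# Total counts per spot, restricted to the informative genes.
informative_totals <- colSums(spatial_count[informative_genes, , drop = FALSE])

# Spots with zero counts on every informative gene would be excluded.
dropped_spots <- names(informative_totals)[informative_totals == 0]
print(dropped_spots)
```

Note that `spot2` has non-zero counts overall (on `GeneD`), so a count threshold of 0 does not remove it, yet it carries no signal on the genes used for estimation.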

fkcoolman commented 2 years ago

I am having the same problem. @YingMa0107, could you please include the deleted spots after deconvolution and set their cell type fractions to 0 or something similar, so that it is easier to incorporate the deconvolution results into other tools like Seurat?

Thanks, Kai

YingMa0107 commented 2 years ago

Hi @fkcoolman,

Thank you very much for your kind suggestion! We don't do this because it might mislead users into thinking that the all-zero spots were directly estimated by the CARD model. Arbitrarily filling in zeros for these spots might also affect the downstream analysis, depending on the analytic task. If you really want to add them to simplify your analysis pipeline, you can do something like the following:

# Cell type proportions estimated by CARD (rows = spots, columns = cell types)
CARD_prop <- CARD_obj@Proportion_CARD
# Spots in the original spatial count matrix that are missing from the CARD results
AddSpots <- colnames(spatial_count)[!(colnames(spatial_count) %in% rownames(CARD_prop))]
# All-zero placeholder rows for the missing spots
AddSpotsDF <- as.data.frame(matrix(0, nrow = length(AddSpots), ncol = ncol(CARD_prop)))
rownames(AddSpotsDF) <- AddSpots
colnames(AddSpotsDF) <- colnames(CARD_prop)
CARD_prop_All <- rbind(CARD_prop, AddSpotsDF)
# Map the rows back to the spot order of the spatial count data
CARD_prop_All <- CARD_prop_All[match(colnames(spatial_count), rownames(CARD_prop_All)), ]
# Check the dimensions: this should print TRUE
print(nrow(CARD_prop_All) == ncol(spatial_count))
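If the concern is that zero-filled rows could be mistaken for real estimates, one possible variation is to pad with NA and keep a flag recording which spots were actually estimated. The following is only a sketch on toy data, not part of CARD: `CARD_prop` here is a small stand-in for `CARD_obj@Proportion_CARD`, `all_spots` stands in for `colnames(spatial_count)`, and the names `estimated` and `CARD_prop_zero` are made up for illustration.

```r
# Toy stand-in for CARD_obj@Proportion_CARD (rows = spots, columns = cell types).
CARD_prop <- matrix(
  c(0.7, 0.3,
    0.2, 0.8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("spot1", "spot3"), c("TypeA", "TypeB"))
)
# Toy stand-in for colnames(spatial_count): all spots in their original order.
all_spots <- c("spot1", "spot2", "spot3")

# Look up each spot in the CARD results; spots without a match give NA.
idx <- match(all_spots, rownames(CARD_prop))

# Indexing with NA yields an all-NA row, so missing spots stay visibly missing.
CARD_prop_All <- CARD_prop[idx, , drop = FALSE]
rownames(CARD_prop_All) <- all_spots

# Flag recording which spots have proportions estimated by the model.
estimated <- !is.na(idx)
names(estimated) <- all_spots

# Optional: a zero-filled copy for tools that cannot handle NA values.
CARD_prop_zero <- CARD_prop_All
CARD_prop_zero[!estimated, ] <- 0

print(CARD_prop_All)
print(estimated)
```

Keeping `estimated` next to the padded matrix makes it possible to filter the placeholder spots out again before any analysis that should use only model-based estimates.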

Hope this helps! Thanks!

Best, Ying

fkcoolman commented 2 years ago

Hi Ying,

Thanks for the explanation. It makes sense not to include the low-quality spots. I will try your code to update the spot proportions.

Thanks, Kai