atarashansky / SAMap

SAMap: Mapping single-cell RNA sequencing datasets from evolutionarily distant organisms.
MIT License

Clarifying expression overlap plots #109

Closed dkeitley closed 1 year ago

dkeitley commented 1 year ago

Hi!

I just wanted to clarify what's plotted in the plot_expression_overlap function.

I have integrated two species and am looking at the expression of a particular gene. When I plot the overlap in expression, I see a region of combined expression in green.

However, when I plot the expression of the genes using the standard sc.pl.umap function, there doesn't appear to be any expression in that region of the embedding for species 1 (left).

Does this mean that plot_expression_overlap is plotting 'imputed values' (calculated using the homology graph etc.)? Would I be right in thinking that the combined expression in green is actually Species 2 cells that have high imputed values for both the Species 1 and Species 2 genes? Or am I misinterpreting this?

Many thanks,

Dan

atarashansky commented 1 year ago

Yes, something like that! The expression overlap imputes each gene's expression across the combined manifold: Rx expression in species 2 gets "leaked" into species 1 cells over the cross-species edges, and the same happens in the other direction. Each cell then takes the minimum of the two directions. Cells are colored green if that imputed expression is above a certain threshold (a low value, thr=0.1, by default). You can fiddle with the thr parameter in plot_expression_overlap to filter out the noisy overlap you observed. Also note that the dots for overlapping cells are drawn bigger, which can make sparse expression look more significant than it is, so you can play with that parameter as well (sc=10).
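To make the logic concrete, here is a toy sketch of the idea in plain Python. It is not SAMap's actual code: the function names, the 4-cell graph, and the row-normalised weight matrix are all assumptions made up for illustration. It only shows the three steps described above (smooth each gene over the combined graph, take the per-cell minimum, threshold at thr).

```python
# Toy illustration only; NOT SAMap's implementation. All names are hypothetical.

def smooth(weights, values):
    """Impute a value per cell as the weighted average over its graph neighbours.

    `weights` is assumed to be a row-normalised adjacency matrix (rows sum to 1).
    """
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def overlap(weights, expr_a, expr_b, thr=0.1):
    """Return the per-cell overlap score and a mask of cells above `thr`."""
    imputed_a = smooth(weights, expr_a)  # gene A "leaks" along graph edges
    imputed_b = smooth(weights, expr_b)  # gene B "leaks" along graph edges
    combined = [min(a, b) for a, b in zip(imputed_a, imputed_b)]
    return combined, [score > thr for score in combined]


# Cells 0-1 belong to species 1, cells 2-3 to species 2.
# Cells 0 and 2 share a cross-species edge; cells 1 and 3 only see themselves.
weights = [
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
gene_sp1 = [1.0, 1.0, 0.0, 0.0]  # measured only in species 1 cells
gene_sp2 = [0.0, 0.0, 1.0, 1.0]  # measured only in species 2 cells

combined, mask = overlap(weights, gene_sp1, gene_sp2)
print(combined)  # [0.5, 0.0, 0.5, 0.0]
print(mask)      # [True, False, True, False]
```

In this toy case cell 2 is a species 2 cell with no measured expression of the species 1 gene, yet it is flagged as overlapping because of its cross-species edge to cell 0. Raising thr above 0.5 would remove both flagged cells.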

atarashansky commented 1 year ago

Please reopen if you still have questions or concerns :) Closing for now!