atarashansky / SAMap

SAMap: Mapping single-cell RNA sequencing datasets from evolutionarily distant organisms.
MIT License

Understanding the block diagonal gene expression matrix #56

Closed jasminelmah closed 3 years ago

jasminelmah commented 3 years ago

Thanks for developing SAMap! It is such a valuable tool for cross-species comparisons and I'm really excited to use it.

I was reading your SAMap vignette, which has been very helpful! Would you be able to give me a little more insight into how to interpret the block diagonal matrix of each species' gene expressions (stored in samap.adata)? Are these corrected gene expression values that would allow me to directly compare the expression level of one ortholog in a particular species to that of another ortholog in another species (in a way "batch-correcting" at the species level)? If so, is there a way to obtain this for all genes in the original scRNAseq expression matrix? Thanks for your help!

atarashansky commented 3 years ago

Hi! I’ll give you a more thorough answer once I get back to my computer, but the short of it is that the block diagonal expression matrix just contains the original expression values found in your separate AnnData objects. If you’d like to see the overlap in expression between two genes, there’s a function called “plot_expression_overlap”. If you’d like a score measuring the similarity in expression between two genes from different species, that information is stored in the refined homology graph. Currently, I don’t have a function to easily extract that information, but I can make one for you as soon as I’m back! Stay tuned.
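To make the "block diagonal" layout concrete, here is a minimal sketch using plain NumPy rather than SAMap itself. The two toy matrices, the gene names, and the `ze_` species prefix are made up for illustration; the point is only that each species' values are copied unchanged into its own block, with zeros everywhere else.

```python
import numpy as np

# Toy expression matrices (cells x genes). All values are invented for illustration.
hu = np.array([[1.0, 0.0, 2.0],
               [0.0, 3.0, 1.0]])   # 2 "human" cells x 3 "human" genes
ze = np.array([[4.0, 0.0],
               [0.0, 5.0],
               [1.0, 1.0]])        # 3 cells x 2 genes from a second, hypothetical species


def block_diagonal(a, b):
    """Place a and b on the diagonal of one matrix; off-diagonal blocks stay zero."""
    out = np.zeros((a.shape[0] + b.shape[0], a.shape[1] + b.shape[1]), dtype=a.dtype)
    out[:a.shape[0], :a.shape[1]] = a   # species 1 cells x species 1 genes
    out[a.shape[0]:, a.shape[1]:] = b   # species 2 cells x species 2 genes
    return out


combined = block_diagonal(hu, ze)

# Column labels carry a species prefix so genes from different species never collide.
genes = ["hu_A", "hu_B", "hu_C", "ze_a", "ze_b"]
```

In this toy layout a cell from one species has zeros in every column belonging to the other species, so reading two columns side by side does not by itself give a cross-species comparison of expression levels.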

atarashansky commented 3 years ago

Hello! Thanks for your patience.

In samap==0.3.2, I added two new functions to determine the expression correlations between two genes. Check out the Querying gene mappings section in SAMap_vignette.ipynb.

Two new functions:

sm.query_gene_pairs("gene id") <-- returns all genes connected to gene id

sm.query_gene_pair("gene id 1", "gene id 2") <-- returns edge weight between gene id 1 and gene id 2.

Preferably, the gene IDs are prepended with their species IDs. So, for example, hu_SOX2 instead of SOX2.
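As a rough illustration of what these two kinds of query mean on a weighted gene graph, here is a self-contained toy sketch. It does not call SAMap: the `edges` dictionary, the helper names `toy_query_gene_pairs` and `toy_query_gene_pair`, the `ze_` prefix, and all weights are hypothetical stand-ins, and the return types of the real functions may differ.

```python
# Toy homology graph: undirected, weighted edges between species-prefixed gene IDs.
# The weights are invented numbers, not SAMap output.
edges = {
    ("hu_SOX2", "ze_sox2"): 0.82,
    ("hu_SOX2", "ze_sox3"): 0.41,
    ("hu_PAX6", "ze_pax6a"): 0.77,
}


def toy_query_gene_pairs(gene):
    """Return {neighbor: weight} for every gene connected to `gene`."""
    neighbors = {}
    for (a, b), weight in edges.items():
        if a == gene:
            neighbors[b] = weight
        elif b == gene:
            neighbors[a] = weight
    return neighbors


def toy_query_gene_pair(gene1, gene2):
    """Return the edge weight between two genes, or 0.0 if they are not connected."""
    if (gene1, gene2) in edges:
        return edges[(gene1, gene2)]
    return edges.get((gene2, gene1), 0.0)


all_partners = toy_query_gene_pairs("hu_SOX2")
one_weight = toy_query_gene_pair("ze_pax6a", "hu_PAX6")
```

The species prefix matters in a lookup like this because the same gene symbol can exist in both species; a bare "SOX2" would be ambiguous, while "hu_SOX2" identifies exactly one node.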

Let me know if this is helpful for you or if I didn't properly address your question.

jasminelmah commented 3 years ago

Hi!

Wow, thank you very much! These new functions will be very helpful and I look forward to trying them out. I really appreciate the time and effort you put into this.
