theislab / destiny

R package for single cell and other data analysis using diffusion maps
https://theislab.github.io/destiny/
GNU General Public License v3.0

How do I identify genes that drive the branching process in diffusionmap? #13

Open MJ-Yang opened 5 years ago

MJ-Yang commented 5 years ago

Applying your wonderful functions to my dataset results in very interesting 3d structures.

I am curious about genes that appear/disappear during the branching processes or genes that change across pseudo-time (calculated through DPT).

Can you suggest ideas for how to do this kind of analysis?

I appreciate your beautiful work! Thx!

flying-sheep commented 5 years ago

Thank you for the kind words. There’s nothing built in, but a simple approach is to follow the example for these methods: https://theislab.github.io/destiny/reference/DPT-matrix-methods.html

Alternatively, you can use kbranches to isolate the branches and run the same analysis on each of them. I hope this is helpful!
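As a rough illustration of the per-branch idea, here is a minimal base R sketch. Everything in it is simulated and hypothetical: `expr` stands in for an expression matrix with genes in rows and cells in columns, `pseudotime` for a per-cell pseudotime vector, and `branch` for per-cell branch labels obtained by whatever branch assignment you use. It does not call destiny or kbranches; it only shows correlating each gene with pseudotime separately within each branch.

```r
# Simulated stand-ins (hypothetical names, not destiny or kbranches objects)
set.seed(2)
n_cells <- 300
pseudotime <- runif(n_cells)
branch <- factor(sample(c('A', 'B'), n_cells, replace = TRUE))
expr <- rbind(
  shared    = pseudotime + rnorm(n_cells, sd = 0.1),  # changes in both branches
  a_only    = ifelse(branch == 'A', pseudotime, 0) +
              rnorm(n_cells, sd = 0.1),               # changes in branch A only
  unrelated = rnorm(n_cells)                          # no pseudotime trend
)

# Spearman correlation of every gene with pseudotime, one column per branch
branch_corrs <- sapply(levels(branch), function(b) {
  cells <- branch == b
  apply(expr[, cells, drop = FALSE], 1, function(gene)
    cor(gene, pseudotime[cells], method = 'spearman'))
})
print(round(branch_corrs, 2))
```

Genes with a strong correlation in one column and a weak one in the other (like `a_only` here) would be candidates for branch-specific behaviour.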

MJ-Yang commented 5 years ago

Thank you for your kind suggestions. However, the example you pointed to seems to show the variation of a known gene (Dppa1) that is important for that differentiation process. In my case, only a few such genes are known, so I'm not quite sure how to identify the important ones. I may not have fully understood your answer. Could you give me a little more detail on how to do this? It would be really helpful! Thanks!

flying-sheep commented 5 years ago

You can of course find the genes with the maximum correlation to the pseudotime using apply (this assumes mydata has genes in rows and cells in columns):

> root <- random_root(dpt)
> corrs <- apply(mydata, 1, function(gene) abs(cor(gene, dpt[root, ], method = 'spearman')))
> sort(corrs)
     Tcfap2c        Sox13        Gata6        Sox17        Runx1        Gapdh 
0.0009383148 0.0809590089 0.1005298024 0.1022802449 0.1287270821 0.1609465741 
...
       Snail       Pecam1        Gata4          Fn1          Id2        Gata3 
0.6732298133 0.6996364156 0.7087127946 0.7174012356 0.7268814714 0.7467974394 

You can, e.g., use names(which.max(corrs)) to find the gene with the strongest correlation (in this case Gata3).
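If you want to know the direction of the change and have some measure of significance rather than only the absolute correlation, one possible extension is sketched below in base R. The data are simulated and the names are hypothetical: `expr` plays the role of `mydata` (genes in rows) and `pseudotime` the role of `dpt[root, ]`. The use of `cor.test` with Benjamini-Hochberg correction is an assumption about what would be useful here, not something destiny provides.

```r
# Simulated stand-ins (hypothetical): replace with your own matrix and DPT vector
set.seed(1)
n_cells <- 200
pseudotime <- runif(n_cells)
expr <- rbind(
  up    = pseudotime + rnorm(n_cells, sd = 0.1),   # rises along pseudotime
  down  = -pseudotime + rnorm(n_cells, sd = 0.1),  # falls along pseudotime
  noise = rnorm(n_cells)                           # unrelated to pseudotime
)

# Signed Spearman correlation and its p-value for every gene
stats <- t(apply(expr, 1, function(gene) {
  test <- cor.test(gene, pseudotime, method = 'spearman', exact = FALSE)
  c(rho = unname(test$estimate), p = test$p.value)
}))

# Adjust p-values across genes and rank by absolute correlation
ranked <- data.frame(stats, padj = p.adjust(stats[, 'p'], method = 'BH'))
ranked <- ranked[order(-abs(ranked$rho)), ]
print(ranked)
```

The sign of `rho` then tells you whether a gene tends to increase or decrease along pseudotime, and `padj` gives a rough filter when many genes are tested at once.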