aristoteleo / dynamo-release

Inclusive model of expression dynamics with conventional or metabolic labeling based scRNA-seq / multiomics, vector field reconstruction and differential geometry analyses
https://dynamo-release.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

How to interpret the vectors and LAP beyond the embedding of UMAP? #387

Closed. hyjforesight closed this issue 1 year ago

hyjforesight commented 1 year ago

Hello Dynamo, I understand that a vector field is a continuous function that describes cell fate in PCA or UMAP space. But how should we interpret the vectors that point outside of the UMAP embedding? If cells are in the state represented by fixed point 15, where do these cells transition to along the vectors? I know they finally go to the green cluster, but what does it mean that they go outside the embedding during this transition? [image]

I understand that the LAP is computed in PCA space and projected back to UMAP space. However, how should we interpret the LAPs that fall outside of the UMAP embedding? [image]

Thanks! Best, YJ

Xiaojieqiu commented 1 year ago

Thanks for the great questions! Here are my answers:

> But how should we interpret the vectors that point outside of the UMAP embedding? If cells are in the state represented by fixed point 15, where do these cells transition to along the vectors? I know they finally go to the green cluster, but what does it mean that they go outside the embedding during this transition?

The vector field is the learning algorithm's best guess of the entire dynamical system. So if you find vectors pointing out of regions with sampled single cells, the algorithm predicts that the associated cell state is unstable and has a low chance of transitioning into regions with sampled cells. This may indicate that your data do not contain enough cells to cover the trajectory of this particular cell.
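One rough way to tell whether a vector or path point sits in such an unsampled region is to compare its distance to the nearest sampled cell with the typical spacing between cells. The sketch below is a generic heuristic in plain Python, not part of Dynamo; the function names and the `factor=3.0` cutoff are assumptions made for illustration.

```python
import math

def nearest_distance(point, cells):
    """Euclidean distance from `point` to the closest sampled cell."""
    return min(math.dist(point, c) for c in cells)

def typical_spacing(cells):
    """Median nearest-neighbor distance among the sampled cells."""
    nn = sorted(
        min(math.dist(c, o) for j, o in enumerate(cells) if j != i)
        for i, c in enumerate(cells)
    )
    mid = len(nn) // 2
    return nn[mid] if len(nn) % 2 else 0.5 * (nn[mid - 1] + nn[mid])

def is_extrapolated(point, cells, factor=3.0):
    """True if `point` is far from every sampled cell (hypothetical cutoff)."""
    return nearest_distance(point, cells) > factor * typical_spacing(cells)

# Toy embedding: a 5 x 5 grid of "cells" with unit spacing.
cells = [(float(x), float(y)) for x in range(5) for y in range(5)]

inside = (2.3, 1.7)    # surrounded by sampled cells
outside = (9.0, 9.0)   # well beyond the sampled region
print(is_extrapolated(inside, cells), is_extrapolated(outside, cells))
```

Predictions at points flagged this way rest on extrapolation only, so they deserve more caution than predictions inside the sampled region.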

In your specific case, fixed point 15 is a repulsor, so cells will move away from it to become other cell states. Dynamo's vector field predicts that they may move toward fixed points 19 or 12. In general, however, such predictions are only suggestive, as there are infinitely many possible transitions in regions that have no sampled cells or are far from the sampled regions.
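For readers unfamiliar with the term, the standard dynamical-systems criterion classifies a fixed point by the signs of the real parts of the Jacobian eigenvalues at that point. The following is a minimal sketch for a 2D vector field; it is not Dynamo's code, and the function names and example Jacobians are made up for illustration.

```python
import cmath

def eigenvalues_2x2(J):
    """Eigenvalues of a 2x2 matrix [[a, b], [c, d]] from trace and determinant."""
    (a, b), (c, d) = J
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def classify_fixed_point(J, tol=1e-9):
    """Classify a fixed point from the Jacobian J evaluated at that point."""
    re = [ev.real for ev in eigenvalues_2x2(J)]
    if all(r < -tol for r in re):
        return "attractor"      # nearby states flow toward the point
    if all(r > tol for r in re):
        return "repulsor"       # nearby states flow away from the point
    if any(r > tol for r in re) and any(r < -tol for r in re):
        return "saddle"         # attracting in one direction, repelling in another
    return "undetermined"       # a real part is (numerically) zero

print(classify_fixed_point([[1.0, 0.0], [0.0, 2.0]]))
print(classify_fixed_point([[-1.0, 0.0], [0.0, -2.0]]))
print(classify_fixed_point([[1.0, 0.0], [0.0, -1.0]]))
```

A repulsor only says that cells leave the neighborhood of the point. Where they end up depends on the rest of the field, which is why the destination is less certain in unsampled regions.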

> I understand that the LAP is computed in PCA space and projected back to UMAP space. However, how should we interpret the LAPs that fall outside of the UMAP embedding?

The LAP can be computed in PCA space and then projected to UMAP space when you use a vector field learned in PCA space. If you use a vector field learned in UMAP space, you can also predict the LAP directly in UMAP space. The PCA-space LAP can also be reverse-projected back to the original gene expression space.
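Because PCA is linear, the reverse projection amounts to adding the per-gene mean to a weighted sum of the principal-component loadings. The sketch below shows this in plain Python under the assumption of standard centered PCA; the function name, argument names, and toy numbers are hypothetical, and Dynamo's own utilities should be preferred in a real analysis.

```python
def to_gene_space(path_pca, components, gene_means):
    """Map PCA coordinates back to gene space: x = mean + sum_k z_k * pc_k.

    components[k][g] is the loading of gene g on principal component k.
    """
    return [
        [
            mean_g + sum(z_k * pc[g] for z_k, pc in zip(z, components))
            for g, mean_g in enumerate(gene_means)
        ]
        for z in path_pca
    ]

# Toy example: 3 genes, 2 principal components, 3 points along a path.
components = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]]
gene_means = [1.0, 2.0, 3.0]
path_pca = [[0.0, 0.0], [0.5, -1.0], [1.0, -2.0]]

path_expr = to_gene_space(path_pca, components, gene_means)
for step in path_expr:
    print(step)
```

The reconstruction is only as good as the retained components: in this toy case the third gene has no loading on either component, so it stays at its mean along the whole path.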

Here, the LAP may leave the region of sampled cells for reasons that are either numerical or potentially biologically meaningful. You should try learning the vector field and the LAP in PCA space, then check the gene expression kinetics along this path to see whether it actually makes sense.
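One simple version of such a check is to flag the path steps at which a gene's predicted expression leaves the range observed across the sampled cells. This is a generic heuristic sketched in plain Python, not a Dynamo function; the name `out_of_range_steps`, the `margin` parameter, and the toy values are assumptions.

```python
def out_of_range_steps(path_expr, observed_expr, margin=0.0):
    """For each gene, list path steps where expression leaves the observed range.

    path_expr[t][g]     : predicted expression of gene g at path step t
    observed_expr[c][g] : measured expression of gene g in sampled cell c
    """
    n_genes = len(observed_expr[0])
    flagged = {}
    for g in range(n_genes):
        lo = min(cell[g] for cell in observed_expr) - margin
        hi = max(cell[g] for cell in observed_expr) + margin
        steps = [t for t, x in enumerate(path_expr) if not lo <= x[g] <= hi]
        if steps:
            flagged[g] = steps
    return flagged

# Toy data: 3 sampled cells and a 3-step path, 2 genes each.
observed = [[0.0, 1.0], [2.0, 3.0], [1.0, 2.0]]
path = [[0.5, 1.5], [1.0, 3.5], [-0.2, 2.0]]

print(out_of_range_steps(path, observed))
```

Steps flagged this way, especially negative expression values, point toward a numerical artifact. A path that stays within plausible ranges and follows known marker dynamics is more likely to be biologically meaningful.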

I will also answer your other question in the other issue.
