aristoteleo / dynamo-release

Inclusive model of expression dynamics with conventional or metabolic labeling based scRNA-seq / multiomics, vector field reconstruction and differential geometry analyses
https://dynamo-release.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Filtering fixed points based on their confidence #454

Closed: superlsd closed this issue 1 year ago

superlsd commented 1 year ago

Hi,

When I calculate the fixed points of my single-cell datasets I have some attractor points that are outside of the regions of my data.

These fixed points have a strong effect on the dynamics of my system, but they have very low confidence values (both in the visualization and when checking the actual values in adata.uns['VecFld_umap']['confidence']).

I know that I can change the topography considerably by changing many parameters, and I played around with M and min_vel_corr in dyn.vf.VectorField().

However, I would like to know whether it would be possible to remove some fixed points with low confidence.

I didn't find any option for it, but by looking at your code, that would be a step that could be implemented in class VectorField2D.

Are you planning to implement this option or is there a shortcut to exclude fixed points that are below a certain confidence?
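For illustration, a minimal sketch of the kind of post-hoc filter being asked about. It works on a plain dictionary standing in for adata.uns['VecFld_umap']. Only the 'confidence' key is taken from this thread; the 'Xss' (fixed-point coordinates) and 'ftype' (fixed-point type) keys, the helper name filter_fixed_points, and the 0.5 cutoff are assumptions made for the example. Note that this only changes which fixed points are listed or plotted.

```python
import numpy as np

def filter_fixed_points(vf_dict, min_confidence=0.5):
    """Return a filtered copy of the fixed-point arrays plus the boolean mask used.

    Assumes vf_dict holds parallel arrays under 'Xss', 'ftype' and 'confidence'
    ('Xss' and 'ftype' are assumed key names, not confirmed in this thread).
    """
    conf = np.asarray(vf_dict["confidence"], dtype=float)
    keep = conf >= min_confidence
    filtered = dict(vf_dict)  # shallow copy so the input dictionary is not modified
    filtered["Xss"] = np.asarray(vf_dict["Xss"])[keep]
    filtered["ftype"] = np.asarray(vf_dict["ftype"])[keep]
    filtered["confidence"] = conf[keep]
    return filtered, keep

# Toy stand-in for adata.uns['VecFld_umap'] with three fixed points
toy = {
    "Xss": np.array([[0.0, 0.0], [5.0, 5.0], [1.0, 2.0]]),
    "ftype": np.array([-1, -1, 0]),
    "confidence": np.array([0.95, 0.10, 0.70]),
}
filtered, keep = filter_fixed_points(toy, min_confidence=0.5)
print(filtered["Xss"])  # coordinates of the fixed points that passed the cutoff
```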

Thanks a lot in advance, Salvo D. Lombardo

Xiaojieqiu commented 1 year ago

Dear @superlsd, sorry for the late reply. The confidence can be obtained from adata.uns['VecFld_umap']['confidence'].

The calculation of fixed points is mathematically very tricky. It requires (1) the vector field to have exactly zero velocity at the point and (2) the Jacobian at that point to satisfy the strict requirements described in our methods section. Because our vector field is learned directly from single-cell data, which is rather noisy, this often results in many attractors. Thus you will need to manually select the attractors of interest.
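The two conditions can be sketched on a toy system. The example below is not dynamo code: the vector field, the helper names, and the tolerance are all assumptions chosen for illustration. It checks for zero velocity and then uses the standard linear-stability criterion (signs of the real parts of the Jacobian eigenvalues), which may differ in detail from the requirements in the methods section.

```python
import numpy as np

def velocity(x):
    """Toy 2D vector field with fixed points at (-1, 0), (0, 0) and (1, 0)."""
    x = np.asarray(x, dtype=float)
    return np.array([x[0] - x[0] ** 3, -x[1]])

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    jac = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        jac[:, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return jac

def classify_fixed_point(f, x, tol=1e-8):
    """Label x by velocity norm and by the real parts of the Jacobian eigenvalues."""
    if np.linalg.norm(f(x)) > tol:
        return "not a fixed point"  # condition 1 fails: velocity is not zero
    real_parts = np.linalg.eigvals(numerical_jacobian(f, x)).real
    if np.all(real_parts < 0):
        return "attractor"
    if np.all(real_parts > 0):
        return "repeller"
    return "saddle"  # mixed signs (this branch also catches zero real parts)

for point in ([1.0, 0.0], [0.0, 0.0], [0.5, 0.0]):
    print(point, classify_fixed_point(velocity, point))
```

With noisy, learned vector fields both checks are only satisfied approximately, which is one reason spurious attractors can appear.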

Importantly, the confidence is simply calculated from how close the identified attractor is to the observed data points.
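To make the idea of a distance-based confidence concrete, here is a rough sketch. The formula is an assumption invented for this example and is not dynamo's actual calculation: it compares the distance from each fixed point to its nearest cell with the typical spacing between cells, so points far outside the data cloud score close to zero.

```python
import numpy as np

def distance_confidence(fixed_points, data):
    """Hypothetical proxy: exp(-d / s), where d is the distance from a fixed point
    to its nearest cell and s is the median nearest-neighbor distance among cells."""
    fixed_points = np.asarray(fixed_points, dtype=float)
    data = np.asarray(data, dtype=float)

    # Distance from every fixed point to its closest observed cell
    d_fp = np.linalg.norm(fixed_points[:, None, :] - data[None, :, :], axis=2).min(axis=1)

    # Typical spacing of the data: median distance from a cell to its nearest other cell
    d_data = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=2)
    np.fill_diagonal(d_data, np.inf)  # ignore each cell's zero distance to itself
    scale = np.median(d_data.min(axis=1))

    return np.exp(-d_fp / scale)

# Cells on a 5 x 5 unit grid; one fixed point on a cell, one far outside the data
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
conf = distance_confidence([[2.0, 2.0], [20.0, 20.0]], grid)
print(conf)
```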

superlsd commented 1 year ago

Dear @Xiaojieqiu, many thanks for your answer :)

We were aware of how to obtain the confidence of each fixed point, and we removed the ones with low confidence. However, when plotting the topography the arrows of the vector field point to regions of the UMAP space where there were fixed points previously... Is there a way to exclude low-confidence fixed points from the dynamics?

Xiaojieqiu commented 1 year ago

Hi @superlsd, what do you mean by "when plotting the topography the arrows of the vector field point to regions of the UMAP space where there were fixed points previously"?

The vector field won't change if you just remove the fixed points, because that doesn't affect the learned vector field. As I explained above, because of the noise in the data, the vector field we learn will intrinsically have many more fixed points. Fixed-point calculation can also be difficult even for a high-dimensional synthetic system with known equations. We have made attempts to "stabilize" the vector field, but that is not ready yet, and I am not sure it will fully resolve the "too many fixed points" issue.

If you just want to know which cell types are terminal or progenitor, other metrics that can be computed from the vector field, such as divergence, work well. Also, the key innovation of dynamo lies in the fact that it can be used to identify cell-dependent regulatory networks and to make in silico perturbation and least action path predictions.
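As an illustration of the divergence idea, the sketch below computes the divergence of a toy 2D field by central differences. It does not call dynamo; the field and the function names are assumptions for this example. Negative divergence marks sink-like regions (candidate terminal states) and positive divergence marks source-like regions (candidate progenitors).

```python
import numpy as np

def velocity(x):
    """Toy 2D field: source-like near (0, 0), sink-like near (1, 1)."""
    x = np.asarray(x, dtype=float)
    return np.array([x[0] - x[0] ** 3, x[1] - x[1] ** 3])

def divergence(f, x, eps=1e-5):
    """Sum of the partial derivatives d f_i / d x_i at x (trace of the Jacobian)."""
    x = np.asarray(x, dtype=float)
    total = 0.0
    for i in range(x.size):
        step = np.zeros(x.size)
        step[i] = eps
        total += (f(x + step)[i] - f(x - step)[i]) / (2 * eps)
    return total

for point in ([0.0, 0.0], [1.0, 1.0]):
    print(point, divergence(velocity, point))
```

Unlike fixed points, divergence is defined at every cell, so it does not depend on root finding succeeding in a noisy field.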

superlsd commented 1 year ago

Dear @Xiaojieqiu, many thanks for your answer and clear explanation.

I was indeed looking for a way to "stabilize" the learned vector field or, if possible, to impose a confidence threshold on the identified fixed points when the vector field is constructed.

For example, if more than 30% of the identified fixed points have a confidence lower than a certain cutoff, the vector field would be learned again with different parameters. At the moment, I am doing this manually by changing combinations of parameters of the SparseVFC algorithm, more specifically the number of basis functions (M) and the minimal threshold for the cosine correlation between input velocities and learned velocities (min_vel_corr) in dyn.vf.VectorField().
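The manual procedure described above could be automated roughly as follows. This is only a sketch: search_parameters and low_confidence_fraction are hypothetical helpers, and fit_fn is a placeholder for whatever code relearns the vector field, recomputes the fixed points, and returns their confidence values (for instance by reading adata.uns['VecFld_umap']['confidence']). The fake_fit function below is a made-up stand-in so the example runs without dynamo.

```python
import itertools
import numpy as np

def low_confidence_fraction(confidence, cutoff):
    """Fraction of fixed points whose confidence is below the cutoff."""
    confidence = np.asarray(confidence, dtype=float)
    if confidence.size == 0:
        return 0.0  # no fixed points at all counts as "no low-confidence points"
    return float(np.mean(confidence < cutoff))

def search_parameters(fit_fn, grid, cutoff=0.5, max_low_fraction=0.3):
    """Try parameter combinations in grid order and return the first acceptable one."""
    tried = []
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        frac = low_confidence_fraction(fit_fn(**params), cutoff)
        tried.append((params, frac))
        if frac <= max_low_fraction:
            return params, frac, tried
    return None, None, tried  # nothing in the grid met the criterion

def fake_fit(M, min_vel_corr):
    """Made-up stand-in: returns confidences as if fewer spurious points appear
    for larger M and stricter min_vel_corr. Replace with a real refit."""
    n_low = 4 if M < 200 else (2 if min_vel_corr < 0.5 else 1)
    return np.array([0.1] * n_low + [0.9] * 5)

grid = {"M": [100, 300], "min_vel_corr": [0.2, 0.6]}
best, frac, tried = search_parameters(fake_fit, grid, cutoff=0.5, max_low_fraction=0.3)
print(best, frac)
```

Returning the full list of tried combinations makes it easy to inspect how the low-confidence fraction responds to each parameter before settling on one setting.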

Is there a better way to do it automatically in dynamo or do you have better recommendations to avoid strong attractors in regions of the UMAP space where there aren't cells (fixed points 287 and 0 in the attached plot)?

[Attached image: Screenshot 2023-03-17 at 11 12 29]

Thanks a lot again, Salvo D.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days