aristoteleo / dynamo-release

Inclusive model of expression dynamics with conventional or metabolic labeling based scRNA-seq / multiomics, vector field reconstruction and differential geometry analyses
https://dynamo-release.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Can I change 'X_umap_distances' and 'cosine_transition_matrix' to other arguments when doing LAP? #385

Closed hyjforesight closed 1 year ago

hyjforesight commented 2 years ago

Hello Dynamo, in the LAP tutorial, the example dataset has adata.obsp['X_umap_distances'] and adata.obsp['cosine_transition_matrix'].

adata = dyn.sample_data.hematopoiesis()
adata
obsp: 'X_umap_connectivities', 'X_umap_distances', 'connectivities', 'cosine_transition_matrix', 'distances', 'fp_transition_rate', 'moments_con', 'pca_ddhodge', 'perturbation_transition_matrix', 'umap_ddhodge'

These two keys are then passed as adj_key in the LAP analysis.

for i, start in enumerate(start_cell_indices):
    for j, end in enumerate(end_cell_indices):
        if start is not end:
            min_lap_t = True if i == 0 else False
            dyn.pd.least_action(adata, init_cells=[adata.obs_names[start[0]][0]], target_cells=[adata.obs_names[end[0]][0]],
                                min_lap_t=min_lap_t, basis="umap", adj_key="X_umap_distances", EM_steps=2)
            dyn.pl.least_action(adata, basis="umap")
            lap = dyn.pd.least_action(adata, init_cells=[adata.obs_names[start[0]][0]], target_cells=[adata.obs_names[end[0]][0]],
                                      min_lap_t=min_lap_t, basis="pca", adj_key="cosine_transition_matrix", EM_steps=2)
            dyn.pl.kinetic_heatmap(adata, genes=adata.var_names[adata.var['use_for_transition']], mode='lap',
                                   basis="pca", project_back_to_high_dim=True, color_map='coolwarm')

I have several questions. Could you please help me? Thanks!

Q1. Why does this loop run dyn.pd.least_action() first with adj_key="X_umap_distances", and then run dyn.pd.least_action() again with adj_key="cosine_transition_matrix", assigning the result to lap?

Q2. What is the best argument for adj_key when basis="pca" for dyn.pd.least_action(), pearson_transition_matrix or cosine_transition_matrix or fp_transition_rate?

Q3. In my own conventional scRNA-seq data, I only have the obsp keys listed below. How can I generate X_umap_distances to run dyn.pd.least_action() on the umap basis? If I cannot generate X_umap_distances, what arguments should I use for dyn.pd.least_action()?

adata # my own data
obsp: 'moments_con', 'distances', 'connectivities', 'pearson_transition_matrix', 'umap_ddhodge', 'pca_ddhodge', 'umap_distances', 'umap_connectivities'
dyn.pd.least_action(adata, init_cells=[adata.obs_names[start[0]][0]], target_cells=[adata.obs_names[end[0]][0]], min_lap_t=min_lap_t, basis="umap",
                     adj_key="X_umap_distances", EM_steps=2)    # which should be used for adj_key, umap_distances?
dyn.pl.least_action(adata, basis="umap")
lap = dyn.pd.least_action(adata, init_cells=[adata.obs_names[start[0]][0]], target_cells=[adata.obs_names[end[0]][0]], min_lap_t=min_lap_t, basis="pca",
                     adj_key="cosine_transition_matrix", EM_steps=2)    # which should be used for adj_key, pearson_transition_matrix?

Xiaojieqiu commented 2 years ago

Thanks again for your great questions. My answers are below:

Q1. Why does this loop run dyn.pd.least_action() first with adj_key="X_umap_distances", and then run dyn.pd.least_action() again with adj_key="cosine_transition_matrix", assigning the result to lap?

The first LAP call runs in the umap space, while the second one runs in the pca space. I find that using X_umap_distances for the umap LAP and cosine_transition_matrix for the pca LAP calculation gives the best result. This also makes intuitive sense, because you want to initialize the LAP search with the shortest path in umap space, or with the best transition defined by the cosine_transition_matrix in pca space.
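
To make the "initialize with the shortest path" intuition concrete, here is a minimal toy sketch. It is not dynamo's code: the graph, the cell names, and the shortest_path helper are all hypothetical, and the edge weights simply stand in for neighbor distances such as those stored under X_umap_distances. It runs plain Dijkstra over a small neighbor graph to find the kind of cell-to-cell path that could seed a LAP search.

```python
import heapq


def shortest_path(graph, start, target):
    """Dijkstra on a dict-of-dicts graph where graph[u][v] is a distance."""
    best = {start: 0.0}      # best known distance to each node
    prev = {}                # predecessor of each node on its best path
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nbr, weight in graph.get(node, {}).items():
            cand = dist + weight
            if cand < best.get(nbr, float("inf")):
                best[nbr] = cand
                prev[nbr] = node
                heapq.heappush(heap, (cand, nbr))
    if target not in best:
        raise ValueError(f"no path from {start} to {target}")
    # Walk the predecessors back from the target to recover the path.
    path = [target]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical 5-cell neighbor graph; weights play the role of distances
# between neighboring cells in the embedding.
toy_graph = {
    "cell0": {"cell1": 1.0, "cell2": 4.0},
    "cell1": {"cell0": 1.0, "cell2": 1.5, "cell3": 5.0},
    "cell2": {"cell0": 4.0, "cell1": 1.5, "cell3": 1.0},
    "cell3": {"cell1": 5.0, "cell2": 1.0, "cell4": 2.0},
    "cell4": {"cell3": 2.0},
}

init_path = shortest_path(toy_graph, "cell0", "cell4")
print(init_path)  # ['cell0', 'cell1', 'cell2', 'cell3', 'cell4']
```

The direct but long edges (cell0 to cell2, cell1 to cell3) are skipped in favor of short hops, which is the sense in which a distance graph gives a reasonable starting path between the initial and target cells.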

Q2. What is the best argument for adj_key when basis="pca" for dyn.pd.least_action(), pearson_transition_matrix or cosine_transition_matrix or fp_transition_rate?

If I remember correctly, cosine_transition_matrix gives the best results in our HSC data, but you may also try pearson_transition_matrix for other datasets. It really just depends on which of them (or any other novel kernel that we or others develop) results in the best velocity embedding.
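
If you want your script to fall back gracefully when one of these matrices is missing, a small helper like the following may be handy. This is only a sketch: pick_adj_key is a hypothetical helper, not part of dynamo, and the key list is copied from the "my own data" listing in the question. With a real object you would pass adata.obsp.keys() instead. It picks the first available key in your order of preference; it does not tell you which kernel gives the better velocity embedding, which you still need to check yourself.

```python
def pick_adj_key(available_keys, candidates):
    """Return the first candidate key that exists in available_keys."""
    available = set(available_keys)
    for key in candidates:
        if key in available:
            return key
    raise KeyError(f"none of {list(candidates)} found in obsp")


# obsp keys copied from the "my own data" listing in the question above.
my_obsp_keys = [
    "moments_con", "distances", "connectivities",
    "pearson_transition_matrix", "umap_ddhodge", "pca_ddhodge",
    "umap_distances", "umap_connectivities",
]

# Prefer the cosine kernel, fall back to the pearson kernel.
pca_adj_key = pick_adj_key(
    my_obsp_keys,
    ["cosine_transition_matrix", "pearson_transition_matrix"],
)
print(pca_adj_key)  # pearson_transition_matrix
```

The chosen key can then be passed as adj_key in the pca call to dyn.pd.least_action().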

Q3. In my own conventional scRNA-seq data, I only have the obsp keys listed below. How can I generate X_umap_distances to run dyn.pd.least_action() on the umap basis? If I cannot generate X_umap_distances, what arguments should I use for dyn.pd.least_action()?

If you run dyn.tl.neighbors(adata, basis='umap'), you will be able to generate what you want.

Hope these help!

hyjforesight commented 2 years ago

Hello @Xiaojieqiu, Thanks for the response. For generating cosine_transition_matrix, do I need to add this kwarg other_kernels_dict={'transform': 'sqrt'}?

dyn.tl.cell_velocities(adata, basis='umap', n_neighbors = 10, method='cosine', other_kernels_dict={'transform': 'sqrt'})

Thanks!

Xiaojieqiu commented 2 years ago

Yes, that is required to stabilize the velocity projection, because the sqrt transform brings down the extreme velocity values.
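
As a rough illustration of why a square-root transform tames extreme values, here is a minimal sketch. It assumes the transform is a signed square root (the sign is kept and the magnitude is square-rooted); that is an assumption about what 'transform': 'sqrt' does, not something confirmed in this thread. signed_sqrt is a hypothetical helper and the velocity values are made up.

```python
import math


def signed_sqrt(values):
    """Square-root the magnitude of each value while keeping its sign."""
    return [math.copysign(math.sqrt(abs(v)), v) for v in values]


# Made-up velocity components with a few extreme entries.
velocities = [-100.0, -1.0, 0.0, 0.25, 4.0, 400.0]
damped = signed_sqrt(velocities)
print(damped)  # [-10.0, -1.0, 0.0, 0.5, 2.0, 20.0]
```

The range shrinks from [-100, 400] to [-10, 20], so a handful of very large values no longer dominate, while the direction (sign) of every component is preserved.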

hyjforesight commented 2 years ago

Thanks @Xiaojieqiu! Appreciate it.

hyjforesight commented 2 years ago

Hello @Xiaojieqiu, sorry for reopening this issue. I calculated cosine_potential and pearson_potential, but I believe they do not represent the real cell potential. ddhodge_potential looks closer to the real cell state. (three images attached) In this case, do I need to change X_umap_distances to 'distances' for LAP in UMAP space? Which ddhodge value can be used to replace cosine_transition_matrix for LAP in PCA space?

adata    # my own data
obsp: 'moments_con', 'distances', 'connectivities', 'pearson_transition_matrix', 'cosine_transition_matrix', 'umap_ddhodge', 'pca_ddhodge'
dyn.pd.least_action(adata, init_cells=[adata.obs_names[start[0]][0]], target_cells=[adata.obs_names[end[0]][0]], min_lap_t=min_lap_t, basis="umap",
                     adj_key="X_umap_distances", EM_steps=2)    # do I need to change `X_umap_distances` to 'distances'?
dyn.pl.least_action(adata, basis="umap")
lap = dyn.pd.least_action(adata, init_cells=[adata.obs_names[start[0]][0]], target_cells=[adata.obs_names[end[0]][0]], min_lap_t=min_lap_t, basis="pca",
                     adj_key="cosine_transition_matrix", EM_steps=2)    # which value of ddhodge can be used for the replacement of cosine_transition_matrix?

Thanks! Best, YJ

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days