theislab / cellrank

CellRank: dynamics from multi-view single-cell data
https://cellrank.org
BSD 3-Clause "New" or "Revised" License

Visualizing lineage probabilities on a circle/simplex #457

Closed emdann closed 3 years ago

emdann commented 3 years ago

Hi, thanks a lot for this awesome toolkit! I was wondering whether you've looked into adding functionality to visualize absorption probabilities for more than 2 lineages on a circle/simplex, similarly to STEMNET.

I've found this very useful on my own dataset: I project the probabilities to Cartesian coordinates, store them in obsm, and plot them with sc.pl.embedding.

import scanpy as sc
import numpy as np
import cellrank as cr
import matplotlib.pyplot as plt
import math

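def prob2simplex(adata):
    """Project absorption probabilities onto a circle and store the result in adata.obsm."""
    ## Absorption probabilities, one column per terminal state
    p_mat = adata.obsm['to_terminal_states'].X

    ## Assign each terminal state to an evenly spaced angle on the edge of a circle
    n_lin = p_mat.shape[1]
    angle_vec = [x*(360/n_lin) for x in range(n_lin)]
    angle_vec = [math.radians(x) for x in angle_vec]

    ## Transform to cartesian coordinates (probability-weighted sum of the corner positions)
    x2 = (p_mat * np.cos(angle_vec)).sum(1)
    y2 = (p_mat * np.sin(angle_vec)).sum(1)

    ## Add to embeddings; the (y2, x2) order puts the first terminal state at the top of the plot
    adata.obsm["X_fate_simplex"] = np.array([y2, x2]).T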

adata = cr.datasets.pancreas_preprocessed() 

## Calculate lineage probabilities
cr.tl.terminal_states(adata, cluster_key='clusters', weight_connectivities=0.2)
cr.tl.lineages(adata)

## Add simplex embedding
prob2simplex(adata)

plt.rcParams["figure.figsize"] = [8,8]
sc.pl.embedding(adata, "X_fate_simplex", color="terminal_states",  frameon=False, size=40, legend_loc="on data")
sc.pl.embedding(adata, "X_fate_simplex", color="clusters_fine",  frameon=False, size=40, legend_loc="on data")

Output:

[Two screenshots: the X_fate_simplex embedding colored by terminal_states and by clusters_fine]

michalk8 commented 3 years ago

Hi @emdann, thanks for your interest in our package and for the reference implementation! We've discussed this and think it would be a really nice addition to the package. I will start working on it. In the meantime, I have 2 questions:

emdann commented 3 years ago

Hi @michalk8, great to know. I was mostly interested in the coordinates; I am not too familiar with the other metrics from STEMNET. For visualization, I think it would be ideal to have permanent labels marking the positions of the terminal states (like in the first plot above) when coloring by different columns of obs, e.g. the pseudotime/latent time values.
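
A note on fixed label positions: in the projection from the first comment, a cell with probability 1 for a terminal state lands exactly on that state's corner, so the corner coordinates can be computed once and annotated on every plot, whichever obs column is used for coloring. A minimal sketch in plain Python, assuming the same angle assignment and (y, x) column order as `prob2simplex`; `simplex_corners` and `project` are hypothetical helper names, not part of CellRank or scanpy:

```python
import math


def simplex_corners(n_lin):
    """Return the embedding coordinates of each terminal state's corner.

    Mirrors prob2simplex: state k sits at angle k * 360 / n_lin and the
    embedding stores (sin, cos), so state 0 ends up at the top of the plot.
    """
    angles = [math.radians(k * 360 / n_lin) for k in range(n_lin)]
    return [(math.sin(a), math.cos(a)) for a in angles]


def project(probs):
    """Project one cell: the probability-weighted mean of the corner positions."""
    corners = simplex_corners(len(probs))
    first = sum(p * c[0] for p, c in zip(probs, corners))
    second = sum(p * c[1] for p, c in zip(probs, corners))
    return first, second


corners = simplex_corners(3)
committed = project([0.0, 1.0, 0.0])        # coincides with corner 1
undecided = project([1 / 3, 1 / 3, 1 / 3])  # centre of the circle, up to rounding
```

The corner coordinates could then be drawn as text labels on the matplotlib axes of each embedding plot, so the terminal states stay in place across plots.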

michalk8 commented 3 years ago

@emdann I've implemented the function based on the code you provided and added some functionality from STEMNET, the most important being the lineage order, as seen in #459.

The code is currently on the circular_projection branch if you want to try it out, and the docs are already visible here: https://cellrank--459.org.readthedocs.build/en/459/gen_modules/cellrank.pl.circular_projection.html#cellrank.pl.circular_projection

Let me know if you have any comments/improvements.
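
On the lineage order: the thread does not describe how the order is chosen. One plausible approach, sketched here purely as an assumption and not as the method used in #459, is to arrange the terminal states around the circle so that similar lineages become neighbours, by minimizing the summed distance between adjacent states. `circular_order` and the toy distance matrix are hypothetical:

```python
from itertools import permutations


def circular_order(dist):
    """Brute-force the circular arrangement with the smallest summed neighbour distance.

    dist is a symmetric matrix (list of lists) of pairwise lineage distances.
    State 0 is fixed in the first slot because rotations of a circle are equivalent.
    Only feasible for a handful of lineages: the search is factorial in their number.
    """
    n = len(dist)
    best_order, best_cost = None, float("inf")
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        # Sum distances between neighbours, including the wrap-around pair
        cost = sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))
        if cost < best_cost:
            best_order, best_cost = order, cost
    return list(best_order), best_cost


# Toy distances: lineages 0 and 2 are similar, as are lineages 1 and 3
dist = [
    [0.0, 0.9, 0.1, 0.8],
    [0.9, 0.0, 0.8, 0.1],
    [0.1, 0.8, 0.0, 0.9],
    [0.8, 0.1, 0.9, 0.0],
]
order, cost = circular_order(dist)
```

With an ordering like this, cells undecided between two similar lineages fall on the edge between neighbouring corners instead of being pulled through the centre of the circle.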

Marius1311 commented 3 years ago

Thanks a lot @michalk8! @emdann, would be really curious to hear your input/thoughts/what we can improve.

emdann commented 3 years ago

Hi @Marius1311 @michalk8. I've tried the circular_projection function on my own dataset and it looks really neat. Question: is the lineage priming degree shown in log scale as in the STEMNET paper?

Marius1311 commented 3 years ago

Hi @emdann I agree, @michalk8 did a really nice job here! Whether it's shown on a log scale is a question I think @michalk8 can answer better than me.

emdann commented 3 years ago

TBH I am not sure what the priming degree adds to the analysis. The absolute values across lineages seem to depend mostly on the density of cells close to a given terminal state, which often has more to do with how the tissue of interest was sampled than with the actual differentiation process. If taken for each lineage separately, the priming degree essentially corresponds to the absorption probabilities. In the STEMNET paper this metric was meaningful where they had a more uniform sampling of terminal states, as cells were picked with FACS sorting. Does this make sense? Is there something crucial in the interpretation that I am missing?
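
The sampling argument can be made concrete with a toy calculation. The thread does not define the priming degree; this sketch assumes it is the KL divergence between a cell's fate probabilities and the average fate probabilities over the dataset. All names and numbers are illustrative:

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean_probs(cells):
    """Average fate probabilities over a list of cells."""
    n = len(cells)
    return [sum(c[k] for c in cells) / n for k in range(len(cells[0]))]


cell_a = [0.9, 0.1]  # committed to lineage 0
cell_b = [0.1, 0.9]  # committed to lineage 1

# Same two cell types, different sampling: lineage 0 is over-represented in `skewed`
balanced = [cell_a] * 50 + [cell_b] * 50
skewed = [cell_a] * 90 + [cell_b] * 10

kl_balanced = kl_divergence(cell_a, mean_probs(balanced))
kl_skewed = kl_divergence(cell_a, mean_probs(skewed))
```

Under this assumed definition, `cell_a` scores much lower in the skewed sample than in the balanced one, even though its fate probabilities are identical in both.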

michalk8 commented 3 years ago

> Question: is the lineage priming degree shown in log scale as in the STEMNET paper?

It wasn't, but the latest commit fixes it.

> TBH I am not sure what the priming degree adds to the analysis. The absolute values across lineages seem to depend mostly on the density of cells close to a given terminal state, which often has more to do with how the tissue of interest was sampled than with the actual differentiation process. If taken for each lineage separately, the priming degree essentially corresponds to the absorption probabilities. In the STEMNET paper this metric was meaningful where they had a more uniform sampling of terminal states, as cells were picked with FACS sorting. Does this make sense? Is there something crucial in the interpretation that I am missing?

Thanks for pointing this out. The current intention was to include this just for completeness' sake.

michalk8 commented 3 years ago

If there are any features you'd like added or any improvements to the plot, please let me know. Otherwise, I will merge this later today.

Marius1311 commented 3 years ago

Thanks @michalk8 for the great work here!