theislab / cellrank

CellRank: dynamics from multi-view single-cell data
https://cellrank.org
BSD 3-Clause "New" or "Revised" License
341 stars 47 forks source link

Is it possible to set the initial state manually? #787

Closed hyjforesight closed 2 years ago

hyjforesight commented 2 years ago

... Hello CellRank, When running the low-level mode using GPCCA, the terminal state can be mannually set by estimator.set_terminal_states_from_macrostates() or estimator.set_terminal_states(). But GPCCA cannot set the initial state. Could you please tell us whether there is a way to set the initial state mannually in CellRank? Thanks! Best, YJ

michalk8 commented 2 years ago

GPCCA has 2 hidden functions, GPCCA._compute_initial_states and GPCCA._set_initial_states_from_macrostates, which work exactly as they terminal_states counterpart, but currently, there's no set_initial_states.

hyjforesight commented 2 years ago

Thanks for the information!

flde commented 2 years ago

@michalk8 Many thanks for pointing out the hidden functions. A bit off-topic but highly appreciated your help about how to compute the initial states with the GPCCA. In the tutorial it says we have to set backward=True for the estimator and (?) re-compute compute_schur() and macrostates()? Many thanks!

michalk8 commented 2 years ago

A bit off-topic but highly appreciated your help about how to compute the initial states with the GPCCA. In the tutorial it says we have to set backward=True for the estimator and (?) re-compute compute_schur() and macrostates()? Many thanks!

If by compute the initial states with the GPCCA you mean use the states set by e.g. GPCCA._compute_initial_states, then unfortunately, there isn't an easy way - you'd have to overwrite I think 1 attribute, can post the code here if this is what you want.

The intended approach is to change the directionality of the kernel objects. You'd have to re-initialize the kernel objects with backward = True e.g.:

# backward doesn't matter in this case, but has to still be set, will not be required in #791 
ck = cr.tl.kernels.ConnectivityKernel(adata, backward=True)
ck = ck.compute_transition_matrix()
vk = cr.tl.kernels.VelocityKernel(adata, backward=True)
# backward_mode can be either 'transpose' or 'negate'
vk = vk.compute_transition_matrix(backward_mode=...)
k = (0.8 * vk + 0.2 * ck)

g = cr.tl.estimators.GPCCA(k)
# continue as usual, the terminal states correspond to root states, not final states

Note that almost surely, the initial states you got using GPCCA with a forward kernel (backward=False) will be different than the terminal states from GPCCA with a backward kernel (outlined above).

yourann3 commented 1 year ago

Hi,

Thanks for answering in this post. It helped me a lot.

I am following the real time series tutorial and using wot kernel + connectivity kernel to compute terminal and initial states. My data has 3 real-time points: 0.5, 1.0, 2.0. The terminal states computation worked okay and I get expected macrostates. However, I doubted if I successfully compute initial states. The macrostates computed in both forward and backward modes are the same... also random walk simulation looks fairly similar... Can you help me to have a look at my code?

random walk when backward=false

Screenshot 2023-06-25 at 18 20 23

random walk when backward=true

Screenshot 2023-06-25 at 18 19 35

Below is my code to compute initial states: (for terminal states, I removed backward=true argument)

sc.pp.neighbors(adata, random_state=0)

# Estimate growth rate 
wkb = WOTKernel(adata, time_key="Time", backward=True)
wkb.compute_initial_growth_rates(organism="mouse", key_added="growth_rate_init_bwd")

# Compute transition matrix
wkb.compute_transition_matrix(
    growth_iters=3, growth_rate_key="growth_rate_init_bwd", last_time_point="connectivities"
)

# Simulate random walks
wkb.plot_random_walks(
    n_sims=300,
    max_iter=200,
    start_ixs={"Time": 0.0},
    basis="umap",
    c="Time",
    legend_loc="none",
    linealpha=0.5,
    dpi=150,
)

# Compute macrostate
ckb = ConnectivityKernel(adata, backward=True) 
ckb.compute_transition_matrix()
combined_kernel_A_WT_bwd = 0.9 * wkb + 0.1 * ckb

# Initiate GPCCA estimator
gb = GPCCA(combined_kernel_adata_bwd)

# Plot Schur decomposition
gb.compute_schur()
gb.plot_spectrum(real_only=True)

# Compute macrostates, and show top 30 most confidently assigned cells
gb.compute_macrostates(n_states=2, cluster_key="cell_type")
gb.plot_macrostates(discrete=True, basis="umap", legend_loc="right")

# set initial state
gb.set_terminal_states_from_macrostates(["celltype1"])

Thank you!

Youran

Marius1311 commented 1 year ago

Hi @yourann3, you could try computing initial states directly from the forwards estimator object by computing a few marcostates, checking how much these overlap with your time-points, and manually choosing one that overlaps with only day 0.5 cells. When you simulate random walks for the backwards Markov chain, maker sure to adapt the value of start_ixs={"Time": 0.0}. Also, the computation of initial states should be much easier in version 2, which is pretty much available on the main branch, and documented here: https://cellrank.readthedocs.io/en/latest/

See for example the new tutorial about the RealTimeKernel: https://cellrank.readthedocs.io/en/latest/notebooks/tutorials/kernels/500_real_time.html

Or the new tutorial about initial and terminal states: https://cellrank.readthedocs.io/en/latest/notebooks/tutorials/estimators/600_initial_terminal.html