aristoteleo / dynamo-release

Inclusive model of expression dynamics with conventional or metabolic labeling based scRNA-seq / multiomics, vector field reconstruction and differential geometry analyses
https://dynamo-release.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
407 stars 59 forks source link

Separate two condition and fix the grid #675

Open feanaros opened 2 months ago

feanaros commented 2 months ago

Dear all, I have a dataset and I would to split conditions (ko and WT) to analyze separately them. The problem Is that Im not sure if there Is a way to compare in a statistically correct manner the Arrow directions of wt dataset and KO DATASET. Can you suggest me something?

Sichao25 commented 2 months ago

Hi, thank you for using Dynamo and raising the question. I assume you're referring to the cell vectors when mentioning "arrow directions". Cell vectors are the visualization of the RNA velocities projected to low dimensions. You can verify velocities by using methods like dyn.tl.confident_cell_velocities or dyn.tl.cell_wise_confidence(). Also it's important to make sure that no bias is introduced in previous steps like preprocessing and velocity estimation. Let me know if this helps your question.

feanaros commented 2 months ago

Hi, thank you for you answer. Im new on python because I usually use R. I used loom files with clustering and dimensionality reduction imported from my Seurat obj. However, can you please explain me step by step how to check if there Is no bias in preprocessing step? I'm not sure I'm doing It correctly.

Sichao25 commented 2 months ago

Sure. Could you provide a brief overview of your workflow and goal? Are you trying to split one dataset based on condition, estimate the velocities separately and then compare?

feanaros commented 2 months ago

Of course. I have a dataset of 9000 cells (5 clusters) composed of ~5000 wild type and ~4000 knock out. I have barcode ids, metadata and clustered Cell id. I was able to produce a Anndata obj taking only WT cells and do the canonical analysis until Rna velocity and gene acceleration. Then I did the same with ko. So I want to know if my analysis make Sense, if there Is a way to compare the velocity from ko and WT or Is Just an artifact due to the nr of cells.

Sichao25 commented 2 months ago

Your approach should be fine as long as you maintain consistency in preprocessing both datasets and address any potential data integration issues (e.g. batch effect if any, labeling time difference if using metabolic labeling data).

Velocities are the time derivative of the gene expression state. The recommended next step is to reconstruct the vector field with dyn.vf.VectorField(). The method of comparing velocities depends on the specific biological question you aim to address. For example, we can perform cell fate prediction on both datasets by dyn.pd.fate() to investigate the impact of knockout experiments. It will also be interesting to try our own knockout function dyn.pd.KO() and in silico perturbation dyn.pd.perturbation() on the WT dataset. Helpful visualizations include dyn.pl.streamline_plot(), dyn.pl.cell_wise_vectors(), dyn.pl.topography(), dyn.mv.StreamFuncAnim(). Let me know if there's a particular use case you're interested.

feanaros commented 2 months ago

i am interested in dyn.pd.fate() . how can visualize the results in both conditions?

Sichao25 commented 2 months ago

The most common approach is to visualize cell fate predictions through animation. For example, if you have obtained velocities from dynamic analyses, then the next steps will be:

import dynamo as dyn
import matplotlib
import numpy as np
from IPython.core.display import display, HTML
from matplotlib import animation

dyn.tl.cell_velocities(adata)
dyn.vf.VectorField(adata, M=1000, basis='umap')
dyn.pl.streamline_plot(adata, color=['your_cell_type_key'], basis='umap')

# Cell fate prediction
progenitor = adata.obs_names[adata.obs.your_cell_type_key.isin(['your_progenitor_key'])]
dyn.pd.fate(adata, basis='umap', init_cells=progenitor, interpolation_num=50,  direction='forward')

# Generate animation
fig, ax = plt.subplots()
ax = dyn.pl.topography(adata, color='your_cell_type_key', ax=ax, save_show_or_return='return')
instance = dyn.mv.StreamFuncAnim(adata=adata, color='your_cell_type_key', ax=ax)
matplotlib.rcParams['animation.embed_limit'] = 2**128 # Ensure all frames will be embedded.
anim = animation.FuncAnimation(instance.fig, instance.update, init_func=instance.init_background,
                               frames=np.arange(100), interval=100, blit=True)
HTML(anim.to_jshtml()) # embed to jupyter notebook

You can do the same thing to your WT dataset and KO dataset. Detailed examples can be found in our documents.

feanaros commented 2 months ago

Thank you, I've tried It, great result!! However, I would also.compare acceleration of some genes in different conditions. There Is a way to compare the scale values? I noticed that the numbers of acceleration Is very low respect to curvature and velocity.

Sichao25 commented 2 months ago

I'm glad this has been helpful. It is common for the acceleration numbers to be low compared to the curvature and velocity. You can visualize the acceleration distribution with dyn.pl.acceleration(). This is equivalent to plotting umap and setting color to your acceleration key ("acceleration_pca" or "acceleration_umap"). You can also combine the acceleration with any other plots like dyn.pl.streamline_plot() or dyn.pl. cell_wise_vectors() to get more insights from interested regions.

feanaros commented 2 months ago

Dear @Sichao25, I produced acceleration plot for One gene "POSTN" in WT and in ko dataset. Acceleration value in WT Is 0.008 and in Ko Is 0.015. the scale are Absolute or relative values? I mean, are they statistically comparable? If yes, in which way? Thank you

Sichao25 commented 2 months ago

That's an interesting question. In your case, I feel like accelerations are more relative than absolute. Personally, I don't recommend comparing the values of acceleration across two datasets. Instead, it's preferable to compare the insights derived from the accelerations themselves.

feanaros commented 2 months ago

ok, that's fine. Last question for you: is it possible to do the animation picture (with the cells that move themselves along a trajectory) using as umap background the umap without topology spots? if yes,can you show me the code? EDIT: running again the notebook, topography poits have changed and the animation is different. is it normal? hoping for your help

Sichao25 commented 2 months ago

Yes, you can specify with the parameter terms to customize the elements on your plot. For example, dyn.pl.topography(adata, basis='umap', n=100, color=['Cell_type'], terms=['streamline']).

Regarding the second question, topography points may vary in different environments. Umap itself also varies with different random seeds. However, they are expected to be consistent if you are using the same environment and the same seed.

feanaros commented 2 weeks ago

hi @Sichao25 again, another question: in acceleration plot, can I set manually the scale? (xlim,ylim) I would, at least, have the same values. thank you

Sichao25 commented 2 weeks ago

Sure. The first step is to set deaxis=False in dyn.pl.acceleration(), which enables the display of axis. (Just a little more tip, you can pass extra Matplotlib parameters todyn.pl.acceleration(), and they will be forwarded to the Matplotlib scatter plots ). To set the scale, a common method is to use set_xlim and set_ylim. You can retrieve the axes object from Dynamo function and manually adjust the scale with matplotlib. The following code works for me with matplotlib 3.7, but it may vary depending on your Matplotlib version.

ax = dyn.pl.acceleration(adata, basis="umap", deaxis=False, save_show_or_return="return")
ax.set_xlim(xmin=1, xmax=10)
ax.set_ylim(ymin=1, ymax=10)
plt.show()

Let me know if this solves your question!

feanaros commented 2 weeks ago

dear @Sichao25 , my code was the following ( I didn't use dyn.pl.acceleration) dyn.pl.umap(ldata, color=['Postn', 'Otx2', "Tshz2", "Fn1"], layer='acceleration', frontier=True)

and I got this with error: image

and the version is

import matplotlib
print(matplotlib.__version__)
3.8.0

if I follow your code,


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[60], line 7
      3 import dynamo as dyn
----> 7 ax = dyn.pl.acceleration(ldata, basis="umap", color=['Postn', 'Otx2', "Tshz2", "Fn1"], layer='acceleration', frontier=True, deaxis=False, save_show_or_return="return")

     10 ax.set_xlim(xmin=1, xmax=10)

File ~/anaconda3/lib/python3.11/site-packages/dynamo/plot/vector_calculus.py:261, in acceleration(adata, basis, color, frontier, *args, **kwargs)
    259 color_ = [acc_key]
    260 if not np.any(adata.obs.columns.isin(color_)):
--> 261     raise ValueError(
    262         f"{acc_key} is not existed in .obs, try run dyn.tl.acceleration(adata, basis='{acc_key}') first."
    263     )
    265 adata.obs[acc_key] = adata.obs[acc_key].astype("float")
    266 adata_ = adata[~adata.obs[acc_key].isna(), :]

ValueError: acceleration_umap is not existed in .obs, try run dyn.tl.acceleration(adata, basis='acceleration_umap') first.

but...


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[63], line 7

----> 7 dyn.tl.acceleration(ldata, basis='umap')

     10 ax = dyn.pl.acceleration(ldata, basis="umap", color=['Postn', 'Otx2', "Tshz2", "Fn1"], layer='acceleration', frontier=True, deaxis=False, save_show_or_return="return")

AttributeError: module 'dynamo.tl' has no attribute 'acceleration'
Sichao25 commented 2 weeks ago

The default return value For dyn.pl.umap() is None, so you probably need to set save_show_or_return="return". The default return value is None.

About the second issue, it seems that there is an error in our error message. Sorry for that, it should be module.vf.acceleration instead of module.tl.acceleration. By the way, you don't have to follow my code if you have already calculated acceleration for other basis like acceleration_pca in your data. In that case, you can do dyn.pl.acceleration(adata, basis="pca", deaxis=False, save_show_or_return="return")

feanaros commented 2 weeks ago

the picture probably truncated the whole code, because It was set correctly:

import matplotlib.pyplot as plt
import dynamo as dyn

ax = dyn.pl.acceleration(ldata, basis="umap", color=['Postn', 'Otx2', "Tshz2", "Fn1"], layer='acceleration', frontier=True, deaxis=False, save_show_or_return="return")

ax.set_xlim(xmin=1, xmax=10)
ax.set_ylim(ymin=1, ymax=10)

plt.show()

and the error was the one in the image I posted :(

Sichao25 commented 2 weeks ago

I see. I just realized that I might misunderstand your use case in my last response. In your code, I see you are trying to use acceleration_umapas the basis and color it by different genes. If you plan to do that, umap()or scatters() would be the better functions to use. The acceleration plotting function only provides quick access to scatter plots colored by acceleration.

When returning multiple axes, the return value will be a list. Using plt.setp could be a better option than set_xlim/set_ylim.

Could you tell me a little more information on what you are trying to achieve? Also, would you mind sharing your previous code about acceleration?

feanaros commented 2 weeks ago

Yes, sure. You are looking at the accelerations of genes of interest in the KO genotype. Then, I ran the same code for the genes in WT cells. Well, I know that value scales are relative and not absolute, but I just want to set the same scale for KO and WT. Look at my results:

image I would like to set the same scale for (for example) Postn in both WT and KO. Is it possible? This would help to better see the differences in acceleration. My code is the following:

dyn.vf.VectorField(ldata,
                   basis='pca',
                   M=100)
dyn.vf.rank_velocity_genes(ldata,
                           groups='Cluster',
                           vkey="velocity_S");
rank_speed = ldata.uns['rank_velocity_S'];
rank_abs_speed = ldata.uns['rank_abs_velocity_S'];
dyn.vf.acceleration(ldata, basis='pca')
dyn.vf.rank_acceleration_genes(ldata,
                               groups='Cluster',
                               akey="acceleration",
                               prefix_store="rank");
rank_acceleration = ldata.uns['rank_acceleration'];
rank_abs_acceleration = ldata.uns['rank_abs_acceleration'];
dyn.pl.umap(ldata, color=['Postn', 'Otx2', "Tshz2", "Fn1"], layer='acceleration', frontier=True)

Is it correct?