aristoteleo / dynamo-release

Inclusive model of expression dynamics with conventional or metabolic labeling based scRNA-seq / multiomics, vector field reconstruction and differential geometry analyses
https://dynamo-release.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

KeyError: 'M_us' #333

Closed stupidstupidstupidstupid closed 2 years ago

stupidstupidstupidstupid commented 2 years ago

Hello dynamo team, I hit an error when calling dyn.tl.dynamics. What should I do?

KeyError                                  Traceback (most recent call last)
Input In [5], in <cell line: 1>()
----> 1 dyn.tl.dynamics(adata, model="stochastic")
      2 dyn.tl.reduceDimension(adata, n_pca_components=30)
      3 dyn.tl.cell_velocities(adata, method="pearson", other_kernels_dict={"transform": "sqrt"})

File ~\AppData\Roaming\Python\Python39\site-packages\dynamo\tools\dynamics.py:404, in dynamics(adata, filter_gene_mode, use_smoothed, assumption_mRNA, assumption_protein, model, est_method, NTR_vel, group, protein_names, concat_data, log_unnormalized, one_shot_method, fraction_for_deg, re_smooth, sanity_check, del_2nd_moments, cores, tkey, **est_kwargs)
    389 if model.lower() == "stochastic" or use_smoothed:
    390     moments(subset_adata)
    391 (
    392     U,
    393     Ul,
    394     S,
    395     Sl,
    396     P,
    397     US,
    398     U2,
    399     S2,
    400     t,
    401     normalized,
    402     ind_for_proteins,
    403     assump_mRNA,
--> 404 ) = get_data_for_kin_params_estimation(
    405     subset_adata,
    406     has_splicing,
    407     has_labeling,
    408     model,
    409     use_smoothed,
    410     tkey,
    411     protein_names,
    412     log_unnormalized,
    413     NTR_vel,
    414 )
    416 valid_bools_ = valid_bools.copy()
    417 if sanity_check and experiment_type.lower() in ["kin", "deg"]:

File ~\AppData\Roaming\Python\Python39\site-packages\dynamo\tools\utils.py:987, in get_data_for_kin_params_estimation(subset_adata, has_splicing, has_labeling, model, use_moments, tkey, protein_names, log_unnormalized, NTR_vel)
    984 t = None
    985 if model == "stochastic":
    986     US, U2, S2 = (
--> 987         subset_adata.layers["M_us"].T,
    988         subset_adata.layers["M_uu"].T,
    989         subset_adata.layers["M_ss"].T,
    990     )
    992 return (
    993     U,
    994     Ul,
   (...)
   1004     assumption_mRNA,
   1005 )

File E:\anaconda\lib\site-packages\anndata\_core\aligned_mapping.py:113, in AlignedViewMixin.__getitem__(self, key)
    111 def __getitem__(self, key: str) -> V:
    112     return as_view(
--> 113         _subset(self.parent_mapping[key], self.subset_idx),
    114         ElementRef(self.parent, self.attrname, (key,)),
    115     )

File E:\anaconda\lib\site-packages\anndata\_core\aligned_mapping.py:148, in AlignedActualMixin.__getitem__(self, key)
    147 def __getitem__(self, key: str) -> V:
--> 148     return self._data[key]

KeyError: 'M_us'

stupidstupidstupidstupid commented 2 years ago

|-----> recipe_monocle_keep_filtered_cells_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_filtered_cells_key=True
|-----> recipe_monocle_keep_filtered_genes_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_filtered_genes_key=True
|-----> recipe_monocle_keep_raw_layers_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_raw_layers_key=True
|-----> apply Monocole recipe to adata...
|-----> pp to uns in AnnData Object.
|-----------> has_splicing to uns['pp'] in AnnData Object.
|-----------> has_labling to uns['pp'] in AnnData Object.
|-----------> splicing_labeling to uns['pp'] in AnnData Object.
|-----------> has_protein to uns['pp'] in AnnData Object.
|-----> ensure all cell and variable names unique.
|-----> ensure all data in different layers in csr sparse matrix format.
|-----> ensure all labeling data properly collapased
|-----------> tkey to uns['pp'] in AnnData Object.
|-----------> experiment_type to uns['pp'] in AnnData Object.
|-----? dynamo detects your data is size factor normalized and/or log transformed. If this is not right, plese set `normalized = False.
|-----> filtering cells...
|-----> pass_basic_filter to obs in AnnData Object.
|-----> 59705 cells passed basic filters.
|-----> filtering gene...
|-----> pass_basic_filter to var in AnnData Object.
|-----? No layers exist in adata, skipp filtering by shared counts
|-----> 13951 genes passed basic filters.
|-----> calculating size factor...
|-----> selecting genes in layer: X, sort method: SVR...
|-----> frac to var in AnnData Object.
|-----------> norm_method to uns['pp'] in AnnData Object.
|-----> applying PCA ...
|-----> pca_fit to uns in AnnData Object.
|-----> cell cycle scoring...
|-----> computing cell phase...
|-----? Dynamo is not able to perform cell cycle staging for you automatically. Since dyn.pl.phase_diagram in dynamo by default colors cells by its cell-cycle stage, you need to set color argument accordingly if confronting errors related to this.
|-----> [recipe_monocle preprocess] in progress: 100.0000%
|-----> [recipe_monocle preprocess] finished [53.0898s]
|-----> dynamics_del_2nd_moments_key is None. Using default value from DynamoAdataConfig: dynamics_del_2nd_moments_key=False
|-----------> removing existing M layers:[]...
|-----------> making adata smooth...
|-----> calculating first/second moments...
|-----> [moments calculation] in progress: 100.0000%
|-----> [moments calculation] finished [77.3660s]

Xiaojieqiu commented 2 years ago

Hi @stupidstupidstupidstupid

What data are you using? Is it a regular 10x-based dataset with spliced and unspliced layers?

Can you run dyn.tl.moments(adata) first, then type adata and paste the output here? I want to see which layers the resulting adata object contains.

stupidstupidstupidstupid commented 2 years ago

In [2]
from IPython.core.display import display, HTML

import warnings
warnings.filterwarnings('ignore')

pancreas_genes = [
    "Hes1", "Nkx6-1", "Nkx2-2", "Neurog3", "Neurod1", "Pax4", "Pax6", "Arx", "Pdx1",
    "Ins1", "Ins2", "Ghrl", "Ptf1a", "Iapp", "Isl1", "Sox9", "Gcg",
]

import dynamo as dyn
filename="C:/Users/ALIENWARE/Downloads/vento18_10x.processed.h5ad"
dyn.get_all_dependencies_version()
dyn.configuration.set_figure_params("dynamo", background="white")
adata = dyn.read_h5ad(filename)

dyn.configuration.set_figure_params(dynamo=True, background='white', fontsize=8, figsize=(6, 4), dpi=600, dpi_save=800, frameon=None, vector_friendly=True, color_map=None, format='pdf', transparent=False, ipython_format='png2x')

dyn.pp.recipe_monocle(adata, n_top_genes=4000, fg_kwargs={"shared_count": 20}, genes_to_append=pancreas_genes)
dyn.tl.moments(adata)
adata

|-----> recipe_monocle_keep_filtered_cells_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_filtered_cells_key=True
|-----> recipe_monocle_keep_filtered_genes_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_filtered_genes_key=True
|-----> recipe_monocle_keep_raw_layers_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_raw_layers_key=True
|-----> apply Monocole recipe to adata...
|-----> pp to uns in AnnData Object.
|-----------> has_splicing to uns['pp'] in AnnData Object.
|-----------> has_labling to uns['pp'] in AnnData Object.
|-----------> splicing_labeling to uns['pp'] in AnnData Object.
|-----------> has_protein to uns['pp'] in AnnData Object.
|-----> ensure all cell and variable names unique.
|-----> ensure all data in different layers in csr sparse matrix format.
|-----> ensure all labeling data properly collapased
|-----------> tkey to uns['pp'] in AnnData Object.
|-----------> experiment_type to uns['pp'] in AnnData Object.
|-----? dynamo detects your data is size factor normalized and/or log transformed. If this is not right, plese set `normalized = False.
|-----> filtering cells...
|-----> pass_basic_filter to obs in AnnData Object.
|-----> 59705 cells passed basic filters.
|-----> filtering gene...
|-----> pass_basic_filter to var in AnnData Object.
|-----? No layers exist in adata, skipp filtering by shared counts
|-----> 13951 genes passed basic filters.
|-----> calculating size factor...
|-----> selecting genes in layer: X, sort method: SVR...
|-----> frac to var in AnnData Object.
|-----------> norm_method to uns['pp'] in AnnData Object.
|-----> applying PCA ...
|-----> pca_fit to uns in AnnData Object.
|-----> cell cycle scoring...
|-----> computing cell phase...
|-----? Dynamo is not able to perform cell cycle staging for you automatically. Since dyn.pl.phase_diagram in dynamo by default colors cells by its cell-cycle stage, you need to set color argument accordingly if confronting errors related to this.
|-----> [recipe_monocle preprocess] in progress: 100.0000%
|-----> [recipe_monocle preprocess] finished [55.5225s]
|-----> calculating first/second moments...
|-----> [moments calculation] in progress: 100.0000%
|-----> [moments calculation] finished [86.8181s]

Out [2]
AnnData object with n_obs × n_vars = 59705 × 25875
    obs: 'CellType', 'Stage', 'n_counts', 'log1p_n_counts', 'n_genes', 'log1p_n_genes', 'percent_mito', 'percent_ribo', 'percent_hb', 'percent_top50', 'Location', 'nGenes', 'nCounts', 'pMito', 'pass_basic_filter', 'Size_Factor', 'initial_cell_size'
    var: 'gene_ids', 'mito', 'ribo', 'hb', 'n_counts', 'n_cells', 'n_genes', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'nCells', 'nCounts', 'pass_basic_filter', 'log_cv', 'log_m', 'score', 'frac', 'use_for_pca'
    uns: 'pp', 'velocyto_SVR', 'PCs', 'explained_variance_ratio_', 'pca_mean', 'pca_fit', 'feature_selection', 'cell_phase_genes'
    obsm: 'X_umap_hm', 'X_pca', 'X'
    obsp: 'moments_con'

P.S. I'm just a student and I know little about computational biology. Please don't mind if I ask some stupid questions.

Xiaojieqiu commented 2 years ago

Ah, I see. You don't have spliced and unspliced layers in your adata object. dynamo assumes that either splicing data or labeling data is provided. Please use kb-python to generate the spliced and unspliced matrices for your data first: https://github.com/pachterlab/kb_python
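The check described above can be sketched as a small pre-flight test to run before dyn.tl.dynamics. This is only an illustration: the helper name has_velocity_layers is hypothetical and not part of dynamo, and the layer names it looks for ("spliced"/"unspliced" for splicing data, "new"/"total" for labeling data) are assumptions about common conventions rather than a full list of what dynamo accepts.

```python
def has_velocity_layers(layer_names):
    """Return True if the layer names include a splicing or labeling pair.

    The names checked here are assumed conventions, not dynamo's full list.
    """
    names = set(layer_names)
    has_splicing = {"spliced", "unspliced"} <= names
    has_labeling = {"new", "total"} <= names
    return has_splicing or has_labeling


# An object with no layers at all, as in the AnnData printout in this thread.
print(has_velocity_layers([]))                        # False
# An object carrying spliced and unspliced count layers.
print(has_velocity_layers(["spliced", "unspliced"]))  # True
```

With a real object the call would be has_velocity_layers(adata.layers.keys()). A False result suggests the data needs to be requantified with a velocity-aware pipeline before any dynamics estimation.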

alexpiccinich commented 2 years ago

Hi, below is the code I am trying to run, but I get KeyError: 'M_us'. dyn.tl.moments(adata) alone does not raise the error; dyn.tl.dynamics(adata) is the first line that raises KeyError: 'M_us'.

import dynamo as dyn
import anndata

dyn.pp.recipe_monocle(adata)
dyn.tl.dynamics(adata)
dyn.tl.moments(adata)
dyn.tl.cell_velocities(adata)
dyn.tl.cell_velocities(adata, basis='pca')
dyn.tl.cell_wise_confidence(adata)
dyn.vf.VectorField(adata)

This is the output.

|-----> setting visualization default mode in dynamo. Your customized matplotlib settings might be overritten.
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/nxviz/__init__.py:18: UserWarning: nxviz has a new API! Version 0.7.3 onwards, the old class-based API is being deprecated in favour of a new API focused on advancing a grammar of network graphics. If your plotting code depends on the old API, please consider pinning nxviz at version 0.7.3, as the new API will break your old code.

To check out the new API, please head over to the docs at https://ericmjl.github.io/nxviz/ to learn more. We hope you enjoy using it!

(This deprecation message will go away in version 1.0.)

  warnings.warn(
|-----> recipe_monocle_keep_filtered_cells_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_filtered_cells_key=True
|-----> recipe_monocle_keep_filtered_genes_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_filtered_genes_key=True
|-----> recipe_monocle_keep_raw_layers_key is None. Using default value from DynamoAdataConfig: recipe_monocle_keep_raw_layers_key=True
|-----> apply Monocole recipe to adata...
|-----> pp to uns in AnnData Object.
|-----------> has_splicing to uns['pp'] in AnnData Object.
|-----------> has_labling to uns['pp'] in AnnData Object.
|-----------> splicing_labeling to uns['pp'] in AnnData Object.
|-----------> has_protein to uns['pp'] in AnnData Object.
|-----> ensure all cell and variable names unique.
|-----> ensure all data in different layers in csr sparse matrix format.
|-----> ensure all labeling data properly collapased
|-----------> tkey to uns['pp'] in AnnData Object.
|-----------> experiment_type to uns['pp'] in AnnData Object.
|-----> filtering cells...
|-----> pass_basic_filter to obs in AnnData Object.
|-----> 12330 cells passed basic filters.
|-----> filtering gene...
|-----> pass_basic_filter to var in AnnData Object.
|-----? No layers exist in adata, skipp filtering by shared counts
|-----> 4827 genes passed basic filters.
|-----> calculating size factor...
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/dynamo/preprocessing/utils.py:322: FutureWarning: Passing a set as an indexer is deprecated and will raise in a future version. Use a list instead.
  new_df = origin_df.merge(diff_df[_columns], how="left", left_index=True, right_index=True)
|-----> selecting genes in layer: X, sort method: SVR...
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/dynamo/preprocessing/utils.py:322: FutureWarning: Passing a set as an indexer is deprecated and will raise in a future version. Use a list instead.
  new_df = origin_df.merge(diff_df[_columns], how="left", left_index=True, right_index=True)
|-----> frac to var in AnnData Object.
|-----> size factor normalizing the data, followed by log1p transformation.
|-----> Set to normalized data
|-----> applying PCA ...
|-----> pca_fit to uns in AnnData Object.
|-----> cell cycle scoring...
|-----> computing cell phase...
|-----? Dynamo is not able to perform cell cycle staging for you automatically. Since dyn.pl.phase_diagram in dynamo by default colors cells by its cell-cycle stage, you need to set color argument accordingly if confronting errors related to this.
|-----> [recipe_monocle preprocess] in progress: 100.0000%
|-----> [recipe_monocle preprocess] finished [11.2565s]
|-----> dynamics_del_2nd_moments_key is None. Using default value from DynamoAdataConfig: dynamics_del_2nd_moments_key=False
|-----------> removing existing M layers:[]...
|-----------> making adata smooth...
|-----> calculating first/second moments...
|-----> [moments calculation] in progress: 100.0000%
|-----> [moments calculation] finished [86.8876s]
Traceback (most recent call last):
  File "loadDataExample.py", line 25, in <module>
    dyn.tl.dynamics(adata)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/dynamo/tools/dynamics.py", line 404, in dynamics
    ) = get_data_for_kin_params_estimation(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/dynamo/tools/utils.py", line 987, in get_data_for_kin_params_estimation
    subset_adata.layers["M_us"].T,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/anndata/_core/aligned_mapping.py", line 113, in __getitem__
    _subset(self.parent_mapping[key], self.subset_idx),
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/anndata/_core/aligned_mapping.py", line 148, in __getitem__
    return self._data[key]
KeyError: 'M_us'

The data I am using come from this paper: https://www.nature.com/articles/s41597-022-01236-2#Sec12

Xiaojieqiu commented 2 years ago

Hi @alexpiccinich, thanks for using dynamo. As I answered above, dynamo assumes either splicing or labeling data so that it can estimate RNA velocity, followed by vector field reconstruction and predictions. If you only have total RNA without spliced and unspliced layers, you will hit this error. Please use kb-python or another tool to generate this information. Thanks.
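To make the dependency concrete: the traceback above reads layers["M_us"] into a variable named US, which suggests a second moment built from unspliced and spliced counts together. The following is a conceptual sketch only, not dynamo's implementation. It assumes a second moment is a neighbor-weighted average of the per-cell product u*s for one gene; the function name second_moment and the toy numbers are hypothetical.

```python
def second_moment(weights, u, s):
    """Neighbor-weighted average of u*s for each cell, for a single gene.

    weights[i][j] > 0 means cell j is a neighbor of cell i (assumed layout).
    """
    moments = []
    for row in weights:
        total = sum(row)
        weighted = sum(w * ui * si for w, ui, si in zip(row, u, s))
        moments.append(weighted / total)
    return moments


# Three cells, each connected to itself and its adjacent cells.
weights = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
unspliced = [1.0, 2.0, 3.0]
spliced = [2.0, 4.0, 6.0]

m_us = second_moment(weights, unspliced, spliced)
print(m_us)
```

Without separate unspliced and spliced counts there is nothing to multiply, so a quantity of this kind cannot be derived from a total-RNA matrix alone.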

alexpiccinich commented 2 years ago

When using dynamo, is there a way to use the output of Cell Ranger Count v3.0.2 instead of kb count? Can I modify dynamo's code so that it works with Cell Ranger Count v3.0.2 output?

alexpiccinich commented 2 years ago

I am following this tutorial using kb count and I keep getting this error:

kb: error: unrecognized arguments: --lamanno

If you are familiar with this I would appreciate any help. Thank you.

Xiaojieqiu commented 2 years ago

Please follow this tutorial to run kb-python: https://www.kallistobus.tools/tutorials/kb_velocity/python/kb_velocity/

For kb-python related questions, you can open an issue in the kb-python GitHub repo. This one may be helpful for you: https://github.com/pachterlab/kb_python/issues/56

Regarding Cell Ranger: if Cell Ranger doesn't output spliced and unspliced layers, then, as I mentioned before, you cannot perform velocity and vector field analyses.

wangjiawen2013 commented 1 year ago

@Xiaojieqiu I can run dyn.tl.dynamics() successfully, except that:

|-----> ntr to var in AnnData Object.
|-----> cell cycle scoring...
|-----> computing cell phase...
|-----? Dynamo is not able to perform cell cycle staging for you automatically. Since dyn.pl.phase_diagram in dynamo by default colors cells by its cell-cycle stage, you need to set color argument accordingly if confronting errors related to this.
|-----> [recipe_monocle preprocess] in progress: 100.0000%
|-----> [recipe_monocle preprocess] finished [18.1047s]

It seems that cell cycle staging cannot be performed. Could you help?

Xiaojieqiu commented 1 year ago

This is just a warning, so it should be fine. I guess your data is not from human or mouse, right? In that case cell cycle staging will fail because the code uses human and mouse cell cycle genes to do the staging.
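A quick way to test that guess is to see how many well-known cell cycle genes appear among the gene names in the dataset. The sketch below is illustrative only: the marker list is a small hand-picked set of familiar cell cycle genes, not the list dynamo uses, the function name marker_overlap is hypothetical, and the case-insensitive comparison is a convenience chosen here rather than a description of dynamo's matching.

```python
# A few widely known cell cycle genes; NOT dynamo's internal marker list.
ILLUSTRATIVE_MARKERS = ["MKI67", "TOP2A", "PCNA", "CCNB1"]


def marker_overlap(gene_names, markers=ILLUSTRATIVE_MARKERS):
    """Fraction of markers found among gene_names, ignoring letter case."""
    present = {gene.upper() for gene in gene_names}
    found = [marker for marker in markers if marker.upper() in present]
    return len(found) / len(markers)


# Mouse-style symbols: two of the four illustrative markers are present.
print(marker_overlap(["Mki67", "Top2a", "Actb"]))  # 0.5
# Identifiers from another naming scheme: no marker is found.
print(marker_overlap(["geneA", "geneB"]))          # 0.0
```

With a real object the input would be adata.var_names. A fraction near zero is consistent with gene identifiers that do not match human or mouse symbols, which would explain the staging warning.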