aristoteleo / dynamo-release

Inclusive model of expression dynamics with conventional or metabolic labeling based scRNA-seq / multiomics, vector field reconstruction and differential geometry analyses
https://dynamo-release.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Input X contains NaN. #678

Open christophechu opened 6 months ago

christophechu commented 6 months ago

dyn.tl.cell_velocities(adata, method='kmc', other_kernels_dict={'transform': 'sqrt'}) raises 'Input X contains NaN.'

Sichao25 commented 6 months ago

Hi, thanks for raising this issue. This error occurs when there are NaN values in your dataset (either in the expression matrix or in the velocity matrix). Could you share additional details about your issue (e.g. the dataset you are working with, the previous steps you've performed, the relevant traceback)?
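One way to narrow this down is to count the NaN entries in each matrix before calling the velocity step. The snippet below is only a sketch: `count_nans` is a hypothetical helper (not part of dynamo), and the toy dictionary stands in for the layers of a real AnnData object.

```python
import numpy as np


def count_nans(matrices):
    """Return {name: number of NaN entries} for each matrix in a dict.

    Hypothetical helper, not part of dynamo. Accepts dense arrays and
    scipy-style sparse matrices (anything exposing `tocsr()`).
    """
    report = {}
    for name, mat in matrices.items():
        if hasattr(mat, "tocsr"):
            # Sparse matrix: only explicitly stored values can be NaN.
            values = mat.tocsr().data
        else:
            values = np.asarray(mat)
        report[name] = int(np.isnan(np.asarray(values, dtype=float)).sum())
    return report


# Toy stand-ins for an expression layer and a velocity layer.
toy_layers = {
    "X_spliced": np.array([[1.0, 0.0], [2.0, 3.0]]),
    "velocity_S": np.array([[0.5, np.nan], [np.nan, 0.0]]),
}

nan_report = count_nans(toy_layers)
print(nan_report)  # {'X_spliced': 0, 'velocity_S': 2}
```

On a real object, something like `count_nans(dict(adata.layers))` should give the same kind of report, assuming the layers are numeric arrays or sparse matrices.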

christophechu commented 6 months ago

By definition, it should not contain any NaN values, so I don't know why this problem has arisen.

adata_raw = adata.copy()
preprocessor = dyn.pp.Preprocessor(cell_cycle_score_enable=True)
preprocessor.config_monocle_recipe(adata_raw)
preprocessor.filter_cells_by_outliers_kwargs["keep_filtered"] = False
preprocessor.filter_genes_by_outliers_kwargs["inplace"] = True
preprocessor.select_genes_kwargs["n_top_genes"] = 3000
preprocessor.preprocess_adata_monocle(adata_raw)
adata = adata_raw.copy()
adata.raw = adata
del adata_raw

AnnData object with n_obs × n_vars = 150931 × 7305
    obs: 'file_name', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_20_genes', 'total_counts_mt', 'log1p_total_counts_mt', 'pct_counts_mt', 'outlier_5', 'mt_outlier', 'scDblFinder_score', 'scDblFinder_class', 'hybrid_score', 'hybrid_class', 'DoubletFinder_score', 'DoubletFinder_class', 'n_genes', 'Doublet_detect', 'patient', 'tissue', 'dataset', 'nGenes', 'nCounts', 'pMito', 'pass_basic_filter', 'Size_Factor', 'initial_cell_size', 'unspliced_Size_Factor', 'initial_unspliced_cell_size', 'spliced_Size_Factor', 'initial_spliced_cell_size', 'ntr'
    var: 'nCells', 'nCounts', 'pass_basic_filter', 'log_cv', 'log_m', 'score', 'frac', 'use_for_pca', 'ntr', 'use_for_dynamics'
    uns: 'CD4_subtypes_colors', 'diffmap_evals', 'neighbors', 'pp', 'velocyto_SVR', 'feature_selection', 'PCs', 'explained_varianceratio', 'pca_mean', 'vel_params_names', 'dynamics'
    obsm: 'X_diffmap', 'X_pca_harmony', 'X_umap', 'X_pca'
    varm: 'vel_params'
    layers: 'spliced', 'unspliced', 'X_unspliced', 'X_spliced', 'M_u', 'M_uu', 'M_s', 'M_us', 'M_ss', 'velocity_S'
    obsp: 'connectivities', 'distances', 'moments_con'

Sichao25 commented 5 months ago

Would you mind sharing the traceback information? This will help us better locate the error.

github-actions[bot] commented 2 months ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

Sichao25 commented 2 months ago

I noticed that you're using the KMC method, which has a known issue. Since the estimated velocity contains zero values, some NaNs are inevitably created during the calculation. I recommend using another method for now until we can optimize the KMC algorithm.
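To illustrate how zero velocities can turn into NaNs, here is a generic NumPy sketch. It is not dynamo's actual KMC code; it only assumes that some step divides a velocity vector by its own norm, which is one common way an all-zero vector ends up as 0/0. The function name `unit_vectors` and the toy matrix are made up for the example.

```python
import numpy as np


def unit_vectors(V):
    """Scale each row of V to unit length (illustrative only, not dynamo code)."""
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    # An all-zero row has norm 0, so the division below is 0/0 -> NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return V / norms


# Two toy "cells": one with a non-zero velocity, one with zero velocity.
V = np.array([[3.0, 4.0],
              [0.0, 0.0]])
U = unit_vectors(V)

zero_rows = np.flatnonzero(~V.any(axis=1))           # rows that are all zero
nan_rows = np.flatnonzero(np.isnan(U).any(axis=1))   # rows that became NaN

print(zero_rows, nan_rows)
```

In this sketch the NaN rows are exactly the zero-velocity rows. As a workaround, passing a different `method` value to `dyn.tl.cell_velocities` (for example `'pearson'` or `'cosine'`, if your installed version lists them) avoids the KMC code path; please check the documentation of your version for the accepted values.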