aristoteleo / dynamo-release

Inclusive model of expression dynamics with conventional or metabolic labeling based scRNA-seq / multiomics, vector field reconstruction and differential geometry analyses
https://dynamo-release.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
423 stars 59 forks source link

Failure to generate vectorField: 'numpy.ndarray' object has no attribute 'setdiag' #263

Closed arnav-mehta closed 2 years ago

arnav-mehta commented 2 years ago

Hi all, I am having an issue with the dyn.vf.VectorField(adata_dyn_cc, basis='umap', M=1000, pot_curl_div=True) function. I get the following error when I set pot_curl_div=True, but I can generate a vector field when set to False. It appears that at some point along the way the function is expecting a sparse matrix and instead it is seeing a dense matrix.

Here is what ran and the error:

|-----> VectorField reconstruction begins... |-----> Retrieve X and V based on basis: UMAP. Vector field will be learned in the UMAP space. |-----> Generating high dimensional grids and convert into a row matrix. |-----> Learning vector field with method: sparsevfc. |-----> [SparseVFC] begins... |-----> Sampling control points based on data velocity magnitude... |-----> [SparseVFC] in progress: 100.0000% |-----> [SparseVFC] finished [57.3388s] |-----------> current cosine correlation between input velocities and learned velocities is less than 0.6. Make a 1-th vector field reconstruction trial. |-----> [SparseVFC] begins... |-----> Sampling control points based on data velocity magnitude... |-----> [SparseVFC] in progress: 100.0000% |-----> [SparseVFC] finished [51.3000s] |-----------> current cosine correlation between input velocities and learned velocities is less than 0.6. Make a 2-th vector field reconstruction trial. |-----> [SparseVFC] begins... |-----> Sampling control points based on data velocity magnitude... |-----> [SparseVFC] in progress: 100.0000% |-----> [SparseVFC] finished [54.6260s] |-----------> current cosine correlation between input velocities and learned velocities is less than 0.6. Make a 3-th vector field reconstruction trial. |-----> [SparseVFC] begins... |-----> Sampling control points based on data velocity magnitude... |-----> [SparseVFC] in progress: 100.0000% |-----> [SparseVFC] finished [54.9873s] |-----------> current cosine correlation between input velocities and learned velocities is less than 0.6. Make a 4-th vector field reconstruction trial. |-----> [SparseVFC] begins... |-----> Sampling control points based on data velocity magnitude... |-----> [SparseVFC] in progress: 100.0000% |-----> [SparseVFC] finished [41.5243s] |-----------> current cosine correlation between input velocities and learned velocities is less than 0.6. Make a 5-th vector field reconstruction trial. |-----? Cosine correlation between input velocities and learned velocities is less than 0.6 after 5 trials of vector field reconstruction. |-----> velocity_umap_SparseVFC to obsm in AnnData Object. |-----> X_umap_SparseVFC to obsm in AnnData Object. |-----> VecFld_umap to uns in AnnData Object. |-----> Running ddhodge to estimate vector field based pseudotime in umap basis...


AttributeError Traceback (most recent call last) /tmp/ipykernel_23302/4144987862.py in 1 # you can set verbose = 1/2/3 to obtain different levels of running information of vector field reconstruction ----> 2 dyn.vf.VectorField(adata_dyn_cc, basis='umap', M=1000, pot_curl_div=True)

~/notebooks/packages/dynamo/vectorfield/topography.py in VectorField(adata, basis, layer, dims, genes, normalize, grid_velocity, grid_num, velocity_key, method, min_vel_corr, restart_num, restart_seed, model_buffer_path, return_vf_object, map_topography, pot_curl_div, cores, result_key, copy, **kwargs) 871 logger.info(f"Running ddhodge to estimate vector field based pseudotime in {basis} basis...") 872 --> 873 ddhodge(adata, basis=basis, cores=cores) 874 if X.shape[1] == 2: 875 logger.info("Computing curl...")

~/notebooks/packages/dynamo/external/hodge.py in ddhodge(adata, X_data, layer, basis, n, VecFld, adjmethod, distance_free, n_downsamples, up_sampling, sampling_method, seed, enforce, cores, **kwargs) 174 # if not all cells are used in the graphize_vecfld function, set diagnoal to be 1 175 if len(np.unique(np.hstack(adj_mat.nonzero()))) != adata.n_obs: --> 176 adj_mat.setdiag(1) 177 178 # g = build_graph(adj_mat)

AttributeError: 'numpy.ndarray' object has no attribute 'setdiag'


The lines of code I used to generate this are as follows -- I tried to replace my adata.X to a sparse matrix with csr_matrix with no luck. Note I get the same issue with adata_dyn and adata_dyn_cc below, so I don't think it is an issue with cell cycle regression. Also when I check adata_dyn_cc.obsp['pearson_transition_matrix'] it is a sparse matrix.

object without cell cycle regression

adata_dyn = sc.read_h5ad([unprocessed matrix with unspliced/spliced]) dyn.pp.recipe_monocle(adata_dyn) adata_dyn dyn.tl.dynamics(adata_dyn, model='stochastic', cores=3) dyn.tl.hdbscan(adata_dyn) sce.pp.harmony_integrate(adata_dyn, 'sample', max_iter_harmony = 20) adata_dyn.obsm['X_pca'] = adata_dyn.obsm['X_pca_harmony'] dyn.tl.reduceDimension(adata_dyn, enforce=True) dyn.tl.neighbors(adata_dyn) dyn.tl.louvain(adata_dyn) dyn.tl.cell_velocities(adata_dyn, method='pearson', other_kernels_dict={'transform': 'sqrt'}) dyn.tl.cell_wise_confidence(adata_dyn) dyn.vf.VectorField(adata_dyn_cc, basis='umap', M=200, pot_curl_div=True) ## --> ERROR ABOVE dyn.cleanup(adata_dyn)

object with cell cycle regression

adata_dyn_cc = adata_dyn.copy() adata_dyn_cc.obs['S_score'] = adata_dyn_cc.obsm['cell_cycle_scores']['S'] adata_dyn_cc.obs['G2M_score'] = adata_dyn_cc.obsm['cell_cycle_scores']['G2-M'] sc.pp.regress_out(adata_dyn_cc, ['S_score', 'G2M_score']) dyn.pp.pca(adata_dyn_cc) dyn.tl.dynamics(adata_dyn_cc, model='stochastic', cores=3) dyn.tl.hdbscan(adata_dyn_cc) sce.pp.harmony_integrate(adata_dyn_cc, 'sample', max_iter_harmony = 20) adata_dyn_cc.obsm['X_pca'] = adata_dyn_cc.obsm['X_pca_harmony'] dyn.tl.reduceDimension(adata_dyn_cc, enforce=True) dyn.tl.cell_velocities(adata_dyn_cc, method='pearson', other_kernels_dict={'transform': 'sqrt'}) dyn.tl.cell_wise_confidence(adata_dyn_cc) dyn.pp.pca(adata_dyn_cc) dyn.vf.VectorField(adata_dyn_cc, basis='umap', M=200, pot_curl_div=True) ## --> ERROR ABOVE

Roger-GOAT commented 2 years ago

I get the same issue. @Xiaojieqiu could you mind giving some tips? Thanks a lot!

dummyindex commented 2 years ago

Hi @arnav-mehta and @Roger-GOAT , Thanks for analyzing data with dynamo package. You are correct regarding it is an issue coming from setdiag. I did not manage to reproduce the issue mentioned in @arnav-mehta 's code with zebrafish sample data as input. Can you share the dataset you use (if it can be shared)? I noticed I cannot reproduce your issue because the line setdiag is not triggered because of the if condition above it.

Meanwhile, a quick fix is in #269 . This shoud fix the setdiag problem once it is merged. Once it is merged, you can install it via

pip uninstall dynamo-release # uninstall the current version
pip install git+https://github.com/aristoteleo/dynamo-release.git # install the latest version on github default branch
arnav-mehta commented 2 years ago

Great thanks so much! Will try the quick fix. My data is not public unfortunately but happy to share it with you privately so you can see if you'd be able to recreate the error? Let em know where best to send


Arnav Mehta, M.D., Ph.D. Postdoctoral Associate, Hacohen and Lander labs Broad Institute of MIT and Harvard Clinical Fellow in Hematology/Oncology Dana-Farber / Partners CancerCare (954) 494 1989 | @.***

On Sun, Feb 6, 2022 at 6:24 PM user_name @.***> wrote:

Hi @arnav-mehta https://github.com/arnav-mehta and @Roger-GOAT https://github.com/Roger-GOAT , Thanks for analyzing data with dynamo package. I think you are correct regarding it is an issue coming from adj_mat is not sparse during ddhodge. I did not manage to reproduce the issue mentioned in @arnav-mehta https://github.com/arnav-mehta 's code with zebrafish sample data as input. Can you share the dataset you use (if it can be shared)? I noticed I cannot reproduce your issue because the line setdiag is not triggered because of the if condition.

Meanwhile, a quick fix is in #269 https://github.com/aristoteleo/dynamo-release/pull/269 . This shoud fix your problem once it is merged. Once it is merged, you can install it via

pip uninstall dynamo-release # uninstall the current version pip install git+https://github.com/aristoteleo/dynamo-release.git # install the latest version on github default branch

— Reply to this email directly, view it on GitHub https://github.com/aristoteleo/dynamo-release/issues/263#issuecomment-1030939250, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADYUAYBNNVX4N4PE6WL463TUZ37MDANCNFSM5MPMWFKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

dummyindex commented 2 years ago

Great thanks so much! Will try the quick fix. My data is not public unfortunately but happy to share it with you privately so you can see if you'd be able to recreate the error? Let em know where best to send ____ Arnav Mehta, M.D., Ph.D. Postdoctoral Associate, Hacohen and Lander labs Broad Institute of MIT and Harvard Clinical Fellow in Hematology/Oncology Dana-Farber / Partners CancerCare (954) 494 1989 | @. On Sun, Feb 6, 2022 at 6:24 PM user_name @.> wrote: Hi @arnav-mehta https://github.com/arnav-mehta and @Roger-GOAT https://github.com/Roger-GOAT , Thanks for analyzing data with dynamo package. I think you are correct regarding it is an issue coming from adj_mat is not sparse during ddhodge. I did not manage to reproduce the issue mentioned in @arnav-mehta https://github.com/arnav-mehta 's code with zebrafish sample data as input. Can you share the dataset you use (if it can be shared)? I noticed I cannot reproduce your issue because the line setdiag is not triggered because of the if condition. Meanwhile, a quick fix is in #269 <#269> . This shoud fix your problem once it is merged. Once it is merged, you can install it via pip uninstall dynamo-release # uninstall the current version pip install git+https://github.com/aristoteleo/dynamo-release.git # install the latest version on github default branch — Reply to this email directly, view it on GitHub <#263 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADYUAYBNNVX4N4PE6WL463TUZ37MDANCNFSM5MPMWFKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub. You are receiving this because you were mentioned.Message ID: @.***>

Hi @arnav-mehta and @Roger-GOAT ,

Does the recent fix on github master branch fix your issues above?

Regards, Dynamo team

Roger-GOAT commented 2 years ago

@dummyindex Not yet. It happened at dyn.vf.VectorField(adata, basis='umap', M=1000, pot_curl_div=True) but dyn.vf.VectorField(adata, basis='umap', M=1000, pot_curl_div=False) is good.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days