error in find_gene_modules - Error in RANN #346

Closed. lb15 closed this issue 4 years ago

lb15 commented 4 years ago

Describe the bug

I'm running find_gene_modules() to group genes that change over pseudotime into modules, and I get an error in RANN::nn2(). I've tried both a full cds and a subsetted cds (subsetted on both cells and genes), and both produce the error. Calling monocle3::find_gene_modules() explicitly produces the same error.

To Reproduce

[1] "Gm37381" "Rp1"     "Mrpl15"  "Tcea1"   "Rgs20"   "Atp6v1h"
Error in RANN::nn2(data, data, k + 1, searchtype = "standard") : 
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation
Timing stopped at: 0.002 0.001 0.002

traceback()

4: RANN::nn2(data, data, k + 1, searchtype = "standard")
3: system.time(tmp <- RANN::nn2(data, data, k + 1, searchtype = "standard"))
2: leiden_clustering(data = reduced_dim_res, pd = rowData(cds)[row.names(reduced_dim_res), 
       , drop = FALSE], k = k, weight = weight, num_iter = leiden_iter, 
       resolution_parameter = resolution, random_seed = random_seed, 
       verbose = verbose, ...)
1: find_gene_modules(cds_unbias[as.character(sig_genes), ])
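
R typically raises `NA/NaN/Inf in foreign function call (arg 1)` when a numeric argument handed to compiled code contains non-finite values, so the matrix reaching `RANN::nn2()` presumably holds NA, NaN, or Inf entries. A minimal sketch of how one might check a matrix for such entries, using base R only; `count_non_finite` is a hypothetical helper (not part of monocle3 or RANN), and the toy matrix merely stands in for a reduced-dimension embedding:

```r
# Hypothetical helper (not part of monocle3 or RANN): summarise non-finite
# entries in a numeric matrix before it is passed to a nearest-neighbour search.
count_non_finite <- function(m) {
  stopifnot(is.matrix(m), is.numeric(m))
  bad <- !is.finite(m)                 # TRUE for NA, NaN, Inf and -Inf
  list(
    total = sum(bad),                  # number of offending entries
    rows  = which(rowSums(bad) > 0),   # rows containing at least one
    cols  = which(colSums(bad) > 0)    # columns containing at least one
  )
}

# Toy matrix standing in for an embedding; filled column by column.
toy <- matrix(c(1, 2, NA, 4, 5, Inf), nrow = 3)
res <- count_non_finite(toy)
print(res)
```

If a check like this reports offending rows or columns, the steps that produced the matrix (alignment, dimension reduction) are the natural place to look next.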

Additional context

I'm using the development branch of monocle3. I recently upgraded to R 4.0.0 and to the development version of monocle3; before the upgrade, find_gene_modules did not produce this error. Thanks!!

lb15 commented 4 years ago

I tried running find_gene_modules() with a cds and gene list created before I upgraded R/Monocle3, and it worked, so the problem may lie in how I created this cds object. One difference is that cds_unbias contains cells from several partitions that I want to analyze together. I upgraded to the development version of monocle3 because of the fix in #275. In the older version of monocle3, I manually replaced the partition identifiers with 1 for all cells to create a single partition.

Here is how I built my cds:

exprs<- GetAssayData(seur_ob, slot="counts",assay = "RNA")

## phenodata =

## feature data
genes <- data.frame(gene_short_name = rownames(exprs))
rownames(genes) <- rownames(exprs)

cds <- new_cell_data_set(exprs,
                         cell_metadata =,
                         gene_metadata = genes)

################### PROCESS DATA #########################
cds=preprocess_cds(cds, num_dim=20)
cds=align_cds(cds, preprocess_method="PCA",alignment_group="batch")
cds <- reduce_dimension(cds, umap.fast_sgd = FALSE,preprocess_method = "Aligned")
cds <- cluster_cells(cds,cluster_method="leiden",random_seed=123)
cds <- learn_graph(cds, use_partition = FALSE, learn_graph_control = list(minimal_branch_len = 25))
cds_sub = cds[,colData(cds)$Merged_MCC_clusters == "MCC"]
##code to produce vector of genes expressed in at least 10% of cells in any cluster

### Regression analysis
gene_fits <- fit_models(cds_unbias, model_formula_str = "~pseudotime")
fit_coefs <- coefficient_table(gene_fits)
pseudotime_terms <- fit_coefs %>% filter(term == "pseudotime")
pseudotime_sig <- pseudotime_terms %>% filter (q_value < 0.05) %>%
        select(gene_short_name, term, q_value, estimate)
sig_genes = pseudotime_sig$gene_short_name

afaissa commented 4 years ago

Same problem here:

traceback()

4: RANN::nn2(data, data, k + 1, searchtype = "standard")
3: system.time(tmp <- RANN::nn2(data, data, k + 1, searchtype = "standard"))
2: leiden_clustering(data = reduced_dim_res, pd = rowData(cds)[row.names(reduced_dim_res),
       , drop = FALSE], k = k, weight = weight, num_iter = leiden_iter,
       resolution_parameter = resolution, random_seed = random_seed,
       verbose = verbose, ...)
1: find_gene_modules(cds[pr_deg_ids, ], resolution = 0.01)

brgew commented 4 years ago

Hi lb15,

I tried to reproduce the issue using the example on the Monocle3 documentation website, but it worked. Looking at the find_gene_modules() function call in your second message

sig_genes = pseudotime_sig$gene_short_name

it appears to me that you are subsetting the cds_unbias rows using the gene short names; however, I think that cds_unbias[as.character(sig_genes),] requires the gene ids rather than short names. I think that the call

modules<-find_gene_modules(cds[rowData(cds)$gene_short_name %in% sig_genes,])

may work for you. Can you let me know if the problem persists?
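
The difference between the two ways of subsetting can be sketched on a plain base R matrix. This is only an illustration under stated assumptions: the gene IDs (`ENSG01` and so on) are made up, `counts` and `gene_meta` are hypothetical stand-ins for the expression matrix and `rowData(cds)`, and a cell_data_set may behave differently from a base matrix when a character index is not found.

```r
# Toy stand-ins: row names are (made-up) gene IDs, while the short names
# live in a separate metadata table, as they do in rowData(cds).
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(c("ENSG01", "ENSG02", "ENSG03"),
                                 c("cellA", "cellB")))
gene_meta <- data.frame(gene_short_name = c("Rp1", "Tcea1", "Rgs20"),
                        row.names = rownames(counts))
sig_genes <- c("Rp1", "Rgs20")

# Selecting through the metadata column works whatever the row names are.
keep <- gene_meta$gene_short_name %in% sig_genes
by_short_name <- counts[keep, , drop = FALSE]

# A character index is matched against row names only, so short names that
# are not row names cannot be found; the error is caught and NULL returned.
by_character <- tryCatch(counts[sig_genes, , drop = FALSE],
                         error = function(e) NULL)

print(rownames(by_short_name))
print(is.null(by_character))
```

When the row names already are the short names, both forms select the same rows, so this distinction only matters for objects whose row names are gene IDs.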

afaissa commented 4 years ago

Hi brgew,

I applied the suggestion you gave lb15 to my data, but I am still seeing the same issue. Is there anything else I can try?

Thank you!

gene_fits <- fit_models(cds, model_formula_str = "~pseudotime")
fit_coefs <- coefficient_table(gene_fits)
pseudotime_terms <- fit_coefs %>% filter(term == "pseudotime")
pseudotime_sig <- pseudotime_terms %>% filter (q_value < 0.05) %>%
        select(gene_short_name, term, q_value, estimate)
sig_genes = pseudotime_sig$gene_short_name
modules<-find_gene_modules(cds[rowData(cds)$gene_short_name %in% sig_genes,])

Error in RANN::nn2(data, data, k + 1, searchtype = "standard") : 
  NA/NaN/Inf in foreign function call (arg 1)
Timing stopped at: 0.001 0 0.001

traceback()

4: RANN::nn2(data, data, k + 1, searchtype = "standard")
3: system.time(tmp <- RANN::nn2(data, data, k + 1, searchtype = "standard"))
2: leiden_clustering(data = reduced_dim_res, pd = rowData(cds)[row.names(reduced_dim_res),
       , drop = FALSE], k = k, weight = weight, louvain_iter = louvain_iter,
       resolution_parameter = resolution, random_seed = random_seed,
       verbose = verbose, ...)
1: find_gene_modules(cds[rowData(cds)$gene_short_name %in% sig_genes, ])

lb15 commented 4 years ago

Hi brgew,

Thanks for your help. I tried your recommendation but I'm getting the same error.

I also checked gene_short_name, and both ways of subsetting give me an identical object:

> test=cds_unbias[sig_genes,]
> identical(test, cds_unbias[rowData(cds_unbias)$gene_short_name %in% sig_genes,])
[1] TRUE
> sum(!sig_genes %in% rowData(cds_unbias)$gene_short_name)
[1] 0

I have been able to solve this by replacing the partitions (I have 3 partitions) with 1.

## make everything one partition
cds@clusters$UMAP$partitions[cds@clusters$UMAP$partitions == "2"] <- "1" 
cds@clusters$UMAP$partitions[cds@clusters$UMAP$partitions == "3"] <- "1"

I then ran learn_graph, order_cells, and the DE analysis, after which find_gene_modules worked.
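
The same relabelling can be tried on a toy factor first. This is a minimal sketch in base R, assuming the partitions are stored as a named factor; the cell names and level values here are made up for illustration.

```r
# Toy stand-in for cds@clusters$UMAP$partitions: a named factor with
# three partitions (names and values are invented for this example).
partitions <- factor(c("1", "2", "3", "1", "2"))
names(partitions) <- paste0("cell", 1:5)

# Move every cell from partitions "2" and "3" into partition "1".
partitions[partitions %in% c("2", "3")] <- "1"

# The emptied levels are still recorded on the factor; drop them.
partitions <- droplevels(partitions)

print(levels(partitions))
print(table(partitions))
```

Without `droplevels()` the factor would still list "2" and "3" as empty levels, which may or may not matter to downstream code.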

afaissa commented 4 years ago

Hi brgew,

I was not able to fix the issue with lb15's suggestion of replacing the partitions.

The error occurs when residual_model_formula_str is used with align_cds.

I was able to reproduce the issue using the example on the Monocle3 documentation website.

library(monocle3)
library(dplyr)

expression_matrix <- readRDS(url(""))
cell_metadata <- readRDS(url(""))
gene_annotation <- readRDS(url(""))

cds <- new_cell_data_set(expression_matrix,
                         cell_metadata = cell_metadata,
                         gene_metadata = gene_annotation)

cds <- preprocess_cds(cds, num_dim = 45)
cds <- align_cds(cds, residual_model_formula_str = "~Size_Factor")
cds <- reduce_dimension(cds)

cds = cluster_cells(cds, resolution=1e-5)
levels(cds@clusters$UMAP$partitions)
levels(cds@clusters$UMAP$clusters)

pr_graph_test_res <- graph_test(cds, neighbor_graph="knn", cores=18)
pr_deg_ids <- row.names(subset(pr_graph_test_res, morans_I > 0.01 & q_value < 0.05))
gene_module_df <- find_gene_modules(cds[pr_deg_ids,], cores=18)

Error in RANN::nn2(data, data, k + 1, searchtype = "standard") : 
  NA/NaN/Inf in foreign function call (arg 1)
Timing stopped at: 0.001 0 0.001

brgew commented 4 years ago

Hi @afaissa, Thank you for posting the example. It is very helpful!

afaissa commented 4 years ago

Thank you for all the amazing work on the package. Please let me know when I can try again; I am waiting on this fix so that I can apply it to three of my data sets. Please also let me know if I can help.

brgew commented 4 years ago

Hi @afaissa, I believe that this is fixed in the develop branch. Please let me know if you find otherwise. Thank you!

afaissa commented 4 years ago

Solved! Thank you very much!

brgew commented 4 years ago

I appreciate the feedback. Thank you!