NathanSkene / EWCE

Expression Weighted Celltype Enrichment. See the package website for up-to-date instructions on usage.
https://nathanskene.github.io/EWCE/index.html
53 stars 25 forks source link

bootstrap_enrichment_test throws "Error in exp_mats[[cc]][s, ] <- sort(expD[, cc]) : replacement has length zero" #86

Open Kfarls opened 1 year ago

Kfarls commented 1 year ago

1. Bug description

When running EWCE with my own gene lists and single cell data, I get this error: "Error in exp_mats[[cc]][s, ] <- sort(expD[, cc]) : replacement has length zero" traceback is:

  1. | generate_bootstrap_plots_exp_mats(exp_mats = exp_mats, sct_data = sct_data, annotLevel = annotLevel, bootstrap_list = bootstrap_list, reps = reps, combinedGenes = combinedGenes, hits = hits, verbose = verbose)

  2. | compute_gene_scores(sct_data = sct_data, annotLevel = annotLevel, bootstrap_list = bootstrap_list, hits = hits, combinedGenes = combinedGenes, verbose = verbose)

  3. | get_summed_proportions(hits = hits, sct_data = sct_data, annotLevel = annotLevel, reps = reps, no_cores = no_cores, geneSizeControl = geneSizeControl, controlledCT = controlledCT, control_network = control_network, verbose = verbose)

  4. EWCE::bootstrap_enrichment_test(sct_data = ctd, sctSpecies = "human", genelistSpecies = "human", hits = hits, reps = reps, annotLevel = annotLevel)

2. Reproducible example

Code

reps <- 100
# Use level 1 annotations)
annotLevel <- 1
hits <- GWAS_psych

psych_results <- EWCE::bootstrap_enrichment_test(sct_data = ctd,
                                                hits = hits, 
                                                sctSpecies_origin = "human",
                                                sctSpecies = "human",
                                                genelistSpecies = "human",
                                                reps = reps,
                                                annotLevel = annotLevel,
                                                geneSizeControl = TRUE,
                                                verbose = TRUE)

Data

(If possible, upload a small sample of your data so that we can reproduce the bug on our end. If that's not possible, please at least include a screenshot of your data and other relevant details.) I have tried running EWCE with several of my own gene lists, the gene lists work fine when running with the example ctds however i get the above error if I run on my own ctd. str(GWAS_neurol) chr [1:45] "ANO3" "APOE" "DYNC1I1" "FHL5" "GALNT2" "PVRL2" "SCN2A" "ADSS" "ASH1L" "ASTN2" "ATG13" "BCL7C" "BTN2A2" "C14orf37" "CAMK1D" ... ctd generated like so: counts <- GetAssayData(seurat, slot = "counts") temp <- list() # initalise empty list celltype <- unique(metadata$major_clust) temp <- sapply(1:length(celltype), function(x){ cellname <- celltype[x] keep <- which(metadata$major_clust == cellname) counts_subset <- counts[, keep] keep <- which(tabulate(counts_subset@i + 1) > (0.05 * ncol(counts_subset))) genes <- rownames(counts_subset)[keep] updatedlist <- append(temp, list(genes)) }) a <- bitr(geneID = rownames(counts.filtered), fromType = "ENTREZID", toType = "SYMBOL", OrgDb = "org.Hs.eg.db", drop = FALSE) %>% mutate(SYMBOL = if_else(is.na(SYMBOL), as.character(ENTREZID), SYMBOL)) rownames(counts.filtered) <- a$SYMBOL ctd <- generate_celltype_data(exp = counts.filtered, annotLevels = annotLevels, groupName = "pfc_dev", input_species = "human", output_species = "human", savePath = "../results/single_cell/EWCE/") ctd <- load_rdata("../results/single_cell/EWCE/ctd_pfc_dev.rda")

3. Session info

R version 4.3.1 (2023-06-16) Platform: aarch64-apple-darwin20 (64-bit) Running under: macOS Monterey 12.6

Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Perth tzcode source: internal

attached base packages: [1] stats4 stats graphics grDevices datasets utils methods base

other attached packages: [1] devtools_2.4.5 usethis_2.2.2 ewceData_1.8.0 ExperimentHub_2.8.1 AnnotationHub_3.8.0
[6] BiocFileCache_2.8.0 dbplyr_2.3.4 ggdendro_0.1.23 tidyverse_2.0.0 hdWGCNA_0.2.23
[11] EWCE_1.9.3 RNOmni_1.0.1.2 dplyr_1.1.3 tibble_3.2.1 tidyr_1.3.0
[16] readr_2.1.4 purrr_1.0.2 stringr_1.5.0 lubridate_1.9.2 forcats_1.0.0
[21] reshape2_1.4.4 ggplot2_3.4.2 patchwork_1.1.3 gplots_3.1.3 clusterProfiler_4.8.1 [26] rtracklayer_1.60.0 GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 SeuratObject_4.1.3 Seurat_4.3.0.1
[31] org.Hs.eg.db_3.17.0 AnnotationDbi_1.62.2 IRanges_2.34.1 S4Vectors_0.38.1 Biobase_2.60.0
[36] BiocGenerics_0.46.0 TCseq_1.23.0 pacman_0.5.1

Al-Murphy commented 1 year ago

Hey, would it be possible to share your ctd? It would nbot be possible for us to debug otherwise.

Thanks, Alan.