aertslab / SCENIC

SCENIC is an R package to infer Gene Regulatory Networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

unable to find an inherited method for function ‘GENIE3’ for signature ‘"dgeMatrix"’ #228

Closed ML1990-Lab closed 2 years ago

ML1990-Lab commented 2 years ago

Hi All,

I have performed all the steps preceding runGenie3() from the main tutorial, in RStudio, on a dataset derived from a Seurat object that was converted into a SingleCellExperiment.

The error is the following:

runGenie3(exprMat_filtered_log, scenicOptions)
Using 740 TFs as potential regulators...
Running GENIE3 part 1
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘GENIE3’ for signature ‘"dgeMatrix"’

I've tried to re-install GENIE3 but nothing changed. I appreciate any suggestions you might have!

Here is the code:

```r
suppressPackageStartupMessages({
  library(SCENIC)
  library(AUCell)
  library(RcisTarget)
  library(SCopeLoomR)
  library(loomR)
  library(KernSmooth)
  library(BiocParallel)
  library(ggplot2)
  library(data.table)
  library(grid)
  library(ComplexHeatmap)
  library(Seurat)
  library(SeuratWrappers)
  library(SeuratObject)
  library(SingleCellExperiment)
})

# updated geneFiltering function as in https://github.com/aertslab/SCENIC/issues/191
geneFiltering_new <- function(exprMat, scenicOptions,
                              minCountsPerGene=3*.01*ncol(exprMat),
                              minSamples=ncol(exprMat)*.01)
{
  # Load options: outFile_genesKept and dbFilePath
  outFile_genesKept <- NULL
  dbFilePath <- NULL
  if(class(scenicOptions) == "ScenicOptions")
  {
    dbFilePath <- getDatabases(scenicOptions)[[1]]
    outFile_genesKept <- getIntName(scenicOptions, "genesKept")
  }else{
    dbFilePath <- scenicOptions[["dbFilePath"]]
    outFile_genesKept <- scenicOptions[["outFile_genesKept"]]
  }
  if(is.null(dbFilePath)) stop("dbFilePath")

  # Check expression matrix (e.g. not factor)
  if(is.data.frame(exprMat))
  {
    supportedClasses <- paste(gsub("AUCell_buildRankings,", "", methods("AUCell_buildRankings")), collapse=", ")
    supportedClasses <- gsub("-method", "", supportedClasses)

    stop("'exprMat' should be one of the following classes: ", supportedClasses,
         "(data.frames are not supported. Please, convert the expression matrix to one of these classes.)")
  }
  if(any(table(rownames(exprMat))>1))
    stop("The rownames (gene id/name) in the expression matrix should be unique.")

  # Calculate stats
  nCountsPerGene <- Matrix::rowSums(exprMat, na.rm = T)
  nCellsPerGene <- Matrix::rowSums(exprMat>0, na.rm = T)

  # Show info
  message("Maximum value in the expression matrix: ", max(exprMat, na.rm=T))
  message("Ratio of detected vs non-detected: ", signif(sum(exprMat>0, na.rm=T) / sum(exprMat==0, na.rm=T), 2))
  message("Number of counts (in the dataset units) per gene:")
  print(summary(nCountsPerGene))
  message("Number of cells in which each gene is detected:")
  print(summary(nCellsPerGene))

  # Filter
  message("\nNumber of genes left after applying the following filters (sequential):")

  # First filter
  # minCountsPerGene <- 3*.01*ncol(exprMat)
  genesLeft_minReads <- names(nCountsPerGene)[which(nCountsPerGene > minCountsPerGene)]
  message("\t", length(genesLeft_minReads), "\tgenes with counts per gene > ", minCountsPerGene)

  # Second filter
  # minSamples <- ncol(exprMat)*.01
  nCellsPerGene2 <- nCellsPerGene[genesLeft_minReads]
  genesLeft_minCells <- names(nCellsPerGene2)[which(nCellsPerGene2 > minSamples)]
  message("\t", length(genesLeft_minCells), "\tgenes detected in more than ",minSamples," cells")

  # Exclude genes missing from database:
  library(RcisTarget)
  motifRankings <- importRankings(dbFilePath) # either one, they should have the same genes
  genesInDatabase <- colnames(getRanking(motifRankings))

  genesLeft_minCells_inDatabases <- genesLeft_minCells[which(genesLeft_minCells %in% genesInDatabase)]
  message("\t", length(genesLeft_minCells_inDatabases), "\tgenes available in RcisTarget database")

  genesKept <- genesLeft_minCells_inDatabases
  if(!is.null(outFile_genesKept)){
    saveRDS(genesKept, file=outFile_genesKept)
    if(getSettings(scenicOptions, "verbose")) message("Gene list saved in ", outFile_genesKept)
  }
  return(genesKept)
}

setwd("..../SCENIC/")
immune.combined <- readRDS("~/.....immune.combined_10.rds")
DefaultAssay(immune.combined) <- "RNA"

set.seed(123)

sce <- as.SingleCellExperiment(immune.combined)
exprMat <- counts(sce)
cellInfo <- colData(sce)

head(cellInfo)
cellInfo <- data.frame(cellInfo)
cbind(table(cellInfo$Cell.Identity))
dir.create("int")
saveRDS(cellInfo, file="int/cellInfo.Rds")

# Color to assign to the variables (same format as for NMF::aheatmap)
colVars <- list(Naming=c("ILC1_a"="brown1",
                         "ILC1_b"="brown4",
                         "ILC1_c"="darkviolet",
                         "ILC1_d"="hotpink",
                         "ILC3"="deeppink3",
                         "NK_a"="blue",
                         "NK_b" = "azure4",
                         "NK_c" = "chocolate1",
                         "NK_d" = "aquamarine",
                         "NK_e" = "chartreuse"))
colVars$Naming <- colVars$Naming[intersect(names(colVars$Naming), cellInfo$Cell.Identity)]
saveRDS(colVars, file="int/colVars.Rds")
plot.new(); legend(0,1, fill=colVars$Naming, legend=names(colVars$Naming), horiz = F)

# Initialize settings
scenicOptions <- initializeScenic(org="mgi",
                                  dbDir="/home/mattia/Desktop/scRNA/SCENIC/cisTarget_databases",
                                  datasetTitle= "Sciume_colon",
                                  nCores=6)
scenicOptions@inputDatasetInfo$cellInfo <- "int/cellInfo.Rds"
scenicOptions@inputDatasetInfo$colVars <- "int/colVars.Rds"

# Save to use at a later time...
saveRDS(scenicOptions, file="int/scenicOptions.Rds")

# Co-expression network
genesKept <- geneFiltering_new(exprMat, scenicOptions=scenicOptions,
                               minCountsPerGene=3*.01*ncol(exprMat),
                               minSamples=ncol(exprMat)*.01)
exprMat_filtered <- exprMat[genesKept, ]
runCorrelation(as.matrix(exprMat_filtered), scenicOptions)
exprMat_filtered_log <- log2(exprMat_filtered+1)
runGenie3(exprMat_filtered_log, scenicOptions)
```

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=it_IT.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=it_IT.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=it_IT.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0 Biobase_2.54.0 GenomicRanges_1.46.1
[5] GenomeInfoDb_1.30.0 IRanges_2.28.0 S4Vectors_0.32.3 BiocGenerics_0.40.0
[9] MatrixGenerics_1.6.0 matrixStats_0.61.0 SeuratWrappers_0.3.0 SeuratObject_4.0.4
[13] Seurat_4.1.0 ComplexHeatmap_2.10.0 data.table_1.14.2 ggplot2_3.3.5
[17] BiocParallel_1.28.3 KernSmooth_2.23-20 loomR_0.2.1.9000 hdf5r_1.3.5
[21] R6_2.5.1 SCopeLoomR_0.13.0 RcisTarget_1.14.0 AUCell_1.16.0
[25] SCENIC_1.2.4

loaded via a namespace (and not attached):
[1] circlize_0.4.13 plyr_1.8.6 igraph_1.2.11 lazyeval_0.2.2 GSEABase_1.56.0
[6] splines_4.1.2 GENIE3_1.16.0 listenv_0.8.0 scattermore_0.7 digest_0.6.29
[11] foreach_1.5.1 htmltools_0.5.2 fansi_1.0.2 magrittr_2.0.1 memoise_2.0.1
[16] tensor_1.5 cluster_2.1.2 doParallel_1.0.16 ROCR_1.0-11 remotes_2.4.2
[21] globals_0.14.0 Biostrings_2.62.0 annotate_1.72.0 R.utils_2.11.0 spatstat.sparse_2.1-0
[26] colorspace_2.0-2 blob_1.2.2 ggrepel_0.9.1 dplyr_1.0.7 crayon_1.4.2
[31] RCurl_1.98-1.5 jsonlite_1.7.3 graph_1.72.0 spatstat.data_2.1-2 survival_3.2-13
[36] zoo_1.8-9 iterators_1.0.13 glue_1.6.0 polyclip_1.10-0 gtable_0.3.0
[41] zlibbioc_1.40.0 XVector_0.34.0 leiden_0.3.9 GetoptLong_1.0.5 DelayedArray_0.20.0
[46] future.apply_1.8.1 shape_1.4.6 abind_1.4-5 scales_1.1.1 DBI_1.1.2
[51] miniUI_0.1.1.1 Rcpp_1.0.8 viridisLite_0.4.0 xtable_1.8-4 clue_0.3-60
[56] spatstat.core_2.3-2 reticulate_1.23 rsvd_1.0.5 bit_4.0.4 htmlwidgets_1.5.4
[61] httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.2 ica_1.0-2 pkgconfig_2.0.3
[66] XML_3.99-0.8 R.methodsS3_1.8.1 uwot_0.1.11 deldir_1.0-6 utf8_1.2.2
[71] tidyselect_1.1.1 rlang_0.4.12 reshape2_1.4.4 later_1.3.0 AnnotationDbi_1.56.2
[76] munsell_0.5.0 tools_4.1.2 cachem_1.0.6 generics_0.1.1 RSQLite_2.2.9
[81] ggridges_0.5.3 stringr_1.4.0 fastmap_1.1.0 goftest_1.2-3 bit64_4.0.5
[86] fitdistrplus_1.1-6 purrr_0.3.4 RANN_2.6.1 KEGGREST_1.34.0 nlme_3.1-155
[91] pbapply_1.5-0 future_1.23.0 mime_0.12 R.oo_1.24.0 arrow_6.0.1
[96] compiler_4.1.2 rstudioapi_0.13 plotly_4.10.0 png_0.1-7 spatstat.utils_2.3-0
[101] tibble_3.1.6 stringi_1.7.6 lattice_0.20-45 Matrix_1.4-0 vctrs_0.3.8
[106] pillar_1.6.4 lifecycle_1.0.1 BiocManager_1.30.16 spatstat.geom_2.3-1 lmtest_0.9-39
[111] GlobalOptions_0.1.2 RcppAnnoy_0.0.19 cowplot_1.1.1 bitops_1.0-7 irlba_2.3.5
[116] httpuv_1.6.5 patchwork_1.1.1 promises_1.2.0.1 gridExtra_2.3 parallelly_1.30.0
[121] codetools_0.2-18 MASS_7.3-55 assertthat_0.2.1 rjson_0.2.21 withr_2.4.3
[126] sctransform_0.3.3 GenomeInfoDbData_1.2.7 mgcv_1.8-38 parallel_4.1.2 rpart_4.1-15
[131] tidyr_1.1.4 Rtsne_0.15 shiny_1.7.1

2019surbhi commented 2 years ago

I think you have a sparse matrix, and some of the functions in the workflow require a regular (dense) matrix. If your dataset is not too large, i.e. (number of genes x number of cells) < 2147483647, you can convert your expression matrix simply with `exprMat = as.matrix(exprMat)`.
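A minimal sketch of that conversion with the size check built in, in case it helps. `toDenseMatrix` is a hypothetical helper written for this example (it is not part of SCENIC or GENIE3), and the toy matrix only stands in for a real genes x cells expression matrix:

```r
library(Matrix)

# Hypothetical helper (not a SCENIC/GENIE3 function): convert a Matrix-package
# object to a base R matrix, refusing when the dense result would exceed the
# 2^31 - 1 element limit mentioned above.
toDenseMatrix <- function(x, maxElements = .Machine$integer.max) {
  if (is.matrix(x)) return(x)  # already a base matrix, nothing to do
  # Multiply as doubles so the product itself cannot overflow an integer
  nElements <- as.numeric(nrow(x)) * as.numeric(ncol(x))
  if (nElements > maxElements)
    stop("Too many elements to convert to a dense matrix: ", nElements)
  as.matrix(x)
}

# Toy sparse matrix standing in for the expression matrix (genes x cells)
toy <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 3), x = c(5, 1, 2),
                    dims = c(3, 4),
                    dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

toyDense <- toDenseMatrix(toy)
class(toyDense)
```

In the workflow from the question, this would presumably mean converting right after `exprMat <- counts(sce)`, or passing `as.matrix(exprMat_filtered_log)` to `runGenie3()`.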

s-aibar commented 2 years ago

Indeed, GENIE3 does not support sparse matrices yet... To run SCENIC you will need to convert it to a regular matrix with `as.matrix()`.

Note: this is certainly not very efficient, but at the moment we are focusing our optimization efforts on the Python implementation...
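For anyone wondering where the "unable to find an inherited method" message comes from, here is a small reproduction of the mechanism. `demoInfer` is a hypothetical S4 generic standing in for GENIE3, defined with a method for base matrices only; it is an illustration of S4 dispatch, not GENIE3's actual code. My understanding is that adding 1 to a sparse Matrix object returns a dense `dgeMatrix`, which is still a Matrix-package class rather than a base matrix and would be consistent with the signature in the error. Treat that as an assumption and check `class()` on your own object.

```r
library(methods)
library(Matrix)

# Hypothetical generic standing in for GENIE3: only a base "matrix" method exists
setGeneric("demoInfer", function(exprMatrix) standardGeneric("demoInfer"))
setMethod("demoInfer", "matrix", function(exprMatrix) dim(exprMatrix))

# Small sparse matrix from the Matrix package
sparseExpr <- Matrix(c(0, 2, 0, 1, 0, 3), nrow = 2, sparse = TRUE)

# Inspect the classes involved before and after the log transform
class(sparseExpr)
class(log2(sparseExpr + 1))

# Dispatch fails: no method is defined for a Matrix-package class
errMsg <- tryCatch(demoInfer(sparseExpr),
                   error = function(e) conditionMessage(e))
print(errMsg)

# After conversion to a base matrix, dispatch finds the "matrix" method
demoInfer(as.matrix(sparseExpr))
```

Reinstalling the package cannot change this, since the failure depends only on the class of the object passed in.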