aertslab / cisTopic

cisTopic: Probabilistic modelling of cis-regulatory topics from single cell epigenomics data

topicsRcisTarget error #25

Closed jk86754 closed 5 years ago

jk86754 commented 5 years ago

Hi,

I ran into the following issue:

library(feather)
cisTopicObject_d0 <- topicsRcisTarget(cisTopicObject_d0, genome='mm9', pathToFeather, reduced_database=FALSE, nesThreshold=3, rocthr=0.005, maxRank=20000, nCores=24)
[1] "Exporting data to clusters..."
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
  24 nodes produced errors; first error: package or namespace load failed for ‘RcisTarget’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 there is no package called ‘feather’

As you can see, the 'feather' package loads without issues, yet topicsRcisTarget produces this error. I vaguely recall a conversation with the authors of the SCENIC package about feather v0.3.3 (which is what I have installed) not being compatible, and about having to roll it back to v0.3.1, as far as I remember. Is this the case with cisTopic too?

Thank you!

Joe

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRblas.so
LAPACK: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] rtracklayer_1.40.6 R.utils_2.9.0 R.oo_1.22.0 R.methodsS3_1.7.1
[5] Seurat_3.0.2 ggplot2_3.2.0 RcisTarget_1.5.0 feather_0.3.3
[9] cisTopic_0.2.1 BiocParallel_1.14.2 doParallel_1.0.15 iterators_1.0.12
[13] foreach_1.4.7 densityClust_0.3 org.Mm.eg.db_3.7.0 TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.4
[17] GenomicFeatures_1.34.8 AnnotationDbi_1.44.0 Biobase_2.42.0 ChIPseeker_1.18.0
[21] rGREAT_1.14.0 GenomicRanges_1.34.0 GenomeInfoDb_1.18.2 IRanges_2.16.0
[25] S4Vectors_0.20.1 BiocGenerics_0.28.0 data.table_1.12.2 fastcluster_1.1.25
[29] ComplexHeatmap_1.20.0 Rtsne_0.15 umap_0.2.2.0 Rsubread_1.32.4
[33] httpuv_1.5.1

loaded via a namespace (and not attached):
[1] reticulate_1.10 tidyselect_0.2.5 htmlwidgets_1.3 RSQLite_2.1.1
[5] munsell_0.5.0 codetools_0.2-16 ica_1.0-2 DT_0.8
[9] future_1.14.0 withr_2.1.2 colorspace_1.4-1 GOSemSim_2.8.0
[13] rstudioapi_0.10 ROCR_1.0-7 DOSE_3.8.2 gbRd_0.4-11
[17] listenv_0.7.0 Rdpack_0.11-0 urltools_1.7.3 GenomeInfoDbData_1.2.0
[21] polyclip_1.10-0 bit64_0.9-7 farver_1.1.0 vctrs_0.2.0
[25] R6_2.4.0 rsvd_1.0.2 bitops_1.0-6 fgsea_1.8.0
[29] gridGraphics_0.4-1 DelayedArray_0.8.0 assertthat_0.2.1 promises_1.0.1
[33] SDMTools_1.1-221.1 scales_1.0.0 ggraph_1.0.2 enrichplot_1.2.0
[37] gtable_0.3.0 npsurv_0.4-0 globals_0.12.4 rlang_0.4.0
[41] zeallot_0.1.0 GlobalOptions_0.1.0 splines_3.5.1 lazyeval_0.2.2
[45] europepmc_0.3 yaml_2.2.0 reshape2_1.4.3 backports_1.1.4
[49] qvalue_2.14.1 tools_3.5.1 ggplotify_0.0.4 gridBase_0.4-7
[53] gplots_3.0.1.1 RColorBrewer_1.1-2 ggridges_0.5.1 Rcpp_1.0.1
[57] plyr_1.8.4 progress_1.2.2 zlibbioc_1.28.0 purrr_0.3.2
[61] RCurl_1.95-4.12 prettyunits_1.0.2 pbapply_1.4-1 GetoptLong_0.1.7
[65] viridis_0.5.1 cowplot_1.0.0 zoo_1.8-6 SummarizedExperiment_1.10.1
[69] ggrepel_0.8.1 cluster_2.1.0 magrittr_1.5 DO.db_2.9
[73] circlize_0.4.6 triebeard_0.3.0 lmtest_0.9-37 RANN_2.6
[77] fitdistrplus_1.0-14 matrixStats_0.54.0 hms_0.5.0 lsei_1.2-0
[81] mime_0.7 xtable_1.8-4 XML_3.98-1.20 AUCell_1.7.1
[85] gridExtra_2.3 shape_1.4.4 compiler_3.5.1 biomaRt_2.36.1
[89] tibble_2.1.3 KernSmooth_2.23-15 crayon_1.3.4 htmltools_0.3.6
[93] later_0.8.0 snow_0.4-3 tidyr_0.8.2 DBI_1.0.0
[97] tweenr_1.0.1 MASS_7.3-51.4 boot_1.3-23 Matrix_1.2-17
[101] gdata_2.18.0 metap_1.1 igraph_1.2.2 pkgconfig_2.0.2
[105] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 rvcheck_0.1.3 GenomicAlignments_1.18.1 plotly_4.9.0
[109] xml2_1.2.0 annotate_1.60.1 lda_1.4.2 XVector_0.22.0
[113] bibtex_0.4.2 stringr_1.4.0 digest_0.6.19 tsne_0.1-3
[117] sctransform_0.2.0 graph_1.60.0 Biostrings_2.48.0 fastmatch_1.1-0
[121] GSEABase_1.42.0 shiny_1.3.2 Rsamtools_1.32.3 gtools_3.8.1
[125] rjson_0.2.20 nlme_3.1-141 jsonlite_1.6 viridisLite_0.3.0
[129] pillar_1.4.2 lattice_0.20-38 httr_1.4.1 plotrix_3.7-6
[133] survival_2.44-1.1 GO.db_3.6.0 glue_1.3.1 FNN_1.1.2.1
[137] UpSetR_1.4.0 png_0.1-7 bit_1.1-14 ggforce_0.2.2
[141] stringi_1.4.3 blob_1.2.0 doSNOW_1.0.18 caTools_1.17.1.1
[145] memoise_1.1.0 dplyr_0.8.1 irlba_2.3.3 future.apply_1.3.0
[149] ape_5.2

cbravo93 commented 5 years ago

Hi!

I have tried it with feather 0.3.3 and RcisTarget 1.5.0 (from GitHub) and it seems to work. Can you check whether it works for you with this version of the function:

topicsRcisTarget <- function(
  object,
  genome,
  pathToDb,
  reduced_database = FALSE,
  nesThreshold = 3,
  rocthr = 0.005,
  maxRank = 20000,
  nCores = 1,
  ...
){
  # Check input
  if(length(object@binarized.regions.to.Rct) < 1){
    stop('Please, run binarizedcisTopicsToCtx() first.')
  }

  # Check dependencies
  if(! "RcisTarget" %in% rownames(installed.packages())){
    stop('Please, install RcisTarget: \n devtools::install_github("aertslab/RcisTarget")')
  } else {
    require(RcisTarget)
  }

  if (genome == 'hg19'){
    data(motifAnnotations_hgnc)
    motifAnnot <- motifAnnotations_hgnc
    if(rocthr!=0.005 | maxRank!=20000){
      warning("For Homo sapiens the recommended settings are: rocthr=0.005, maxRank=20000")
    } 
  }
  else if (genome == 'mm9'){
    data(motifAnnotations_mgi)
    motifAnnot <- motifAnnotations_mgi
    if(rocthr!=0.005 | maxRank!=20000){
      warning("For Mus musculus the recommended settings are: rocthr=0.005, maxRank=20000")
    } 
  }
  else if (genome == 'dm3'){
    data(motifAnnotations_dmel)
    motifAnnot <- motifAnnotations_dmel
    if(rocthr!=0.01 | maxRank!=5000){
      warning("For Drosophila melanogaster the recommended settings are: rocthr=0.01, maxRank=5000")
    } 
  }
  else if (genome == 'dm6'){
    data(motifAnnotations_dmel)
    motifAnnot <- motifAnnotations_dmel
    if(rocthr!=0.01 | maxRank!=5000){
      warning("For Drosophila melanogaster the recommended settings are: rocthr=0.01, maxRank=5000")
    } 
  } else {
    stop('The genome required is not available! Try using the liftover option.')
  }

  topicsList <- object@binarized.regions.to.Rct

  # File extension of the database path (text after the last dot)
  extension <- tools::file_ext(pathToDb)
  if (extension == 'feather'){
    columnsinRanking <- feather::feather_metadata(pathToDb)[["dim"]][2]-1
  }
  else if (extension == "parquet"){
    pq <- arrow::parquet_file_reader(pathToDb)
    columnsinRanking <- pq$GetSchema()$num_fields()-1
  }
  else{
    stop("Database format must be feather or parquet.")
  }

  if (reduced_database == FALSE){
    ctxreg <- unique(as.vector(unlist(object@binarized.regions.to.Rct)))
    motifRankings <- importRankings(pathToDb, columns = c('features', ctxreg))
  }
  else{
    motifRankings <- importRankings(pathToDb)
    ctxregions <- colnames(getRanking(motifRankings))[-1]
    topicsList <- llply(1:length(topicsList), function(i) topicsList[[i]][which(topicsList[[i]] %in% ctxregions)])
    names(topicsList) <- names(object@binarized.regions.to.Rct)
  }

  if (length(topicsList) < nCores){
    print(paste('The number of cores (', nCores, ') is higher than the number of topics (', length(topicsList),').', sep=''))
  }

  if(nCores > 1){
    cl <- makeCluster(nCores, type = "SOCK")
    registerDoSNOW(cl)
    print(paste('Exporting data to clusters...'))
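    # clusterEvalQ() evaluates a single expression on each worker, so both packages are loaded in one block
    clusterEvalQ(cl, {library(feather); library(RcisTarget)})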
    clusterExport(cl, c("topicsList", "motifRankings", "motifAnnot", "nesThreshold", "rocthr", "columnsinRanking", "maxRank"), envir=environment())
    print(paste('Running RcisTarget...'))
    cisTopic.cisTarget <- suppressWarnings(llply(1:length(topicsList), function (i) cisTarget(topicsList[[i]],
                                    motifRankings,
                                    motifAnnot = motifAnnot,
                                    nesThreshold = nesThreshold,
                                    aucMaxRank = rocthr * columnsinRanking,
                                    geneErnMaxRank = maxRank,
                                    nCores=1
    ), .parallel = TRUE))
    stopCluster(cl)
  }
  else{
    cisTopic.cisTarget <- suppressWarnings(llply(1:length(topicsList), function (i) cisTarget(topicsList[[i]],
                                                                                             motifRankings,
                                                                                             motifAnnot = motifAnnot,
                                                                                             nesThreshold = nesThreshold,
                                                                                             aucMaxRank = rocthr * columnsinRanking,
                                                                                             geneErnMaxRank = maxRank,
                                                                                             nCores=1
    )))
  }

  object.binarized.RcisTarget <- list()

  for (i in 1:length(cisTopic.cisTarget)){
    if(nrow(cisTopic.cisTarget[[i]]) > 0){
      colnames(cisTopic.cisTarget[[i]])[c(1, 7, 9)] <- c('cisTopic', 'nEnrRegions', 'enrichedRegions')
      cisTopic.cisTarget[[i]]$cisTopic <- rep(paste('Topic', i, sep='_'), nrow(cisTopic.cisTarget[[i]]))
      object.binarized.RcisTarget[[i]] <- addLogo(cisTopic.cisTarget[[i]])
    } else {
      cisTopic.cisTarget[[i]] <- NULL
    }

  }

  object@binarized.RcisTarget <- object.binarized.RcisTarget
  return(object)
}

I would also lower the number of cores a bit, because it will load the feather database in each of them (and that can take up a lot of memory).
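As a rough illustration of the advice above, the sketch below caps the worker count by both the number of topics and a memory budget. `pickCores`, `bytesPerWorker` and `memBudgetBytes` are hypothetical names, not part of cisTopic, and the assumption that each worker needs about one full copy of the rankings is only an approximation.

```r
# Hypothetical helper (not part of cisTopic): choose a worker count that does not
# exceed the number of topics, the available cores, or a rough memory budget.
# Assumption: each worker holds roughly one copy of the rankings (bytesPerWorker).
pickCores <- function(nTopics, bytesPerWorker, memBudgetBytes,
                      maxCores = parallel::detectCores()) {
  byMemory <- floor(memBudgetBytes / bytesPerWorker)
  # na.rm guards against detectCores() returning NA on some platforms
  max(1L, as.integer(min(nTopics, byMemory, maxCores, na.rm = TRUE)))
}

# Example with made-up numbers: 40 topics, ~4 GB per worker, 32 GB budget, 24 cores
nCoresToUse <- pickCores(nTopics = 40, bytesPerWorker = 4e9,
                         memBudgetBytes = 32e9, maxCores = 24)
print(nCoresToUse)
```

The result could then be passed as `nCores` to `topicsRcisTarget`. The in-memory size of the loaded rankings may differ from the on-disk size of the feather file, so the budget is best treated as a loose upper bound.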

Let me know if this solves it!

C

jk86754 commented 5 years ago

Unfortunately, same issue. I am wondering if I am having a Docker-related issue: my RStudio is running in a Docker container, and I am starting to think this is the cause. I can run the command with nCores=1 (although it does take forever); the error comes up with nCores>1. As far as I recall, I had to do the same thing for SCENIC.
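One way to narrow this down is to check whether the worker processes see the same library paths as the main session. The error says the workers cannot find 'feather', even though it loads in the main session. The following is a diagnostic sketch using only the base `parallel` package. It is not a fix confirmed in this thread, and the final step (copying the main session's library paths to the workers) is an assumption about what might help.

```r
library(parallel)

# Two workers are enough for a diagnostic run
cl <- makeCluster(2)

# Library search paths in the main session and on each worker
master_paths <- .libPaths()
worker_paths <- clusterEvalQ(cl, .libPaths())
print(master_paths)
print(worker_paths)

# Can each worker locate an installed 'feather'? (does not load the package)
worker_has_feather <- unlist(clusterCall(cl, function() {
  nzchar(system.file(package = "feather"))
}))
print(worker_has_feather)

# Possible workaround (assumption): give the workers the main session's paths
invisible(clusterCall(cl, function(paths) .libPaths(paths), master_paths))
worker_paths_after <- clusterEvalQ(cl, .libPaths())

stopCluster(cl)
```

If `worker_has_feather` is `FALSE` while the package loads in the main session, the workers are probably starting with a different library location, for example because the container sets `R_LIBS_USER` only for the RStudio session.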