GreenleafLab / ArchR

ArchR : Analysis of Regulatory Chromatin in R (www.ArchRProject.com)
MIT License

Error with Arrow ArchRProj #1491

Closed rustycandlewick closed 2 years ago

rustycandlewick commented 2 years ago

This is an issue template made by the developers of ArchR. You MUST follow these instructions.

Questions related to how to use ArchR or requests for new features should be posted in the Discussions forum (https://github.com/GreenleafLab/ArchR/discussions).

Before you submit this Bug Report please update ArchR to the latest stable version and make sure that this issue has not already been fixed in the latest release. ArchR is still in active development and we will fix problems as they arise. To update ArchR:

devtools::install_github("GreenleafLab/ArchR", ref="master", repos = BiocManager::repositories())

If your issue persists, then please submit this bug report.

PLEASE FILL OUT THE RELEVANT INFORMATION AND DELETE THE UNUSED PORTIONS OF THIS ISSUE TEMPLATE.

Attach your log file

ArrowFiles <- createArrowFiles(
  inputFiles = '/home/rpulya1/data-sblacks1/Ritvik/SignacTutorial/10xMultiome/atac_fragments.tsv.gz',
  minTSS = 4, # Don't set this too high because you can always increase later
  minFrags = 1000,
  sampleNames = '1',
  addTileMat = TRUE,
  addGeneScoreMat = TRUE,
  geneAnnotation = getGeneAnnotation(geneAnnotation),
  genomeAnnotation = getGenomeAnnotation(genomeAnnotation)
)

Error in .validInput(input = ArchRProj, name = "ArchRProj", valid = c("ArchRProject", : Input value for 'ArchRProj' is not a null,archrproject, (ArchRProj = SimpleList) please supply valid input!

Describe the bug

I am unable to create an Arrow file and cannot tell from the error above what the problem is. If it helps, I only have access to R version 4.0.2, so I had to use the refGene annotation instead of ensGene.

To Reproduce

The tutorial code and dataset work. With my own dataset, the error appears immediately.

Expected behavior

I expected the arrow file to be created.

Session Info

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 8 (Core)

Matrix products: default
BLAS: /data/apps/linux-centos8-cascadelake/gcc-9.3.0/r-4.0.2-amdvcpog4ugspqwwx3ari7pzkmckelu6/rlib/R/lib/libRblas.so
LAPACK: /data/apps/linux-centos8-cascadelake/gcc-9.3.0/r-4.0.2-amdvcpog4ugspqwwx3ari7pzkmckelu6/rlib/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] org.Dr.eg.db_3.12.0
[2] RMariaDB_1.2.2
[3] org.Dm.eg.db_3.12.0
[4] TxDb.Drerio.UCSC.danRer11.refGene_3.4.6
[5] GenomicFeatures_1.42.3
[6] AnnotationDbi_1.52.0
[7] BSgenome.Drerio.UCSC.danRer11_1.4.2
[8] BSgenome_1.58.0
[9] rtracklayer_1.50.0
[10] Biostrings_2.58.0
[11] XVector_0.30.0
[12] ArchR_1.0.1
[13] magrittr_2.0.3
[14] rhdf5_2.34.0
[15] Matrix_1.4-1
[16] data.table_1.14.2
[17] SummarizedExperiment_1.20.0
[18] Biobase_2.50.0
[19] GenomicRanges_1.42.0
[20] GenomeInfoDb_1.30.1
[21] IRanges_2.24.1
[22] S4Vectors_0.28.1
[23] BiocGenerics_0.38.0
[24] MatrixGenerics_1.2.1
[25] matrixStats_0.62.0
[26] ggplot2_3.3.6

loaded via a namespace (and not attached):
[1] utf8_1.2.2 reticulate_1.25 tidyselect_1.1.2
[4] RSQLite_2.2.14 htmlwidgets_1.5.4 grid_4.0.2
[7] BiocParallel_1.24.1 Rtsne_0.16 devtools_2.4.3
[10] munsell_0.5.0 codetools_0.2-18 ica_1.0-2
[13] future_1.26.1 miniUI_0.1.1.1 withr_2.5.0
[16] spatstat.random_2.2-0 colorspace_2.0-3 progressr_0.10.1
[19] Seurat_4.1.1 ROCR_1.0-11 tensor_1.5
[22] listenv_0.8.0 GenomeInfoDbData_1.2.4 polyclip_1.10-0
[25] bit64_4.0.5 rprojroot_2.0.3 parallelly_1.32.0
[28] vctrs_0.4.1 generics_0.1.2 BiocFileCache_1.14.0
[31] R6_2.5.1 bitops_1.0-7 rhdf5filters_1.2.1
[34] spatstat.utils_2.3-1 cachem_1.0.6 DelayedArray_0.16.3
[37] assertthat_0.2.1 promises_1.2.0.1 scales_1.2.0
[40] rgeos_0.5-9 gtable_0.3.0 Cairo_1.5-15
[43] globals_0.15.0 processx_3.5.3 goftest_1.2-3
[46] rlang_1.0.2 splines_4.0.2 lazyeval_0.2.2
[49] spatstat.geom_2.4-0 BiocManager_1.30.18 reshape2_1.4.4
[52] abind_1.4-5 httpuv_1.6.5 tools_4.0.2
[55] usethis_2.1.5 ellipsis_0.3.2 spatstat.core_2.4-2
[58] RColorBrewer_1.1-3 sessioninfo_1.2.2 ggridges_0.5.3
[61] Rcpp_1.0.8.3 plyr_1.8.7 progress_1.2.2
[64] zlibbioc_1.36.0 purrr_0.3.4 RCurl_1.98-1.7
[67] ps_1.7.0 prettyunits_1.1.1 openssl_2.0.2
[70] rpart_4.1.16 deldir_1.0-6 pbapply_1.5-0
[73] cowplot_1.1.1 zoo_1.8-10 SeuratObject_4.1.0
[76] ggrepel_0.9.1 cluster_2.1.3 fs_1.5.2
[79] scattermore_0.8 lmtest_0.9-40 RANN_2.6.1
[82] fitdistrplus_1.1-8 pkgload_1.2.4 hms_1.1.1
[85] patchwork_1.1.1 mime_0.12 xtable_1.8-4
[88] XML_3.99-0.10 gridExtra_2.3 testthat_3.1.4
[91] compiler_4.0.2 biomaRt_2.46.3 tibble_3.1.7
[94] KernSmooth_2.23-20 crayon_1.5.1 htmltools_0.5.2
[97] mgcv_1.8-40 later_1.2.0 tidyr_1.2.0
[100] lubridate_1.8.0 DBI_1.1.3 dbplyr_2.2.0
[103] rappdirs_0.3.3 MASS_7.3-57 brio_1.1.3
[106] cli_3.3.0 igraph_1.3.1 pkgconfig_2.0.3
[109] GenomicAlignments_1.26.0 sp_1.5-0 plotly_4.10.0
[112] spatstat.sparse_2.1-1 xml2_1.3.3 stringr_1.4.0
[115] callr_3.7.0 digest_0.6.29 sctransform_0.3.3
[118] RcppAnnoy_0.0.19 spatstat.data_2.2-0 leiden_0.4.2
[121] uwot_0.1.11 curl_4.3.2 shiny_1.7.1
[124] Rsamtools_2.6.0 lifecycle_1.0.1 nlme_3.1-157
[127] jsonlite_1.8.0 Rhdf5lib_1.12.1 askpass_1.1
[130] desc_1.4.1 viridisLite_0.4.0 fansi_1.0.3
[133] pillar_1.7.0 lattice_0.20-45 fastmap_1.1.0
[136] httr_1.4.3 pkgbuild_1.3.1 survival_3.3-1
[139] glue_1.6.2 remotes_2.4.2 png_0.1-7
[142] shinythemes_1.2.0 rhandsontable_0.3.8 bit_4.0.4
[145] stringi_1.7.6 blob_1.2.3 memoise_2.0.1
[148] dplyr_1.0.9 irlba_2.3.5 future.apply_1.9.0

rcorces commented 2 years ago

Hi @rustycandlewick! Thanks for using ArchR! Please make sure that your post belongs in the Issues section. Only bugs and error reports belong in the Issues section. Usage questions and feature requests should be posted in the Discussions section, not in Issues.
Before we help you, you must respond to the following questions unless your original post already contained this information:

1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved?
2. Can you recapitulate your error using the tutorial code and dataset? If so, provide a reproducible example.
3. Did you post your log file? If not, add it now.
4. Remove any screenshots that contain text and instead copy and paste the text using markdown's codeblock syntax (three consecutive backticks). You can do this by editing your original post.

rcorces commented 2 years ago

I'm having trouble following your issue post.

1. Please post your log file.
2. It seems impossible that the error you have posted is coming from the createArrowFiles() function.
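For readers unsure where the log file lives, here is a minimal sketch in base R for locating the newest one. It assumes logs are written as `.log` files to an `ArchRLogs` folder under the working directory of the run; the helper name `latest_archr_log` is hypothetical and not part of ArchR. If a call fails before the function starts logging, there may be no log for that call.

```r
# Hypothetical helper (not an ArchR function): return the most recently
# modified .log file in log_dir, or NA if there is none.
# Assumption: ArchR wrote its logs to "ArchRLogs" under the working directory.
latest_archr_log <- function(log_dir = "ArchRLogs") {
  logs <- list.files(log_dir, pattern = "\\.log$", full.names = TRUE)
  if (length(logs) == 0) {
    return(NA_character_)
  }
  logs[which.max(file.mtime(logs))]
}

log_path <- latest_archr_log()
if (!is.na(log_path)) {
  # Print the log so it can be pasted into the issue as text.
  cat(readLines(log_path), sep = "\n")
} else {
  message("No log file found; check the working directory used for the run.")
}
```

The printed text can then be pasted into the issue inside a code block rather than attached as a screenshot.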

rustycandlewick commented 2 years ago

Hi, I'm not exactly sure what you mean by the log file, but here is the code I ran and the output again:

ArrowFiles <- createArrowFiles(
  inputFiles = 'atac_fragments.tsv.gz',
  minTSS = 4, # Don't set this too high because you can always increase later
  minFrags = 1000,
  sampleNames = '1',
  addTileMat = TRUE,
  addGeneScoreMat = TRUE,
  geneAnnotation = getGeneAnnotation(geneAnnotation),
  genomeAnnotation = getGenomeAnnotation(genomeAnnotation)
)

Error in .validInput(input = ArchRProj, name = "ArchRProj", valid = c("ArchRProject", : Input value for 'ArchRProj' is not a null,archrproject, (ArchRProj = SimpleList) please supply valid input!

rcorces commented 2 years ago

This is your problem:

getGeneAnnotation(geneAnnotation)

geneAnnotation is not an ArchRProject object. If you created your own geneAnnotation, then just pass that to the parameter.
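The distinction can be sketched in a few lines of R. The helper `resolve_gene_annotation` and the placeholder `custom_anno` are hypothetical names used only for illustration: the helper calls `getGeneAnnotation()` only when it is handed an ArchRProject and otherwise returns its input unchanged, which is the case for an annotation you built yourself.

```r
# Hypothetical helper (not part of ArchR): accept either an ArchRProject
# or an annotation object and return the annotation.
resolve_gene_annotation <- function(x) {
  if (inherits(x, "ArchRProject")) {
    # Only reached for a real project; requires ArchR to be installed.
    ArchR::getGeneAnnotation(x)
  } else {
    # Already an annotation object: pass it through untouched.
    x
  }
}

# Placeholder standing in for a custom annotation object.
custom_anno <- list(genes = NULL, exons = NULL, TSS = NULL)

# A custom annotation is returned as-is, with no call to getGeneAnnotation().
unchanged <- identical(resolve_gene_annotation(custom_anno), custom_anno)
```

In practice no helper is needed: a custom annotation can be handed to the `geneAnnotation` parameter directly.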

rustycandlewick commented 2 years ago

Thanks, but my gene annotation is stored in an object called geneAnnotation. If I understand correctly, shouldn't that be right, since I'm calling the function on the object I created?

rustycandlewick commented 2 years ago

I tried the above and renamed the object, but I get the same error. Can we reopen this issue?

ArrowFiles <- createArrowFiles(
  inputFiles = 'atac_fragments.tsv.gz',
  minTSS = 4, # Don't set this too high because you can always increase later
  minFrags = 1000,
  sampleNames = '1',
  addTileMat = TRUE,
  addGeneScoreMat = TRUE,
  geneAnnotation = getGeneAnnotation(genez),
  genomeAnnotation = getGenomeAnnotation(genomez)
)

Error in .validInput(input = ArchRProj, name = "ArchRProj", valid = c("ArchRProject", : Input value for 'ArchRProj' is not a null,archrproject, (ArchRProj = SimpleList) please supply valid input!

rcorces commented 2 years ago

I've already given the answer. Please read the function documentation for the functions you are attempting to use. getGeneAnnotation() is a function that accepts an ArchRProject object as input. You are calling this function on something that is not an ArchRProject. Your command should say geneAnnotation = genez (and, likewise, genomeAnnotation = genomez).
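A minimal sketch of the corrected call follows, assuming `genez` and `genomez` are the custom annotation objects from the post above. The helper `build_arrow_args` is a hypothetical name introduced here so the arguments can be inspected before anything runs. The actual call is guarded and only executes when ArchR is installed, both objects exist, and the fragments file is present.

```r
# Hypothetical helper: collect the createArrowFiles() arguments, passing the
# custom annotation objects directly instead of through the get*() accessors.
build_arrow_args <- function(gene_anno, genome_anno, fragments) {
  list(
    inputFiles = fragments,
    sampleNames = "1",
    minTSS = 4,
    minFrags = 1000,
    addTileMat = TRUE,
    addGeneScoreMat = TRUE,
    geneAnnotation = gene_anno,      # not getGeneAnnotation(gene_anno)
    genomeAnnotation = genome_anno   # not getGenomeAnnotation(genome_anno)
  )
}

fragments <- "atac_fragments.tsv.gz"

# Run only when everything the call needs is actually available.
if (requireNamespace("ArchR", quietly = TRUE) &&
    exists("genez") && exists("genomez") && file.exists(fragments)) {
  ArrowFiles <- do.call(
    ArchR::createArrowFiles,
    build_arrow_args(genez, genomez, fragments)
  )
}
```

The `get*()` accessors remain the right tool once an ArchRProject exists and its stored annotations need to be retrieved.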

rustycandlewick commented 2 years ago

I see, thanks for the clarification!