neurogenomics / EpiCompare

Comparison, benchmarking & QC of epigenetic datasets
https://doi.org/doi:10.18129/B9.bioc.EpiCompare
`import_narrowPeak` #58

Closed by bschilder 2 years ago

bschilder commented 2 years ago

Noticed that `import_narrowPeak` was removed. I recall you mentioning that a Bioc reviewer didn't like this function because it "reinvented the wheel". But did they ever try running an example to prove this?

The reason I added that function is that neither rtracklayer nor ChIPseeker is able to import these files. Each of the following calls returns an error:

URL <- "https://www.encodeproject.org/files/ENCFF044JNJ/@@download/ENCFF044JNJ.bed.gz"
encode_ac <- rtracklayer::import(URL)
encode_ac <- rtracklayer::import.bed(URL)
encode_ac <- ChIPseeker::readPeakFile(URL)

If there is an alternative method that works better, I'm happy to use that. Otherwise, we should add `import_narrowPeak` back into EpiCompare so that we have a means of importing files from ENCODE and other sources.

Session info

```
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
  [1] utf8_1.2.2 tidyselect_1.1.2
  [3] htmlwidgets_1.5.4 RSQLite_2.2.14
  [5] AnnotationDbi_1.58.0 grid_4.2.0
  [7] BiocParallel_1.30.0 scatterpie_0.1.7
  [9] munsell_0.5.0 colorspace_2.0-3
 [11] GOSemSim_2.22.0 Biobase_2.56.0
 [13] filelock_1.0.2 knitr_1.39
 [15] rstudioapi_0.13 stats4_4.2.0
 [17] DOSE_3.22.0 MatrixGenerics_1.8.0
 [19] GenomeInfoDbData_1.2.8 polyclip_1.10-0
 [21] seqPattern_1.28.0 bit64_4.0.5
 [23] farver_2.1.0 rprojroot_2.0.3
 [25] downloader_0.4 vctrs_0.4.1
 [27] treeio_1.20.0 generics_0.1.2
 [29] xfun_0.31 BiocFileCache_2.4.0
 [31] R6_2.5.1 GenomeInfoDb_1.32.1
 [33] graphlayouts_0.8.0 locfit_1.5-9.5
 [35] bitops_1.0-7 BRGenomics_1.8.0
 [37] cachem_1.0.6 fgsea_1.22.0
 [39] gridGraphics_0.5-1 DelayedArray_0.22.0
 [41] assertthat_0.2.1 promises_1.2.0.1
 [43] BiocIO_1.6.0 scales_1.2.0
 [45] ggraph_2.0.5 enrichplot_1.16.0
 [47] gtable_0.3.0 tidygraph_1.2.1
 [49] rlang_1.0.2 genefilter_1.78.0
 [51] splines_4.2.0 rtracklayer_1.56.0
 [53] lazyeval_0.2.2 impute_1.70.0
 [55] plyranges_1.16.0 BiocManager_1.30.17
 [57] yaml_2.3.5 reshape2_1.4.4
 [59] GenomicFeatures_1.48.0 httpuv_1.6.5
 [61] qvalue_2.28.0 clusterProfiler_4.4.1
 [63] tools_4.2.0 ggplotify_0.1.0
 [65] gridBase_0.4-7 ggplot2_3.3.6
 [67] ellipsis_0.3.2 gplots_3.1.3
 [69] RColorBrewer_1.1-3 BiocGenerics_0.42.0
 [71] Rcpp_1.0.8.3 plyr_1.8.7
 [73] progress_1.2.2 zlibbioc_1.42.0
 [75] purrr_0.3.4 RCurl_1.98-1.6
 [77] prettyunits_1.1.1 viridis_0.6.2
 [79] S4Vectors_0.34.0 SummarizedExperiment_1.26.1
 [81] ggrepel_0.9.1 here_1.0.1
 [83] magrittr_2.0.3 data.table_1.14.2
 [85] DO.db_2.9 matrixStats_0.62.0
 [87] evaluate_0.15 hms_1.1.1
 [89] patchwork_1.1.1 mime_0.12
 [91] xtable_1.8-4 XML_3.99-0.9
 [93] readxl_1.4.0 IRanges_2.30.0
 [95] gridExtra_2.3 compiler_4.2.0
 [97] biomaRt_2.52.0 tibble_3.1.7
 [99] KernSmooth_2.23-20 crayon_1.5.1
[101] shadowtext_0.1.2 htmltools_0.5.2
[103] tzdb_0.3.0 ggfun_0.0.6
[105] later_1.3.0 tidyr_1.2.0
[107] geneplotter_1.74.0 aplot_0.1.4
[109] DBI_1.1.2 tweenr_1.0.2
[111] ChIPseeker_1.32.0 genomation_1.28.0
[113] dbplyr_2.1.1 MASS_7.3-57
[115] rappdirs_0.3.3 boot_1.3-28
[117] Matrix_1.4-1 readr_2.1.2
[119] cli_3.3.0 parallel_4.2.0
[121] igraph_1.3.1 GenomicRanges_1.48.0
[123] pkgconfig_2.0.3 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[125] GenomicAlignments_1.32.0 plotly_4.10.0
[127] xml2_1.3.3 ggtree_3.4.0
[129] annotate_1.74.0 XVector_0.36.0
[131] yulab.utils_0.0.4 stringr_1.4.0
[133] digest_0.6.29 Biostrings_2.64.0
[135] cellranger_1.1.0 rmarkdown_2.14
[137] fastmatch_1.1-3 tidytree_0.3.9
[139] EpiCompare_0.99.16 restfulr_0.0.13
[141] curl_4.3.2 shiny_1.7.1
[143] Rsamtools_2.12.0 gtools_3.9.2
[145] rjson_0.2.21 lifecycle_1.0.1
[147] nlme_3.1-157 jsonlite_1.8.0
[149] viridisLite_0.4.0 BSgenome_1.64.0
[151] fansi_1.0.3 pillar_1.7.0
[153] lattice_0.20-45 KEGGREST_1.36.0
[155] fastmap_1.1.0 httr_1.4.3
[157] plotrix_3.8-2 survival_3.3-1
[159] GO.db_3.15.0 interactiveDisplayBase_1.34.0
[161] glue_1.6.2 remotes_2.4.2
[163] UpSetR_1.4.0 png_0.1-7
[165] BiocVersion_3.15.2 bit_4.0.4
[167] ggforce_0.3.3 stringi_1.7.6
[169] blob_1.2.3 DESeq2_1.36.0
[171] org.Hs.eg.db_3.15.0 AnnotationHub_3.4.0
[173] caTools_1.18.2 memoise_2.0.1
[175] dplyr_1.0.9 ape_5.6-2
```

bschilder commented 2 years ago

Ok, so I see they provided an example here. I wasn't aware of this functionality, and as far as I know it isn't well documented (`?import` doesn't have any documentation for some reason). But if it works consistently, then it makes sense to use it.

URL <- "https://www.encodeproject.org/files/ENCFF044JNJ/@@download/ENCFF044JNJ.bed.gz"
encode_ac <- rtracklayer::import(URL, format="narrowPeak")
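
For anyone who wants to try this without downloading from ENCODE, here is a minimal sketch on a small local file. The two peaks are made-up values, and the file is written to a temporary path purely for illustration. The second call, using `extraCols`, is an alternative added here for comparison and is not part of the reviewer's example. My working assumption (not verified against the errors above) is that the plain calls fail because narrowPeak is a BED6+4 layout, and a `.bed.gz` extension leads rtracklayer to parse the file as standard BED unless told otherwise.

```r
library(rtracklayer)

# Two hypothetical peaks in narrowPeak (BED6+4) layout, tab-separated:
# chrom, start, end, name, score, strand, signalValue, pValue, qValue, peak
peaks <- c(
  "chr1\t100\t200\tpeak1\t500\t.\t12.5\t8.1\t6.3\t40",
  "chr1\t300\t450\tpeak2\t800\t.\t20.0\t15.2\t12.9\t75"
)
tmp <- tempfile(fileext = ".bed")
writeLines(peaks, tmp)

# Option 1: state the format explicitly, as in the example above.
gr1 <- import(tmp, format = "narrowPeak")

# Option 2 (assumed equivalent): parse as BED and declare the four
# extra narrowPeak columns and their types by hand.
extra_cols <- c(signalValue = "numeric", pValue = "numeric",
                qValue = "numeric", peak = "integer")
gr2 <- import(tmp, format = "BED", extraCols = extra_cols)

# Inspect the imported ranges and their metadata columns.
print(gr1)
print(names(S4Vectors::mcols(gr2)))
```

If both options give the same `GRanges` on real ENCODE files too, the explicit `format = "narrowPeak"` call is the shorter one to use inside EpiCompare.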