CharlesJB / ENCODExplorer


Support Drosophila melanogaster? #52

Open ashleymaeconard opened 4 years ago

ashleymaeconard commented 4 years ago

Hello! I have enjoyed using ENCODExplorer, but I would like to query for Drosophila melanogaster. I tried filtering with the organism, assembly, and biosample_type arguments. Do you support other organisms besides Homo sapiens and Mus musculus? I have been following: https://bioconductor.org/packages/release/bioc/vignettes/ENCODExplorer/inst/doc/ENCODExplorer.html and http://bioconductor.org/packages/release/bioc/manuals/ENCODExplorer/man/ENCODExplorer.pdf

ashleymaeconard commented 4 years ago

I should be more clear. I see:

> query <- queryEncode(organism="Drosophila melanogaster")
Results : 11757 files, 503 datasets

However, when searching for a target, for example:

> grep("cwo", query$target)
integer(0)

I get no matches, even though encodeproject.org does have results, namely ENCSR900TNL.

ericfournier2 commented 4 years ago

Hi, if you try:

queryEncodeGeneric(accession="ENCSR900TNL")

you will find that ENCODE lists the target for those experiments as "eGFP-cwo", which means that exact searches for target="cwo" will fail. Using fuzzy search would solve this issue:

queryEncodeGeneric(organism="Drosophila melanogaster", target="cwo", fuzzy=TRUE)

This works on my end.
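
To illustrate why an exact search for "cwo" misses a target labelled "eGFP-cwo", here is a minimal base-R sketch. The `targets` vector is toy data made up for this example (only "eGFP-cwo" comes from the thread), and the code only contrasts exact and substring matching; it does not claim to reproduce how ENCODExplorer implements `fuzzy=TRUE` internally.

```r
# Toy target labels (illustrative only). "eGFP-cwo" mirrors the label that
# ENCODE reports for ENCSR900TNL, as mentioned above.
targets <- c("eGFP-cwo", "CTCF", "H3K4me3")

# Exact matching: the whole label must equal the search term.
exact_hits <- which(targets == "cwo")

# Substring matching: the search term may appear anywhere in the label.
fuzzy_hits <- which(grepl("cwo", targets, fixed = TRUE))

print(exact_hits)  # no matches, because no label is exactly "cwo"
print(fuzzy_hits)  # index of "eGFP-cwo"
```

The same distinction applies to any tagged target: searching for the bare name only succeeds when partial matches are allowed.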

As an aside, if you're looking for peak information rather than raw ChIP data, you might find the queryConsensusPeaks and buildConsensusPeaks functions useful.

Cheers,

ashleymaeconard commented 4 years ago

Hello again! Thank you for your prompt response! In ENCODExplorer_2.4.0 I do not have queryEncodeGeneric. I am using R 3.6 and tried installing through Bioconductor, then conda. Neither gave me a version with that function. I followed this closed post to install ENCODExplorer_2.4.0: https://github.com/CharlesJB/ENCODExplorer/issues/42#event-2391438780

> library(ENCODExplorer)
> query
query        query_res    queryEncode
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS/LAPACK: /Users/ashleymaeconard/anaconda2/envs/timeor_env2/lib/R/lib/libRblas.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] purrr_0.3.3         modeest_2.4.0       dplyr_0.8.4
 [4] stringr_1.4.0       RcisTarget_1.6.0    data.table_1.12.8
 [7] visNetwork_2.0.9    doRNG_1.8.2         rngtools_1.5
[10] doMC_1.3.6          iterators_1.0.12    foreach_1.4.8
[13] ENCODExplorer_2.4.0 shinythemes_1.1.2   DT_0.12
[16] shiny_1.3.2

loaded via a namespace (and not attached):
 [1] bitops_1.0-6                matrixStats_0.55.0
 [3] bit64_0.9-7                 GenomeInfoDb_1.22.0
 [5] fBasics_3042.89             tools_3.6.1
 [7] utf8_1.1.4                  R6_2.4.1
 [9] rpart_4.1-15                DBI_1.1.0
[11] BiocGenerics_0.32.0         tidyselect_1.0.0
[13] timeSeries_3062.100         bit_1.1-15.2
[15] compiler_3.6.1              AUCell_1.8.0
[17] cli_2.0.1                   graph_1.64.0
[19] Biobase_2.46.0              DelayedArray_0.12.2
[21] spatial_7.3-11              digest_0.6.24
[23] R.utils_2.9.2               XVector_0.26.0
[25] pkgconfig_2.0.3             htmltools_0.4.0
[27] stabledist_0.7-1            htmlwidgets_1.5.1
[29] rlang_0.4.4                 RSQLite_2.2.0
[31] zoo_1.8-7                   jsonlite_1.6.1
[33] crosstalk_1.0.0             BiocParallel_1.20.1
[35] R.oo_1.23.0                 RCurl_1.95-4.12
[37] magrittr_1.5                feather_0.3.5
[39] GenomeInfoDbData_1.2.2      Matrix_1.2-18
[41] Rcpp_1.0.3                  S4Vectors_0.24.3
[43] fansi_0.4.1                 lifecycle_0.1.0
[45] R.methodsS3_1.8.0           stringi_1.4.3
[47] yaml_2.2.1                  SummarizedExperiment_1.16.1
[49] stable_1.1.4                zlibbioc_1.32.0
[51] grid_3.6.1                  blob_1.2.1
[53] promises_1.1.0              crayon_1.3.4
[55] lattice_0.20-38             annotate_1.64.0
[57] hms_0.5.3                   pillar_1.4.3
[59] GenomicRanges_1.38.0        statip_0.2.3
[61] codetools_0.2-16            rmutil_1.1.3
[63] stats4_3.6.1                XML_3.98-1.19
[65] glue_1.3.1                  vctrs_0.2.2
[67] httpuv_1.5.1                tidyr_1.0.2
[69] clue_0.3-57                 assertthat_0.2.1
[71] mime_0.9                    xtable_1.8-4
[73] later_1.0.0                 timeDate_3043.102
[75] tibble_2.1.3                AnnotationDbi_1.48.0
[77] memoise_1.1.0               IRanges_2.20.2
[79] cluster_2.1.0               GSEABase_1.48.0

In the tutorial I see that fuzzy is used as a parameter to queryEncode(). However, when looking at the code itself and trying it with ENCODExplorer_2.4.1, it is not a parameter that I can use:

> queryEncode
function (df = NULL, set_accession = NULL, assay = NULL, biosample_name = NULL,
    dataset_accession = NULL, file_accession = NULL, file_format = NULL,
    lab = NULL, organism = NULL, target = NULL, treatment = NULL,
    project = NULL, biosample_type = NULL, file_status = "released",
    status = "released", fixed = TRUE, quiet = FALSE)
...
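
A quick way to confirm what an installed version actually provides is to inspect the namespace directly rather than rely on tab completion. The sketch below uses only base R; `has_export` and `has_arg` are hypothetical helper names introduced for this example, not part of ENCODExplorer. The ENCODExplorer checks run only if the package is installed.

```r
# Hypothetical helper: TRUE if package `pkg` is installed and exports `fn`.
has_export <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

# Hypothetical helper: TRUE if function `f` has a formal argument named `arg`.
has_arg <- function(f, arg) {
  arg %in% names(formals(f))
}

# Sanity checks against base packages.
print(has_export("stats", "median"))
print(has_arg(grepl, "fixed"))

# Only inspect ENCODExplorer when it is actually installed.
if (has_export("ENCODExplorer", "queryEncode")) {
  print(packageVersion("ENCODExplorer"))
  print(has_export("ENCODExplorer", "queryEncodeGeneric"))
  print(has_arg(ENCODExplorer::queryEncode, "fuzzy"))
}
```

If the last two checks print FALSE, the installed release predates the function and argument used in the replies above.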

ericfournier2 commented 4 years ago

The latest release of BioConductor needs R >= 3.6, so there is no reason you shouldn't be able to install it. I think you may have confused BioConductor release 3.6 and R release 3.6.

You can find the instructions for installing ENCODExplorer on the BioConductor page: https://bioconductor.org/packages/release/bioc/html/ENCODExplorer.html

ashleymaeconard commented 3 years ago

Hi, do you have a version of ENCODExplorer that is compatible with R 3.6 and includes queryEncodeGeneric? I had loaded ENCODExplorer 2.12.1 and queryEncodeGeneric worked fine, but now it seems that the package has been removed from Bioconductor: https://bioconductor.org/help/search/index.html?q=ENCODExplorer/

Thank you also for your comment about BioConductor release 3.6 vs. R release 3.6. I am aware that they are different.