myles-lewis / locuszoomr

A pure R implementation of locuszoom for plotting genetic data at genomic loci accompanied by gene annotations.
GNU General Public License v3.0

error with link_recomb #14

Closed: lalalammy closed this issue 6 months ago

lalalammy commented 7 months ago

I'm getting the following error message when using the link_recomb function:

APP.recomb <- link_recomb(APP.test)
Retrieving recombination data from UCSC
Error in normArgTable(value, x) : Table 'recomb1000GAvg' is unavailable

myles-lewis commented 7 months ago

Hi lalalammy,

Difficult to help without a reprex (reproducible example). Can you either share the APP.test object (via a link to Dropbox, OneDrive etc.) or produce longer code which reproduces the error every time? What was the code for your initial call to locus()? Also, what OS are you on, what version of R and what version of locuszoomr etc. (maybe provide the sessionInfo() printout)? Is your error reproducible every time?

Bw, Myles

lalalammy commented 7 months ago

Hi, thanks for your response and thank you in advance for your help.

The sessionInfo() output is at the bottom of this post. I am also sharing a link to the .Rdata file containing the locus object itself. The code for the initial call to create the locus is below:

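library(AnnotationHub)
library(locuszoomr)

# List the human EnsDb annotation databases available on AnnotationHub
ah <- AnnotationHub()
query(ah, c("EnsDb", "Homo sapiens"))

# Retrieve the EnsDb record AH113665
ensDb_v110 <- ah[["AH113665"]]

# Build the locus object for APP with rs463946 as the index SNP
APP.test <- locus(data = as.data.frame(app.data.wRSIDS),
                  gene = "APP",
                  chrom = "CHR",
                  pos = "POS",
                  p = "p",
                  labs = "refsnp_id",
                  index_snp = "rs463946",
                  ens_db = ensDb_v110)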

And to answer your question about whether this error is reproducible: yes, I got the same error when trying to plot another locus during an entirely different work session, using a similar approach.

sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] ensembldb_2.26.0 AnnotationFilter_1.26.0 GenomicFeatures_1.54.1 AnnotationDbi_1.64.1 Biobase_2.62.0
[6] GenomicRanges_1.54.1 GenomeInfoDb_1.38.1 IRanges_2.36.0 S4Vectors_0.40.2 biomaRt_2.58.0
[11] AnnotationHub_3.10.0 BiocFileCache_2.10.1 dbplyr_2.4.0 BiocGenerics_0.48.1 locuszoomr_0.2.0
[16] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
[21] readr_2.1.4 tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.4 tidyverse_2.0.0

loaded via a namespace (and not attached):
[1] rstudioapi_0.15.0 jsonlite_1.8.8 magrittr_2.0.3 rmarkdown_2.25
[5] BiocIO_1.12.0 zlibbioc_1.48.0 vctrs_0.6.4 memoise_2.0.1
[9] Rsamtools_2.18.0 RCurl_1.98-1.13 base64enc_0.1-3 htmltools_0.5.7
[13] S4Arrays_1.2.0 progress_1.2.2 curl_5.1.0 SparseArray_1.2.2
[17] Formula_1.2-5 htmlwidgets_1.6.3 plotly_4.10.3 zoo_1.8-12
[21] cachem_1.0.8 gggrid_0.2-0 GenomicAlignments_1.38.0 mime_0.12
[25] lifecycle_1.0.4 pkgconfig_2.0.3 Matrix_1.6-4 R6_2.5.1
[29] fastmap_1.1.1 shiny_1.8.0 GenomeInfoDbData_1.2.11 MatrixGenerics_1.14.0
[33] digest_0.6.33 colorspace_2.1-0 ggsurvfit_1.0.0 Hmisc_5.1-1
[37] RSQLite_2.3.3 filelock_1.0.2 fansi_1.0.5 timechange_0.2.0
[41] httr_1.4.7 abind_1.4-5 compiler_4.3.2 bit64_4.0.5
[45] withr_2.5.2 htmlTable_2.4.2 backports_1.4.1 BiocParallel_1.36.0
[49] DBI_1.1.3 rappdirs_0.3.3 DelayedArray_0.28.0 rjson_0.2.21
[53] tools_4.3.2 foreign_0.8-85 interactiveDisplayBase_1.40.0 httpuv_1.6.12
[57] nnet_7.3-19 glue_1.6.2 restfulr_0.0.15 promises_1.2.1
[61] grid_4.3.2 checkmate_2.3.0 cluster_2.1.4 generics_0.1.3
[65] LDlinkR_1.3.0 gtable_0.3.4 tzdb_0.4.0 data.table_1.14.8
[69] hms_1.1.3 xml2_1.3.6 utf8_1.2.4 XVector_0.42.0
[73] BiocVersion_3.18.1 pillar_1.9.0 later_1.3.1 splines_4.3.2
[77] lattice_0.22-5 survival_3.5-7 rtracklayer_1.62.0 bit_4.0.5
[81] tidyselect_1.2.0 Biostrings_2.70.1 knitr_1.45 gridExtra_2.3
[85] ProtGenerics_1.34.0 SummarizedExperiment_1.32.0 xfun_0.41 matrixStats_1.1.0
[89] stringi_1.8.1 lazyeval_0.2.2 yaml_2.3.7 evaluate_0.23
[93] codetools_0.2-19 BiocManager_1.30.22 cli_3.6.1 rpart_4.1.21
[97] xtable_1.8-4 munsell_0.5.0 Rcpp_1.0.11 png_0.1-8
[101] XML_3.99-0.16 parallel_4.3.2 ellipsis_0.3.2 blob_1.2.4
[105] prettyunits_1.2.0 bitops_1.0-7 viridisLite_0.4.2 scales_1.3.0
[109] crayon_1.5.2 rlang_1.1.2 cowplot_1.1.2 KEGGREST_1.42.0

myles-lewis commented 7 months ago

Thanks for the rdata object. I've tried this several times (resetting the cache each time) on my system (mac) and it has not failed yet. It sounds like this is an intermittent error. I think this is not an error in locuszoomr. The error occurs with the functions ucscTableQuery() and getTable() which are part of the rtracklayer package. Perhaps it is a problem that sometimes occurs with the API link to the UCSC server?

I have added code to link_recomb() so that these errors fail gracefully (the original 'locus' object is returned), which allows users to use for loops etc. to request recombination data on multiple loci. Loci which have failed will have NULL in the $recomb slot of the 'locus' object. I suggest that you simply call link_recomb() again for those loci (you need to wait for the new version on GitHub, as otherwise the NULL value will simply be recalled via memoise).

I have added code to erase the memoise cache when an error occurs so that the request to the API can be renewed.
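
As a rough illustration of the retry suggestion above, a minimal sketch might look like the following. The helper name retry_recomb and its arguments fetch, max_tries and wait are hypothetical and not part of locuszoomr. The sketch assumes the updated link_recomb() returns the original 'locus' object with NULL in the $recomb slot when the UCSC request fails.

```r
# Hypothetical helper (not part of locuszoomr): re-request recombination data
# for every locus in a list whose $recomb slot is still NULL.
retry_recomb <- function(loci, fetch = locuszoomr::link_recomb,
                         max_tries = 3, wait = 5) {
  for (i in seq_along(loci)) {
    tries <- 0
    while (is.null(loci[[i]]$recomb) && tries < max_tries) {
      if (tries > 0) Sys.sleep(wait)  # pause before querying the server again
      # Guard against a hard error so one bad locus does not stop the batch
      res <- tryCatch(fetch(loci[[i]]), error = function(e) NULL)
      if (!is.null(res)) loci[[i]] <- res
      tries <- tries + 1
    }
  }
  loci
}
```

For example, retry_recomb(list(APP = APP.test)) would call link_recomb() on APP.test up to three times and stop as soon as $recomb is filled. A locus that fails for a genuine reason simply keeps NULL in $recomb once max_tries is reached.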

nickhir commented 7 months ago

Just to add to this, I have also noticed some odd behavior of link_recomb(). For the past few days, it has returned the following warning:

Warning messages:
1: In curlSetOpt(..., .opts = .opts, curl = h, .encoding = .encoding) :
  Error setting the option for # 3 (status = 43) (enum = 81) (value = 0x5625c6559530): A libcurl function was given a bad argument CURLOPT_SSL_VERIFYHOST no longer supports 1 as value!
2: In curlSetOpt(..., .opts = .opts, curl = h, .encoding = .encoding) :
  Error setting the option for # 3 (status = 43) (enum = 81) (value = 0x5625c76f9260): A libcurl function was given a bad argument CURLOPT_SSL_VERIFYHOST no longer supports 1 as value!

I am still able to retrieve the linkage data (and it gets plotted), but I can't cache it (i.e. this warning shows up on each call).

myles-lewis commented 7 months ago

Yes, this behaviour of the cache via memoise is deliberate, and it means that errors will keep being printed. The issue is that the API sometimes throws an error, which I can only assume is due to the server being busy etc. In that case the return value is NULL, as nothing was returned from the UCSC server. To cope with this, the function removes the cache entry (whether NULL or not) if there is an error, on the assumption that the user may wish to query UCSC again to retrieve the data. This is a workaround. It means that errors which are "ok" will keep being printed, and link_recomb() will query the server every time for an error-generating query rather than using the cache. There's not much I can do about this, as it is more desirable to be able to simply rerun a large batch of queries and fill in the ones that are missing due to a temporary server glitch. For example, I've noticed that for hg19 there isn't any recombination data for chromosome X. That's probably a genuine result, so the error cannot be surmounted and will occur every time calls to chromosome X are repeated. I hope that explains the logic of the current way the function is written.
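
The caching logic described above can be illustrated with a small sketch. This is not the locuszoomr implementation: a plain base-R environment stands in for the memoise cache, and the names make_cached_fetch, fetch and key are hypothetical. Failed queries are never stored, which has the same effect as removing the cache entry after an error.

```r
# Illustration only: a base-R environment stands in for the memoise cache.
# `fetch` is any function that takes a character key and returns data.
make_cached_fetch <- function(fetch) {
  cache <- new.env(parent = emptyenv())
  function(key) {
    if (exists(key, envir = cache, inherits = FALSE)) {
      return(get(key, envir = cache))     # cache hit: the server is not queried
    }
    result <- tryCatch(fetch(key), error = function(e) {
      message("query failed: ", conditionMessage(e))
      NULL
    })
    if (!is.null(result)) {
      assign(key, result, envir = cache)  # only successful results are cached
    }
    result                                # NULL on failure, so a rerun re-queries
  }
}
```

With this design a temporary server glitch is fixed by simply calling the function again, while a query that fails for a genuine reason (such as the hg19 chromosome X case) prints its message and reaches the server on every call.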

It looks like the warning you get is different, as a result is still generated. Can you try reinstalling locuszoomr and updating all of the relevant packages, and see whether the warning goes away with the latest versions of the ancillary packages?