grimbough / biomaRt

R package providing query functionality to BioMart instances like Ensembl
https://bioconductor.org/packages/biomaRt/

getBM() error: Could not resolve host: NA #10

Closed Gurlaz closed 5 years ago

Gurlaz commented 5 years ago

The function getBM() gives me the following error:

Error in curl::curl_fetch_memory(url, handle = handle) : 
  Could not resolve host: NA

My script looks like this:

library(biomaRt)
listMarts()
listMarts(host = "biomart.vectorbase.org")
vectorbase_gene <- useMart(biomart = "vb_gene_mart_1810", host = "biomart.vectorbase.org")
mysets <- listDatasets(vectorbase_gene)
mysets
mydataset <- mysets$dataset[mysets$dataset == "astephensi_eg_gene"]
myusemart <- useDataset(as.character(mydataset), mart = vectorbase_gene)
allattributes <- listAttributes(mart = myusemart)
allattributes
resultTable <- getBM(attributes = c("ensembl_gene_id","ensembl_transcript_id", "start_position", "end_position","go_id","name_1006"), mart = myusemart, verbose = TRUE)
resultTable[1:10, ]  # first 10 rows; resultTable[1:10] would select columns 1 to 10

I have tried following https://github.com/grimbough/biomaRt/issues/3 to get individual attributes:

resultTable1 <- getBM(attributes = "ensembl_gene_id", mart = myusemart, verbose = TRUE)

But the error remains the same. I checked the BioMart site on VectorBase manually and it works fine, so I cannot tell whether the error is on my end or caused by a connection timeout on the server. My session info is:

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomaRt_2.39.2       BiocInstaller_1.30.0 org.Hs.eg.db_3.7.0   AnnotationDbi_1.44.0 IRanges_2.16.0      
[6] S4Vectors_0.20.1     Biobase_2.42.0       BiocGenerics_0.28.0 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0        compiler_3.5.1    prettyunits_1.0.2 bitops_1.0-6      remotes_2.0.2     tools_3.5.1      
 [7] progress_1.2.0    pkgbuild_1.0.2    digest_0.6.18     bit_1.1-14        RSQLite_2.1.1     memoise_1.1.0    
[13] pkgconfig_2.0.2   rlang_0.3.1       cli_1.0.1         DBI_1.0.0         rstudioapi_0.9.0  curl_3.3         
[19] yaml_2.2.0        withr_2.1.2       stringr_1.3.1     httr_1.4.0        hms_0.4.2         rprojroot_1.3-2  
[25] bit64_0.9-7       R6_2.3.0          processx_3.2.1    XML_3.98-1.16     callr_3.1.1       blob_1.1.1       
[31] magrittr_1.5      backports_1.1.3   ps_1.3.0          assertthat_0.2.0  stringi_1.2.4     RCurl_1.95-4.11  
[37] crayon_1.3.4

Any ideas are highly appreciated. Thanks!

cerdelyan commented 5 years ago

I've just tried to get biomaRt working with VectorBase myself. It will list datasets, but getBM() fails. The manual has a section on using other BioMarts, with WormBase as an example:

https://www.bioconductor.org/packages/devel/bioc/vignettes/biomaRt/inst/doc/biomaRt.html#using-a-biomart-other-than-ensembl

listMarts(host = "biomart.vectorbase.org")

vectorbase = useMart(biomart = "vb_gene_mart_1902", 
                     host = "https://biomart.vectorbase.org", 
                     port = 443)
listDatasets(vectorbase)

vectorbase <- useDataset(mart = vectorbase, dataset = "alvpagwg_eg_gene") # Aedes; change for other organisms

# geneUniverse is a character vector of gene names defined elsewhere (not shown in this post)
go_ids = getBM(attributes = c("go_id", "external_gene_name", "ensembl_gene_id","refseq_dna"), 
               filters="external_gene_name", 
               values=geneUniverse, 
               mart=vectorbase,
               verbose = TRUE)

Before I saw the manual, I found @grimbough's gist, which let me test whether the problem was on my end or the server's and experiment with the query URLs: https://gist.github.com/grimbough/1e44fbe7ab4e3638671ef5e11e7128db I just copied the XML output from VectorBase's BioMart and replaced some of the info.
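For anyone who wants to try the same kind of check without the gist, here is a rough sketch in base R of how such a query URL can be put together. It is not the gist's code. The helper names build_query_xml() and build_query_url() are made up for illustration, the "default" virtual schema is an assumption that may not match what VectorBase reports, and "/biomart/martservice" is simply the default path that useMart() uses.

```r
# Hypothetical helpers, not part of biomaRt: build a BioMart XML query and
# the matching martservice URL so the server can be tested directly.
build_query_xml <- function(dataset, attributes, virtual_schema = "default") {
  # One <Attribute/> element per requested attribute, joined into one string
  attr_xml <- paste0('<Attribute name="', attributes, '"/>', collapse = "")
  paste0(
    '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>',
    '<Query virtualSchemaName="', virtual_schema, '" formatter="TSV" ',
    'header="0" uniqueRows="1" count="" datasetConfigVersion="0.6">',
    '<Dataset name="', dataset, '" interface="default">',
    attr_xml,
    '</Dataset></Query>'
  )
}

build_query_url <- function(host, xml, path = "/biomart/martservice") {
  # Percent-encode the XML so it can be passed as the "query" parameter
  paste0(host, path, "?query=", utils::URLencode(xml, reserved = TRUE))
}

xml <- build_query_xml("alvpagwg_eg_gene", c("ensembl_gene_id", "go_id"))
url <- build_query_url("https://biomart.vectorbase.org", xml)
print(url)

# To actually send the request (needs network access):
# readLines(url, n = 10)
```

If pasting the printed URL into a browser returns rows of data, the server side is fine and the problem is more likely in how the host is passed to biomaRt.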

I also skimmed the BioMart manual for the version VectorBase uses (0.7): http://www.biomart.org/other/user-docs.pdf

In case anyone comes across this: I just saw today that grimbough posted the same approach for Ensembl Fungi data using biomaRt:

https://support.bioconductor.org/p/116770/#116775


Gurlaz commented 5 years ago

Thank you @cerdelyan, I tried your method and it works. Changing the host to the full URL (including https://) and setting port 443 did the trick. Thanks to you I have my GO IDs now.
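For reference, a minimal sketch of the change, assuming the only difference from my original script is the host and port. The helper full_host() is hypothetical (it is not a biomaRt function); it just makes sure the host carries an explicit scheme before it goes to useMart().

```r
# Hypothetical helper, not part of biomaRt: prepend a scheme when the host
# was given as a bare name such as "biomart.vectorbase.org".
full_host <- function(host, scheme = "https") {
  if (grepl("^https?://", host)) host else paste0(scheme, "://", host)
}

print(full_host("biomart.vectorbase.org"))
print(full_host("https://biomart.vectorbase.org"))

# Usage as in the thread (requires biomaRt and network access):
# library(biomaRt)
# vectorbase <- useMart(biomart = "vb_gene_mart_1902",
#                       host = full_host("biomart.vectorbase.org"),
#                       port = 443)
```

A host that already starts with http:// or https:// is returned unchanged, so the helper is safe to apply to either form.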

grimbough commented 5 years ago

Thanks @cerdelyan for answering this.