grimbough / biomaRt

R package providing query functionality to BioMart instances like Ensembl
https://bioconductor.org/packages/biomaRt/

how to force getBM to use https link instead of http #26

Closed vipints closed 3 years ago

vipints commented 3 years ago

Hello biomaRt team,

I have a compute setup where I need to access external services via a proxy server, and the proxy server supports only HTTPS access. Here is the case:

> library(biomaRt)
> mart_obj = useMart("ensembl", host="https://www.ensembl.org", dataset="hsapiens_gene_ensembl", port = 443)
> id = '075534'
> bm = getBM(attributes=c("peptide", "ensembl_transcript_id", "description", "ensembl_gene_id", "hgnc_symbol", "gene_biotype", "uniprotswissprot"), filters="uniprotswissprot",values=id, mart=mart_obj)
Error: failed to load external entity "http://www.ensembl.org/info/website/archives/index.html?redirect=no"

I am using R version 3.6.1 and biomaRt_2.42.0. Is there any way to force the getBM function to use the HTTPS Ensembl link to access the data?

Thanks in advance!

grimbough commented 3 years ago

Hi,

Thanks for your interest in the package, and for highlighting this bug. I hadn't realised that HTTP was always used for finding the list of archives, regardless of which options are passed to useMart().

I've made a small update, which you can try from here on GitHub via:

BiocManager::install('grimbough/biomaRt')

Your code above will hopefully then work. I'll also point out that you can create the mart object with useEnsembl(), which has some Ensembl-specific defaults set, including using HTTPS and port 443, so you shouldn't have to set them manually, e.g.

mart_ob <- useEnsembl(biomart = "ensembl", dataset = 'hsapiens_gene_ensembl')

Let me know if it works for you. I've been experiencing some HTTPS issues with recent versions of Ubuntu and the Ensembl site that may also be playing a part.
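
For anyone who still builds the mart with useMart() rather than useEnsembl(), here is a minimal sketch of making sure the host string itself always carries the https:// scheme. Note that force_https() is a hypothetical helper written for illustration, not part of biomaRt, and it only normalises the host you pass in. It does not change any URL the package builds internally, which is what the update above addresses.

```r
# Hypothetical helper (not part of biomaRt): normalise a host string to HTTPS.
force_https <- function(host) {
  host <- trimws(host)
  # Drop an existing http:// or https:// scheme, ignoring case.
  bare <- sub("^https?://", "", host, ignore.case = TRUE)
  # Re-attach the HTTPS scheme.
  paste0("https://", bare)
}

ensembl_host <- force_https("http://www.ensembl.org")
print(ensembl_host)

# Usage with biomaRt (needs network access, so left commented out here);
# the arguments mirror the useMart() call from the original report:
# library(biomaRt)
# mart_obj <- useMart("ensembl", host = ensembl_host,
#                     dataset = "hsapiens_gene_ensembl", port = 443)
```

With the default useEnsembl() call shown above, this step should not be needed, since HTTPS and port 443 are already the defaults there.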

vipints commented 3 years ago

Thanks @grimbough for the suggestion. I have updated the biomaRt package to the version you mentioned. This helps to avoid the error message with http links.

> library(biomaRt)
> packageVersion("biomaRt") 
[1] ‘2.45.4’
> mart_obj = useEnsembl(biomart="ensembl", dataset="hsapiens_gene_ensembl")
> id = '075534'
> bm = getBM(attributes=c("peptide", "ensembl_transcript_id", "description", "ensembl_gene_id", "hgnc_symbol", "gene_biotype", "uniprotswissprot"), filters="uniprotswissprot",values=id, mart=mart_obj)
Warning messages:
1: `select_()` is deprecated as of dplyr 0.7.0.
Please use `select()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
2: `filter_()` is deprecated as of dplyr 0.7.0.
Please use `filter()` instead.
See vignette('programming') for more help
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
>