Bioconductor / AnnotationForge

Tools for building SQLite-based annotation data packages
https://bioconductor.org/packages/AnnotationForge

NCBI ftp: download fails #59

Open tghazanchyan opened 1 month ago

tghazanchyan commented 1 month ago

In some environments, outbound access over FTP is blocked entirely because the protocol is insecure. Could you provide an option to download the NCBI data over HTTP instead? NCBI serves the same data over both FTP and HTTP.

hpages commented 1 month ago

Please show the code that fails, as well as the output of sessionInfo().

jmacdon commented 1 month ago

I believe OP is talking about this:

.downloadData <- function(file, tax_id, NCBIFilesDir, rebuildCache, verbose)
{
    if (verbose)
        message("getting data for ", names(file))

    ## NCBI connection
    if (is.null(NCBIFilesDir)) {
        NCBIcon <- dbConnect(SQLite(), dbname = tempfile())
        tmp <- tempfile()
    } else {
        NCBIcon <- dbConnect(SQLite(),
                             dbname = file.path(NCBIFilesDir, "NCBI.sqlite"))
        tmp <- file.path(NCBIFilesDir, names(file))
    }
    tableName <- sub(".gz","",names(file))

    ## download
    if (rebuildCache) {
        if(names(file) == "gene2unigene"){
            url <- paste0("ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ARCHIVE/", names(file))
        }else{
            url <- paste0("ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/", names(file))
        }

tghazanchyan commented 1 month ago

Thanks @jmacdon! Yes, I was referring to the hardcoded FTP links. It would be great to have a parameter for choosing between FTP and HTTP.
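
To make the request concrete, here is a minimal sketch of how the URL construction in the excerpt above might take a protocol argument. The helper name `.ncbiGeneUrl` and its `protocol` parameter are hypothetical and not part of AnnotationForge. The sketch also assumes that NCBI exposes the same `gene/DATA` paths over HTTPS on the `ftp.ncbi.nlm.nih.gov` host, as the original post suggests.

```r
## Hypothetical helper, not part of AnnotationForge: builds the NCBI
## download URL for one file, with the protocol chosen by the caller.
.ncbiGeneUrl <- function(fileName, protocol = c("https", "ftp"))
{
    ## match.arg() picks "https" by default and rejects anything else
    protocol <- match.arg(protocol)

    ## gene2unigene lives in the ARCHIVE subdirectory, as in the excerpt
    ## above; every other file sits directly under gene/DATA.
    subdir <- if (fileName == "gene2unigene") {
        "gene/DATA/ARCHIVE/"
    } else {
        "gene/DATA/"
    }

    ## Assumption: the same host and paths are reachable over HTTPS.
    paste0(protocol, "://ftp.ncbi.nlm.nih.gov/", subdir, fileName)
}

.ncbiGeneUrl("gene2go.gz")
.ncbiGeneUrl("gene2unigene", protocol = "ftp")
```

With something like this, `.downloadData` could forward a user-supplied protocol instead of pasting the `ftp://` prefix directly, and existing behavior could be kept by passing `protocol = "ftp"`.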