ropensci / rnaturalearth

An R package to hold and facilitate interaction with natural earth map data :earth_africa:
http://ropensci.github.io/rnaturalearth/

ne_download URL strange/wrong #29

Closed rix133 closed 1 year ago

rix133 commented 5 years ago

So, after a clean install using the latest "rnaturalearth" on Windows 7 (R 3.5.3):

The URL seems strange to me: http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip

Command that I ran:

countries10 <- ne_download(scale = 10, type = 'countries', category = 'cultural', returnclass = "sf")

Error in utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip'
2. utils::download.file(file.path(address), zip_file <- tempfile())
1. ne_download(scale = 10, type = "countries", category = "cultural", returnclass = "sf")

Nowosad commented 5 years ago

Does the link www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip work in your browser?

andysouth commented 5 years ago

The command works from R for me and, as @Nowosad points out, it should work in the browser too (though you are right, it does look a bit strange). Must be another download issue?

rix133 commented 5 years ago

> Does the link www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip work in your browser?

Actually it does, so it can't be a firewall or some other problem, I guess. So I have no idea; I must investigate further. It fails with a clean install of R 3.6.0 as well, with the same message.

andysouth commented 5 years ago

The code saves the downloaded file to a temporary location; it might be worth checking that this works for you:

write.csv(data.frame(), tempfile())

andysouth commented 5 years ago

Try this to check connection from R:

library(curl)
curl::has_internet()

rix133 commented 5 years ago

> Try this to check connection from R: library(curl) curl::has_internet()

curl::has_internet() returns TRUE.

write.csv(data.frame(), tempfile()) gives no error.

Furthermore, other URLs work, e.g.:

utils::download.file(file.path('https://file-examples.com/wp-content/uploads/2017/02/zip_2MB.zip'), zip_file <- tempfile())

yields:

trying URL 'https://file-examples.com/wp-content/uploads/2017/02/zip_2MB.zip'
Content type 'application/zip' length 2036861 bytes (1.9 MB)
downloaded 1.9 MB

Downloading this URL using http instead of https works as well.

rix133 commented 5 years ago

UPDATE: It works if I specify the download method as libcurl, i.e.:

utils::download.file(file.path(address), zip_file <- tempfile(), method = "libcurl")

So I can set the global option for ne_download:

options("download.file.method" = "libcurl")
countries10 <- ne_download(scale = 10, type = 'countries', category = 'cultural', returnclass = "sf")

yields

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip'
Content length 283 bytes
downloaded 4.7 MB
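If you'd rather not change the option globally, a scoped alternative is to set it only for the one call. A minimal sketch, assuming the withr package is installed (it is not something ne_download itself requires):

```r
# Sketch: set download.file.method to "libcurl" only for the duration of this call
# (assumes the withr package is installed)
library(withr)

countries10 <- with_options(
  list(download.file.method = "libcurl"),
  rnaturalearth::ne_download(
    scale = 10, type = "countries",
    category = "cultural", returnclass = "sf"
  )
)
```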

Should I close the issue?

andysouth commented 5 years ago

Thanks Richard, good work finding a solution. I'll leave it open for now while I ask Twitter whether it's wise to set that option in the package.

barryrowlingson commented 5 years ago

download.file uses this code to decide on the method:

    method <- if (missing(method)) 
        getOption("download.file.method", default = "auto")
    else match.arg(method, c("auto", "internal", "libcurl", "wget", 
        "curl", "lynx"))
    if (method == "auto") {
        if (length(url) != 1L || typeof(url) != "character") 
            stop("'url' must be a length-one character vector")
        method <- if (grepl("^file:", url)) 
            "internal"
        else "libcurl"
    }

so if method isn't supplied and download.file.method isn't set, then method gets the default of "auto", and the method becomes "internal" if the URL starts with file: and "libcurl" otherwise. So if your download works with an explicit method = "libcurl" but not with a missing method argument, then the default is being picked up from somewhere. Are you sure you haven't set download.file.method to something else? Something that breaks on double slashes in a URL? That's the only odd thing in that URL... Maybe it parses the string up to the second // instead of the first?

rix133 commented 5 years ago

> Are you sure you haven't set download.file.method to something else?

So the error appears in both the RStudio console and the R console. options("download.file.method") returns NULL in the R console and "wininet" in the latest RStudio.

I looked at the method definition in R 3.5.3 (Windows) and it seems to default to "wininet" when method is "auto". Snippet from download.file:

    if (method == "auto") {
        if (length(url) != 1L || typeof(url) != "character") 
            stop("'url' must be a length-one character vector")
        method <- if (grepl("^ftps:", url) && capabilities("libcurl")) 
            "libcurl"
        else "wininet"
    }

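For reference, a quick way to see what your own session would end up with, using base R only:

```r
# What download.file() will use if no method is passed explicitly,
# and whether libcurl support is compiled in
getOption("download.file.method", default = "auto")
capabilities("libcurl")
```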
petrpajdla commented 3 years ago

Having the same issue on Linux, R version 4.0.3 with rnaturalearth version 0.1.0 and 0.2.0.

options("download.file.method")
$download.file.method
[1] "libcurl"

ne_download() returns a weird link (no problem with the internet connection, curl::has_internet() returns TRUE; the link does not work in the browser either, even without a firewall, and WordPress returns "There has been a critical error on your website.")

ne_download(scale = 10, type = 'rivers_lake_centerlines', category = 'physical', destdir = destdir, load = FALSE)

rnaturalearth version 0.1.0 returns:

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_rivers_lake_centerlines.zip'
Error in utils::download.file(file.path(address), zip_file <- tempfile()) : 
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_rivers_lake_centerlines.zip'

and rnaturalearth version 0.2.0 returns HTTP status 500:

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/raster/GRAY_HR_SR.zip'
download failed
NULL
Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/raster/GRAY_HR_SR.zip': HTTP status was '500 Internal Server Error'

Any ideas how to solve this yet?

rmgriffin commented 3 years ago

Same issue here

Grelot commented 3 years ago

Same issue with Windows 10, rnaturalearth 0.1.0, and R 4.3.

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/physical/ne_50m_coastline.zip'
Error in utils::download.file(file.path(address), zip_file <- tempfile()) : 
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/physical/ne_50m_coastline.zip'
In addition: Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  InternetOpenUrl failed: 'The operation timed out
bienflorencia commented 3 years ago

I believe this is related to 'http' being used instead of 'https' in the ne_file_name function. Is there a way to change this locally so it works?

function (scale = 110, type = "countries", category = c("cultural", 
  "physical", "raster"), full_url = FALSE) 
{
  scale <- check_scale(scale)
  category <- match.arg(category)
  if (type %in% c("countries", "map_units", "map_subunits", 
    "sovereignty", "tiny_countries", "boundary_lines_land", 
    "pacific_groupings", "breakaway_disputed_areas", "boundary_lines_disputed_areas", 
    "boundary_lines_maritime_indicator")) {
    type <- paste0("admin_0_", type)
  }
  if (type == "states") 
    type <- "admin_1_states_provinces_lakes"
  if (category == "raster") {
    file_name <- paste0(type)
  }
  else {
    file_name <- paste0("ne_", scale, "m_", type)
  }
  if (full_url) 
    file_name <- paste0("http://www.naturalearthdata.com/http//", 
      "www.naturalearthdata.com/download/", scale, "m/", 
      category, "/", file_name, ".zip")
  return(file_name)
}

Edit: I changed that in the function and it's still not working. Could it be something related to a firewall?
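For anyone who wants to try the http-to-https swap without editing and reinstalling the package, here is a rough sketch of patching it in the current session only. It assumes ne_file_name() is the exported helper shown above and that ne_download() calls it internally; it is untested, and given the 500 errors reported above the problem may be server-side rather than the URL scheme:

```r
# Sketch: force https in the URLs built by ne_file_name() for this session only.
# Assumes ne_file_name() is exported and used internally by ne_download(); untested.
library(rnaturalearth)

orig_ne_file_name <- rnaturalearth::ne_file_name

https_ne_file_name <- function(...) {
  # Build the name/URL with the original function, then swap the scheme
  sub("^http://", "https://", orig_ne_file_name(...))
}

# Replace the binding inside the package namespace for the current session
utils::assignInNamespace("ne_file_name", https_ne_file_name, ns = "rnaturalearth")
```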

jmarshallnz commented 3 years ago

It appears to be a new issue on the website. If you go directly to the website and browse, you get the same link as in the R package (though https:// rather than http://). If you click the link it invokes an onclick event that goes through urchinTracker. If you copy the link into a browser it fails with a WordPress problem. Possibly some issue with redirection?

Thus, the ne_download() function is currently broken in the R package. It seems that the Natural Earth folks have been made aware of this; see here: https://github.com/nvkelso/natural-earth-vector/issues/528

nvkelso commented 3 years ago

This should be fixed now.

dlebauer commented 3 years ago

I just ran into this error with R 4.0.2, rnaturalearth 0.1.0 (https://github.com/ropensci/rnaturalearth/commit/543e3cbc2c913724ed66e742f0b8c38828ef1002, the current version in the repository), and Windows 10.

urban_areas <- rnaturalearth::ne_download(scale = 'large', type = 'urban_areas', returnclass = 'sf')
#> Warning in utils::download.file(file.path(address), zip_file <- tempfile()):
#> InternetOpenUrl failed: 'The server name or address could not be resolved'
#> Error in utils::download.file(file.path(address), zip_file <- tempfile()): cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_urban_areas.zip'

Created on 2021-09-24 by the reprex package (v2.0.0)

Session info

``` r
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 4.0.2 (2020-06-22)
#>  os       Windows 10 x64
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United States.1252
#>  ctype    English_United States.1252
#>  tz       America/Phoenix
#>  date     2021-09-24
#>
#> - Packages -------------------------------------------------------------------
#>  package       * version date       lib source
#>  assertthat      0.2.1   2019-03-21 [1] CRAN (R 4.0.2)
#>  backports       1.2.1   2020-12-09 [1] CRAN (R 4.0.3)
#>  class           7.3-17  2020-04-26 [2] CRAN (R 4.0.2)
#>  classInt        0.4-3   2020-04-07 [1] CRAN (R 4.0.2)
#>  cli             2.3.1   2021-02-23 [1] CRAN (R 4.0.4)
#>  crayon          1.4.1   2021-02-08 [1] CRAN (R 4.0.2)
#>  DBI             1.1.1   2021-01-15 [1] CRAN (R 4.0.3)
#>  digest          0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  dplyr           1.0.4   2021-02-02 [1] CRAN (R 4.0.3)
#>  e1071           1.7-6   2021-03-18 [1] CRAN (R 4.0.4)
#>  ellipsis        0.3.1   2020-05-15 [1] CRAN (R 4.0.2)
#>  evaluate        0.14    2019-05-28 [1] CRAN (R 4.0.2)
#>  fansi           0.4.2   2021-01-15 [1] CRAN (R 4.0.3)
#>  fs              1.5.0   2020-07-31 [1] CRAN (R 4.0.3)
#>  generics        0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  glue            1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  highr           0.8     2019-03-20 [1] CRAN (R 4.0.2)
#>  htmltools       0.5.1.1 2021-01-22 [1] CRAN (R 4.0.3)
#>  KernSmooth      2.23-17 2020-04-26 [2] CRAN (R 4.0.2)
#>  knitr           1.31    2021-01-27 [1] CRAN (R 4.0.3)
#>  lattice         0.20-41 2020-04-02 [2] CRAN (R 4.0.2)
#>  lifecycle       1.0.0   2021-02-15 [1] CRAN (R 4.0.4)
#>  magrittr        2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
#>  pillar          1.5.1   2021-03-05 [1] CRAN (R 4.0.4)
#>  pkgconfig       2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
#>  proxy           0.4-25  2021-03-05 [1] CRAN (R 4.0.4)
#>  purrr           0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#>  R6              2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
#>  Rcpp            1.0.7   2021-07-07 [1] CRAN (R 4.0.5)
#>  reprex          2.0.0   2021-04-02 [1] CRAN (R 4.0.5)
#>  rlang           0.4.10  2020-12-30 [1] CRAN (R 4.0.3)
#>  rmarkdown       2.7     2021-02-19 [1] CRAN (R 4.0.4)
#>  rnaturalearth   0.1.0   2017-03-21 [1] CRAN (R 4.0.5)
#>  sessioninfo     1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
#>  sf              0.9-7   2021-01-06 [1] CRAN (R 4.0.4)
#>  sp              1.4-5   2021-01-10 [1] CRAN (R 4.0.3)
#>  stringi         1.5.3   2020-09-09 [1] CRAN (R 4.0.3)
#>  stringr         1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
#>  styler          1.3.2   2020-02-23 [1] CRAN (R 4.0.2)
#>  tibble          3.1.0   2021-02-25 [1] CRAN (R 4.0.4)
#>  tidyselect      1.1.0   2020-05-11 [1] CRAN (R 4.0.2)
#>  units           0.7-1   2021-03-16 [1] CRAN (R 4.0.4)
#>  utf8            1.2.1   2021-03-12 [1] CRAN (R 4.0.5)
#>  vctrs           0.3.6   2020-12-17 [1] CRAN (R 4.0.3)
#>  withr           2.4.1   2021-01-26 [1] CRAN (R 4.0.3)
#>  xfun            0.20    2021-01-06 [1] CRAN (R 4.0.3)
#>  yaml            2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#>
#> [1] C:/Users/David/Documents/lib/R
#> [2] C:/Program Files/R/R-4.0.2/library
```

nvkelso commented 3 years ago

There's a GIST showing where to find the files on S3.

dlebauer commented 3 years ago

@nvkelso should the package be updated to use the new URLs? Or are the currently used URLs expected to come back online?

nvkelso commented 3 years ago

For CI and build systems you should switch to the direct S3 URLs, as AWS now sponsors hosting via their public data program.
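As a rough illustration, downloading a single layer straight from S3 could look like the sketch below. The bucket/path pattern is an assumption based on the gist mentioned above (naturalearth.s3.amazonaws.com/&lt;scale&gt;_&lt;category&gt;/&lt;file&gt;.zip), so check the gist for the authoritative URLs:

```r
# Sketch: fetch one layer directly from the S3 mirror, bypassing ne_download().
# The URL pattern here is assumed from the gist; verify it before relying on it.
url <- "https://naturalearth.s3.amazonaws.com/10m_cultural/ne_10m_admin_0_countries.zip"
zip_file <- tempfile(fileext = ".zip")
utils::download.file(url, zip_file, mode = "wb", method = "libcurl")
unzip(zip_file, exdir = tempdir())
countries10 <- sf::st_read(file.path(tempdir(), "ne_10m_admin_0_countries.shp"))
```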


nmarchio commented 2 years ago

Encountering the same bug and I believe there is still an issue with the URLs on the website. I am able to manually download from the website, but this could be related to the onclick event mentioned earlier in the thread.

This code:

ocean <- ne_download(type = 'ocean', scale = 'large', category = 'physical', returnclass='sf')

Returns:

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_ocean.zip'
Error in utils::download.file(file.path(address), zip_file <- tempfile()) : 
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_ocean.zip'
In addition: Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_ocean.zip': HTTP status was '500 Internal Server Error'

Further, when I try to download directly using the following code (which works for other files):

filedir <- paste0(tempdir())
unlink(filedir, recursive = TRUE)
dir.create(filedir)
ocean_shp <- paste0('https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip')
download.file(url = ocean_shp, destfile = paste0(filedir, basename(ocean_shp)))
unzip(paste0(filedir,basename(ocean_shp)), exdir= filedir)
list.files(path = filedir)

I get a similar error:

trying URL 'https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip'
Error in download.file(url = ocean_shp, destfile = paste0(filedir, basename(ocean_shp))) : 
  cannot open URL 'https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip'
In addition: Warning message:
In download.file(url = ocean_shp, destfile = paste0(filedir, basename(ocean_shp))) :
  cannot open URL 'https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip': HTTP status was '500 Internal Server Error'
nvkelso commented 2 years ago

I suspect this was during a rare maintenance window on the Natural Earth server. 500s are server errors. Downloading that link works for me today.

If you switch over to the S3 URLs then that is much less likely to affect you.

jbenjamin-rms commented 1 year ago

This issue has come up for me today, after years of using this package without issues, both on Windows and Linux. I have posted the Windows session information below.

R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
rnaturalearth_0.1.0

If I try to use the ne_download function for any file (and even using the defaults) I get the following errors:

ne_download()
trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip'
Error in utils::download.file(file.path(address), zip_file <- tempfile()) : 
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip'
In addition: Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'https://www.naturalearthdata.com/http/www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip': HTTP status was '404 Not Found'
nvkelso commented 1 year ago

I'm not sure whether or why this changed, but note the http/ versus http// in the URL path.

In any event, please switch over to the S3 links.

jbenjamin-rms commented 1 year ago

I did note that. I can switch over to using the S3 links, that's no problem; I was just wondering whether the functions in the package will be updated to reflect these changes, as they offer a more convenient, streamlined way of downloading and using the data for my use case. Thank you!

jaum20 commented 1 year ago

More than 3 years have passed and this bug still exists:

urban = try(rnaturalearth::ne_download(scale = 'medium', type = 'urban_areas', category = 'cultural'), silent = TRUE)
trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/cultural/ne_50m_urban_areas.zip'
Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'https://www.naturalearthdata.com/http/www.naturalearthdata.com/download/50m/cultural/ne_50m_urban_areas.zip': HTTP status was '404 Not Found'
PMassicotte commented 1 year ago

At some point, you should update packages...
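For example, something like this (the CRAN line is usually enough; the GitHub line is only needed for the development version and assumes the remotes package is installed):

```r
# Update to the current CRAN release of rnaturalearth
install.packages("rnaturalearth")

# Or install the development version from GitHub (assumes remotes is installed)
remotes::install_github("ropensci/rnaturalearth")
```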

PMassicotte commented 1 year ago
rnaturalearth::ne_download(scale = "medium", type = "urban_areas", category = "cultural", returnclass = "sf")
#> Simple feature collection with 2143 features and 4 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -157.984 ymin: -46.26844 xmax: 174.97 ymax: 69.35127
#> Geodetic CRS:  WGS 84
#> # A tibble: 2,143 × 5
#>    scalerank featurecla area_sqkm min_zoom                              geometry
#>        <dbl> <chr>          <dbl>    <dbl>                         <POLYGON [°]>
#>  1         3 Urban area    1003.       3.7 ((-121.3788 38.39169, -121.3788 38.3…
#>  2         5 Urban area     165.       5   ((-122.8139 38.506, -122.8139 38.506…
#>  3         5 Urban area      90.6      5   ((-122.1707 38.08574, -122.1707 38.0…
#>  4         2 Urban area    2538.       3.6 ((-122.4463 37.57833, -122.4463 37.5…
#>  5         5 Urban area     514.       5   ((-121.2264 37.88368, -121.2264 37.8…
#>  6         5 Urban area     131.       5   ((-122.5233 38.02747, -122.5233 38.0…
#>  7         5 Urban area     258.       5   ((-121.7905 37.7324, -121.7905 37.73…
#>  8         5 Urban area     322.       5   ((-120.9724 37.75669, -120.9724 37.7…
#>  9         4 Urban area     401.       4   ((-119.7376 36.88969, -119.7376 36.8…
#> 10         6 Urban area      98.6      6   ((-121.6897 36.74055, -121.6897 36.7…
#> # … with 2,133 more rows

Created on 2023-01-30 with reprex v2.0.2