rOpenGov / giscoR

Download geospatial data from GISCO API - Eurostat
https://ropengov.github.io/giscoR/
GNU General Public License v3.0

URL not reachable (GISCO API) #57

Closed martinhulenyi closed 1 year ago

martinhulenyi commented 1 year ago

Since Friday, getting nuts2 geodata has not worked for me. Every time I try to run nuts <- gisco_get_nuts(year = 2021, nuts_level = "3", resolution = "01", cache = TRUE, update_cache = TRUE) I get the error https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_01M_2021_4326_LEVL_3.geojson not reachable.

But when I type the URL in a browser, I get the GeoJSON file.

dieghernan commented 1 year ago

Hi @martinhulenyi

Thanks for reporting. I am not sure of the cause of the issue. I think that GISCO has changed something so now the download of data is not working as expected any more. Not sure how to fix that at the moment.

For testing, note that I tried to download the file using different R methods outside giscoR (base R, curl and httr) with no luck, so as I said the problem is not in the package itself but on the GISCO side.

I'll track this over the next week. It is not the first time this kind of thing has happened with GISCO, and they usually fix it after a week or so.

See (failing) example:

url <- "https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_01M_2021_4326_LEVL_3.geojson"
temp_geosjson <- tempfile(fileext = ".geojson")

# Base R
download.file(url, temp_geosjson)
#> Warning in download.file(url, temp_geosjson): downloaded length 0 != reported
#> length 0
#> Warning in download.file(url, temp_geosjson): URL
#> 'https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_01M_2021_4326_LEVL_3.geojson':
#> status was 'Transferred a partial file'
#> Error in download.file(url, temp_geosjson): download from 'https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_01M_2021_4326_LEVL_3.geojson' failed

# {curl}
curl::curl_download(url, temp_geosjson)
#> Error in curl::curl_download(url, temp_geosjson): transfer closed with 15853199 bytes remaining to read

# {httr}
response <- httr::GET(url)
#> Error in curl::curl_fetch_memory(url, handle = handle): transfer closed with 15615326 bytes remaining to read

martinhulenyi commented 1 year ago

Thank you very much :)

martinhulenyi commented 1 year ago

By the way, I tried several things and, interestingly, the command works if I run nuts <- gisco_get_nuts(year = 2016), but not if I further specify the arguments as in the command above, or if I try the same line with a different NUTS version (2021 or 2010).

dieghernan commented 1 year ago

Trying to contact GISCO https://twitter.com/dhernangomez/status/1669430941004058643?s=46&t=taYTjopuei1kln-JzvyfSw

After some research, it may be that the Content-Length declared by the API (the size of the file) differs from the actual size. curl would then conclude that the file was only partially transferred, or even stop before the download completes.

Related:

CURLE_PARTIAL_FILE (18)

A file transfer was shorter or larger than expected. This happens when the server first reports an expected transfer size, and then delivers data that doesn't match the previously given size.
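
To make the suspected mismatch concrete, here is a minimal base R sketch that compares a declared Content-Length with the number of bytes actually written to disk. The helper name transfer_is_complete() is hypothetical (it is not part of giscoR or curl), and the sizes are made up to simulate a truncated transfer.

```r
# Hypothetical helper (not part of giscoR or curl): compare the size the
# server declared in Content-Length with the bytes actually written to disk.
transfer_is_complete <- function(declared_length, path) {
  declared <- suppressWarnings(as.numeric(declared_length))
  if (length(declared) != 1 || is.na(declared)) {
    # No usable Content-Length value: nothing to compare against
    return(NA)
  }
  isTRUE(file.size(path) == declared)
}

# Simulate a truncated transfer: 10 bytes arrive on disk
partial_file <- tempfile(fileext = ".geojson")
writeBin(as.raw(rep(0x20, 10)), partial_file)

complete_result <- transfer_is_complete("10", partial_file) # header matches
partial_result  <- transfer_is_complete("25", partial_file) # header promised more
unknown_result  <- transfer_is_complete(NULL, partial_file) # header missing
```

With a real response, the declared value would come from the Content-Length response header, for example via httr::headers(httr::HEAD(url))[["content-length"]], assuming the server sends that header at all. A disagreement between the two numbers is the situation the CURLE_PARTIAL_FILE description above refers to.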

dieghernan commented 1 year ago

By the way, I tried several things and, interestingly, the command works if I run nuts <- gisco_get_nuts(year = 2016), but not if I further specify the arguments as in the command above, or if I try the same line with a different NUTS version (2021 or 2010).

Yep, for 2016 it uses a local copy included in the package, unless cache = FALSE or update_cache = TRUE.

https://github.com/rOpenGov/giscoR/blob/c9b63ca73687af889482401067061f8fef6a6c2c/R/gisco_get_nuts.R#L108-L121
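
As a rough illustration of that rule, the sketch below encodes it as a small predicate. This is not the giscoR source: uses_bundled_copy() is a hypothetical name, the default values are assumptions, and the real function at the linked lines may check further arguments (such as resolution) before falling back to the bundled data.

```r
# Hypothetical predicate (not giscoR code): TRUE when, per the explanation
# above, the copy bundled with the package would be used instead of the API.
# The defaults cache = TRUE and update_cache = FALSE are assumptions.
uses_bundled_copy <- function(year, cache = TRUE, update_cache = FALSE) {
  as.character(year) == "2016" && isTRUE(cache) && !isTRUE(update_cache)
}

# Plain 2016 call: served from the bundled copy, so no download is needed
works_offline <- uses_bundled_copy(2016)

# Same year, but a refresh is forced: the API is contacted
forced_refresh <- uses_bundled_copy(2016, update_cache = TRUE)

# The call from the original report: 2021 with update_cache = TRUE
reported_call <- uses_bundled_copy(2021, cache = TRUE, update_cache = TRUE)
```

Under these assumptions only the first call avoids the network, which would explain why gisco_get_nuts(year = 2016) kept working while the 2021 and 2010 requests failed.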

dieghernan commented 1 year ago

Ping @jgaffuri and @joewdavies (apologies if I am mistaken) as they seem to be very active on the GISCO/Geo projects of @eurostat

hannesaddec commented 1 year ago

we will investigate

hannesaddec commented 1 year ago

We fixed some issues. Let us know if giscoR works again for you. The curl call above is still not stable and is under further investigation.

dieghernan commented 1 year ago

Hi @hannesaddec, thanks, but it is still erroring. Some users also suggest that increasing the timeout of the GET request to your API may help (just in case this information is of interest).
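
For anyone who wants to try the timeout suggestion on the client side, a minimal sketch follows. The wrapper download_with_timeout() is hypothetical (not part of giscoR), and nothing in this thread confirms that a longer timeout resolves the error. It relies on base R's download.file() honoring the "timeout" option, which is expressed in seconds.

```r
# Hypothetical wrapper (not part of giscoR): run a download with a longer
# timeout, then restore the previous setting.
download_with_timeout <- function(url, destfile, seconds = 300,
                                  downloader = utils::download.file) {
  # download.file() reads the "timeout" option (in seconds; R's default is 60)
  old <- options(timeout = seconds)
  on.exit(options(old), add = TRUE)
  downloader(url, destfile)
}

# Offline demonstration with a stand-in downloader that only records the
# timeout it would run under; omit `downloader` to use utils::download.file().
seen_timeout <- NULL
fake_downloader <- function(url, destfile) {
  seen_timeout <<- getOption("timeout")
  invisible(0L)
}

timeout_before <- getOption("timeout")
download_with_timeout("https://example.org/file.geojson", tempfile(),
                      seconds = 300, downloader = fake_downloader)
timeout_after <- getOption("timeout")
```

The equivalent for the {httr} call used earlier in this thread would be httr::GET(url, httr::timeout(300)).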

dieghernan commented 1 year ago

Hi @hannesaddec

I re-checked and can see that the previously broken example now works (cc @dominicroye). According to the package checks, which are quite extensive, the only URL not working right now is the one corresponding to Postal Codes.

See new run:

url <- "https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_01M_2021_4326_LEVL_3.geojson"
temp_geosjson <- tempfile(fileext = ".geojson")

# Base R
download.file(url, temp_geosjson)

head(sf::read_sf(temp_geosjson))
#> Simple feature collection with 6 features and 10 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -9.082954 ymin: 45.90159 xmax: 33.28728 ymax: 80.83402
#> Geodetic CRS:  WGS 84
#> # A tibble: 6 × 11
#>   id    NUTS_ID LEVL_CODE CNTR_CODE NAME_LATN     NUTS_NAME MOUNT_TYPE URBN_TYPE
#>   <chr> <chr>       <int> <chr>     <chr>         <chr>          <int>     <int>
#> 1 NO0B2 NO0B2           3 NO        Svalbard      Svalbard           3         3
#> 2 NO0B1 NO0B1           3 NO        Jan Mayen     Jan Mayen          3         3
#> 3 HR064 HR064           3 HR        Krapinsko-za… Krapinsk…          4         3
#> 4 DE21A DE21A           3 DE        Erding        Erding             4         3
#> 5 DE94E DE94E           3 DE        Osnabrück, L… Osnabrüc…          4         2
#> 6 DE94F DE94F           3 DE        Vechta        Vechta             4         2
#> # ℹ 3 more variables: COAST_TYPE <int>, FID <chr>, geometry <MULTIPOLYGON [°]>

# {curl}
# Delete previous temp file
unlink(temp_geosjson)
curl::curl_download(url, temp_geosjson)

head(sf::read_sf(temp_geosjson))
#> Simple feature collection with 6 features and 10 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -9.082954 ymin: 45.90159 xmax: 33.28728 ymax: 80.83402
#> Geodetic CRS:  WGS 84
#> # A tibble: 6 × 11
#>   id    NUTS_ID LEVL_CODE CNTR_CODE NAME_LATN     NUTS_NAME MOUNT_TYPE URBN_TYPE
#>   <chr> <chr>       <int> <chr>     <chr>         <chr>          <int>     <int>
#> 1 NO0B2 NO0B2           3 NO        Svalbard      Svalbard           3         3
#> 2 NO0B1 NO0B1           3 NO        Jan Mayen     Jan Mayen          3         3
#> 3 HR064 HR064           3 HR        Krapinsko-za… Krapinsk…          4         3
#> 4 DE21A DE21A           3 DE        Erding        Erding             4         3
#> 5 DE94E DE94E           3 DE        Osnabrück, L… Osnabrüc…          4         2
#> 6 DE94F DE94F           3 DE        Vechta        Vechta             4         2
#> # ℹ 3 more variables: COAST_TYPE <int>, FID <chr>, geometry <MULTIPOLYGON [°]>

# {httr}
response <- httr::GET(url)

as_text <- httr::content(response, as = "text")
#> No encoding supplied: defaulting to UTF-8.
writeLines(as_text, "test.geojson")

head(sf::read_sf("test.geojson"))
#> Simple feature collection with 6 features and 10 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -9.082954 ymin: 45.90159 xmax: 33.28728 ymax: 80.83402
#> Geodetic CRS:  WGS 84
#> # A tibble: 6 × 11
#>   id    NUTS_ID LEVL_CODE CNTR_CODE NAME_LATN     NUTS_NAME MOUNT_TYPE URBN_TYPE
#>   <chr> <chr>       <int> <chr>     <chr>         <chr>          <int>     <int>
#> 1 NO0B2 NO0B2           3 NO        Svalbard      Svalbard           3         3
#> 2 NO0B1 NO0B1           3 NO        Jan Mayen     Jan Mayen          3         3
#> 3 HR064 HR064           3 HR        Krapinsko-za… Krapinsk…          4         3
#> 4 DE21A DE21A           3 DE        Erding        Erding             4         3
#> 5 DE94E DE94E           3 DE        Osnabrück, L… Osnabrüc…          4         2
#> 6 DE94F DE94F           3 DE        Vechta        Vechta             4         2
#> # ℹ 3 more variables: COAST_TYPE <int>, FID <chr>, geometry <MULTIPOLYGON [°]>

Created on 2023-06-27 with reprex v2.0.2

hannesaddec commented 1 year ago

Can you test again? Postal codes should be fixed now as well.

dieghernan commented 1 year ago

Hi @hannesaddec

It all seems to be fixed; see https://github.com/rOpenGov/giscoR/actions/runs/5391769159

Many thanks. As a final check, @martinhulenyi can you confirm as well?

martinhulenyi commented 1 year ago

Yes, works fine for me :) Thank you for the correction. Thank you also for creating and maintaining this package, I really enjoy using it.

dieghernan commented 1 year ago

Closed, thanks @hannesaddec & team