mapme-initiative / mapme.biodiversity

Efficient analysis of spatial biodiversity datasets for global portfolios
https://mapme-initiative.github.io/mapme.biodiversity/dev
GNU General Public License v3.0

ReadBlock errors when trying to fetch nelson_et_al resource #308

Closed goergen95 closed 4 months ago

goergen95 commented 4 months ago

Hi, I get the same error when running get_nelson_et_al and calc_traveltime. Here is the error message I get when running this code:

library(sf)
library(mapme.biodiversity)
outdir <- file.path(tempdir(), "mapme-data")
dir.create(outdir, showWarnings = FALSE)

mapme_options(
  outdir = outdir,
  verbose = FALSE
)

aoi <- system.file("extdata", "sierra_de_neiba_478140_2.gpkg",
                   package = "mapme.biodiversity"
) %>%
  read_sf() %>%
  get_resources(get_nelson_et_al())
Error in sf::gdal_utils(util = util, source = source, destination = destination,  : 
  gdal_utils translate: an error occured
Warning messages:
1: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 1: TIFFFillStrip:Read error at scanline 9930; got 0 bytes, expected 19850
2: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 1: TIFFReadEncodedStrip() failed.
3: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 1: /vsicurl/https://figshare.com/ndownloader/files/14189831, band 1: IReadBlock failed at X offset 0, Y offset 9931: TIFFReadEncodedStrip() failed.

Originally posted by @duboisl-afd in https://github.com/mapme-initiative/mapme.biodiversity/issues/302#issuecomment-2243200473

goergen95 commented 4 months ago

Thanks for reporting the issue with this resource. Indeed, I can reproduce this about 50% of the time; the other half of the time it runs successfully.

Let's try to be explicit here and specify which traveltime range we are requesting. With the following code, which should request the same data as in your example

library(sf)
library(mapme.biodiversity)
outdir <- file.path(tempdir(), "mapme-data")
dir.create(outdir, showWarnings = FALSE)

mapme_options(
  outdir = outdir,
  verbose = FALSE
)

aoi <- system.file("extdata", "sierra_de_neiba_478140_2.gpkg",
                   package = "mapme.biodiversity"
) %>%
  read_sf() %>%
  get_resources(get_nelson_et_al(ranges = "20k_50k")) 

I get:

Error in sf::gdal_utils(util = util, source = source, destination = destination,  : 
  gdal_utils translate: an error occured
Warning messages:
1: TIFFFillStrip:Read error at scanline 10820; got 0 bytes, expected 20710 (GDAL error 1) 
2: TIFFReadEncodedStrip() failed. (GDAL error 1) 
3: 14189831, band 1: IReadBlock failed at X offset 0, Y offset 10821: TIFFReadEncodedStrip() failed. (GDAL error 1) 

So the error occurs at a different position. I also get different positions when I re-run the same code multiple times. Let's check if the file is actually corrupted when we download it first.

fps <- mapme.biodiversity:::.get_traveltime_url(range = "20k_50k", filenames = file.path(outdir, "traveltime_20k_50k.tif"))
print(fps$source)
download.file(gsub("/vsicurl/", "", fps$source), fps$filename)
system(sprintf("gdalinfo -checksum %s", fps$filename))
[1] "/vsicurl/https://figshare.com/ndownloader/files/14189831"

Driver: GTiff/GeoTIFF
Files: /tmp/RtmpU3C7kl/mapme-data/traveltime_20k_50k.tif
Size is 43200, 17400
Coordinate System is:
GEOGCRS["WGS 84",
    ENSEMBLE["World Geodetic System 1984 ensemble",
        MEMBER["World Geodetic System 1984 (Transit)"],
        MEMBER["World Geodetic System 1984 (G730)"],
        MEMBER["World Geodetic System 1984 (G873)"],
        MEMBER["World Geodetic System 1984 (G1150)"],
        MEMBER["World Geodetic System 1984 (G1674)"],
        MEMBER["World Geodetic System 1984 (G1762)"],
        MEMBER["World Geodetic System 1984 (G2139)"],
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]],
        ENSEMBLEACCURACY[2.0]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    USAGE[
        SCOPE["Horizontal component of 3D system."],
        AREA["World."],
        BBOX[-90,-180,90,180]],
    ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
Origin = (-180.000000000000000,85.000000000000000)
Pixel Size = (0.008333333333000,-0.008333333333000)
Metadata:
  DataType=Generic
  AREA_OR_POINT=Area
Image Structure Metadata:
  COMPRESSION=LZW
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (-180.0000000,  85.0000000) (180d 0' 0.00"W, 85d 0' 0.00"N)
Lower Left  (-180.0000000, -60.0000000) (180d 0' 0.00"W, 60d 0' 0.00"S)
Upper Right ( 180.0000000,  85.0000000) (180d 0' 0.00"E, 85d 0' 0.00"N)
Lower Right ( 180.0000000, -60.0000000) (180d 0' 0.00"E, 60d 0' 0.00"S)
Center      (  -0.0000000,  12.5000000) (  0d 0' 0.00"W, 12d30' 0.00"N)
Band 1 Block=43200x1 Type=UInt16, ColorInterp=Gray
  Checksum=43352
  NoData Value=-1
  Metadata:
    RepresentationType=THEMATIC

So the file checks out OK, without any block errors. However, while troubleshooting this, I get the following error message after a few attempts at accessing the data:

Warning 1: HTTP response code on https://figshare.com/ndownloader/files/14189831.xml: 429
ERROR 11: HTTP response code: 429

So my current best guess is that the source server throttles or cuts the connection while we are fetching the data (HTTP 429 means "Too Many Requests"), and that GDAL does not handle this properly. It is, however, hard to reproduce, so I am hesitant to open an issue with GDAL just yet.

A mitigation that eventually gets the data downloaded is to adjust GDAL's networking options. Specifically, I was able to download the data after setting GDAL_HTTP_MAX_RETRY and GDAL_HTTP_RETRY_DELAY (both documented in GDAL's list of configuration options).
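
The same retry-and-wait idea can also be sketched at the R level. This is only an illustration of the concept, not what GDAL does internally and not part of mapme.biodiversity: `with_retry()` is a hypothetical helper, and its `max_retry` and `delay` arguments are merely named after the two GDAL options above.

```r
# Hypothetical helper (an assumption for illustration, not a package function):
# call `fun` and, if it throws an error, wait `delay` seconds and try again,
# up to `max_retry` additional times.
with_retry <- function(fun, max_retry = 5, delay = 15) {
  stopifnot(is.function(fun), max_retry >= 0, delay >= 0)
  last_error <- NULL
  for (attempt in seq_len(max_retry + 1)) {
    result <- tryCatch(
      list(ok = TRUE, value = fun()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) {
      return(result$value)
    }
    last_error <- result$error
    # Only wait if there is another attempt left.
    if (attempt <= max_retry) {
      message(sprintf("Attempt %d failed: %s", attempt, conditionMessage(last_error)))
      Sys.sleep(delay)
    }
  }
  # All attempts failed: re-signal the last error.
  stop(last_error)
}

# Possible usage (not run here, requires network access):
# aoi <- with_retry(function() {
#   get_resources(aoi, get_nelson_et_al(ranges = "20k_50k"))
# })
```

Whether a failed attempt leaves a partial file behind in the output directory is not something this sketch checks, so it may be necessary to clean up between attempts.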

I will try to investigate further, but I am not sure I will find a general solution to this issue.

goergen95 commented 4 months ago

FYI, @duboisl-afd

goergen95 commented 4 months ago

Could you please try setting the following options:

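Sys.setenv(
  # cache data read through GDAL's virtual file system
  "VSI_CACHE" = "TRUE",
  # fetch remote data in 10 MiB chunks (value is in bytes)
  "CPL_VSIL_CURL_CHUNK_SIZE" = "10485760",
  # retry failed HTTP requests up to 5 times
  "GDAL_HTTP_MAX_RETRY" = "5",
  # wait 15 seconds between retries
  "GDAL_HTTP_RETRY_DELAY" = "15"
)

karpfen commented 4 months ago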

Hi @goergen95, FYI I tried reproducing the error as well. When downloading the 20k_50k resource, I get this error:

Error in sf::gdal_utils(util = util, source = source, destination = destination,  : 
  gdal_utils translate: an error occured
Warning messages:
1: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 1: TIFFFillStrip:Read error at scanline 2299; got 0 bytes, expected 73953
2: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 1: TIFFReadEncodedStrip() failed.
3: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 1: /vsicurl/https://figshare.com/ndownloader/files/14189831, band 1: IReadBlock failed at X offset 0, Y offset 2300: TIFFReadEncodedStrip() failed.

After setting the environment variables you posted above, it went through fine (not sure how to check this properly, but running calc_traveltime() afterwards returned a result).
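
One possible way to check the downloaded file more directly is to repeat the checksum test used earlier in this thread on whatever ended up in the output directory. The sketch below assumes the fetched GeoTIFFs are stored somewhere below `outdir`; the exact sub-directory and file names are not known from this thread, so it simply searches for all `.tif` files.

```r
# Same output directory as in the examples above.
outdir <- file.path(tempdir(), "mapme-data")

# Assumption: the resource was written as one or more .tif files below outdir.
tifs <- list.files(outdir, pattern = "\\.tif$", recursive = TRUE, full.names = TRUE)

for (tif in tifs) {
  # "-checksum" makes GDAL read every block of every band, so a truncated
  # or corrupted file should surface as a read error at this point.
  info <- sf::gdal_utils(util = "info", source = tif, options = "-checksum", quiet = TRUE)
  # Print only the checksum line(s) of the gdalinfo report.
  print(grep("Checksum=", strsplit(info, "\n")[[1]], value = TRUE))
}
```

If this runs through without block errors, the file could at least be read completely. The checksum reported above for the plain download only applies for comparison if the stored file contains identical pixel data.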

duboisl-afd commented 4 months ago

After setting the environment variables it's working for me as well! Thanks :)

goergen95 commented 4 months ago

Thanks!