Closed dimfalk closed 2 years ago
Below is what GDAL sees (with "GDALinfo"). It suggests that this file is not organized as a typical geospatial raster file.
describe("hd2011010000.scu")
# [1] "Driver: HDF5Image/HDF5 Dataset"
# [2] "Files: hd2011010000.scu"
# [3] "Size is 536, 536"
# [4] "Metadata:"
# [5] " dataset_DXk_what_elev_[deg]=0.5 "
# [6] " dataset_DXk_what_enddate_[YYYYMMDD]=20201101"
# [7] " dataset_DXk_what_endtime_[HHmmss]=000000"
# [8] " dataset_DXk_what_gain=1 "
# [9] " dataset_DXk_what_nodata=-999 "
#[10] " dataset_DXk_what_ori_format=DX"
#[11] " dataset_DXk_what_ori_name=HD2011010000.scu"
#[12] " dataset_DXk_what_product=COMP "
#[13] " dataset_DXk_what_quantity=RATE "
#[14] " dataset_DXk_what_rad=1 "
#[15] " dataset_DXk_where_xsize=536 "
#[16] " dataset_DXk_where_ysize=536 "
#[17] " how_a1gate=1512001264 "
#[18] " how_adjustment=1 "
#[19] " how_angles_[deg]=0.5 "
#[20] " how_doppler=4 "
#[21] " how_maxlevel_[dBZ]=65 "
#[22] " how_minlevel_[dBZ]=0 "
#[23] " how_nodes=ESS,NHB,FLD,"
#[24] " how_nodesn=3 "
#[25] " how_place=fld"
#[26] " how_resolution_[dBZ]=0.5 "
#[27] " how_wavelength_[cm]=5 "
#[28] " how_WMO=121 "
#[29] " how_zr_a=256 "
#[30] " how_zr_b=1.42 "
#[31] " what_date_[YYYYMMDD]=20201101\\"
#[32] " what_object=IMAGE"
#[33] " what_sets=1 "
#[34] " what_time_[HHmmss]=000000\\"
#[35] " what_version_[H5Vol_0.2]=SCP_BIG 5.0.11.24 "
#[36] " where_poserror="
#[37] " where_projdef=UTM_N32"
#[38] " where_range=128 "
#[39] " where_start_lat_[deg]=5943 "
#[40] " where_start_lon_[deg]=195 "
#[41] " where_UL_ipixel=195 "
#[42] " where_UL_jpixel=5943 "
#[43] " where_xscale_[m]=1000 "
#[44] " where_xsize=536 "
#[45] " where_yscale_[m]=1000 "
#[46] " where_ysize=536 "
#[47] "Corner Coordinates:"
#[48] "Upper Left ( 0.0, 0.0)"
#[49] "Lower Left ( 0.0, 536.0)"
#[50] "Upper Right ( 536.0, 0.0)"
#[51] "Lower Right ( 536.0, 536.0)"
#[52] "Center ( 268.0, 268.0)"
#[53] "Band 1 Block=64x72 Type=Float32, ColorInterp=Undefined"
#[54] " Metadata:"
#[55] " dataset_DXk_image_CLASS=IMAGE"
#[56] " dataset_DXk_image_IMAGE_BACKGROUNDINDEX=0 "
#[57] " dataset_DXk_image_IMAGE_COLORMODEL=RGB"
#[58] " dataset_DXk_image_IMAGE_MINMAXRANGE=0 0 "
#[59] " dataset_DXk_image_IMAGE_SUBCLASS=BITMAP"
#[60] " dataset_DXk_image_IMAGE_VERSION=1.2"
#[61] " dataset_DXk_image_PALETTE="
I suspected that the GDAL HDF5 driver made a wrong assumption, and thus failed when reading the file. However, the GDAL command line tool gdal_translate works, as this creates a valid GTiff
gdal_translate -of GTiff hd2011010000.scu test.tif
So that suggests that terra makes a wrong assumption. I need to investigate that.
Thank your for your fast reply! I was able to reproduce this behaviour with GDAL 3.0.4 - thanks for the hint using gdalinfo
and gdal_translate
! Let me know if you need further information on the data.
By the way, stars::read_stars("hd2011010000.scu")
fails with the same error message.
By the way, stars::read_stars("hd2011010000.scu") fails with the same error message.
I see no errors, and
library(stars)
# Loading required package: abind
# Loading required package: sf
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.0; sf_use_s2() is TRUE
(r = read_stars("hd2011010000.scu"))
# stars object with 2 dimensions and 1 attribute
# attribute(s):
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# hd2011010000.scu -999 -999 -999 -587.3865 0 15
# dimension(s):
# from to offset delta refsys point values x/y
# x 1 536 0 1 NA NA NULL [x]
# y 1 536 536 -1 NA NA NULL [y]
r[r == -999] = NA
plot(r)
gives
The file seems to lack a missing value flag, and the array is not georeferenced, as you noticed.
This is what I see with CRAN-stars:
library(stars)
#Loading required package: abind
#Loading required package: sf
#Linking to GEOS 3.9.1, GDAL 3.3.2, PROJ 7.2.1; sf_use_s2() is TRUE
(r = read_stars("hd2011010000.scu"))
#Error in CPL_read_gdal(as.character(x), as.character(options), as.character(driver), :
# read failure
#In addition: Warning messages:
#1: In CPL_read_gdal(as.character(x), as.character(options), as.character(driver), :
# GDAL Error 1: H5Dread() failed for block.
#2: In CPL_read_gdal(as.character(x), as.character(options), as.character(driver), :
# GDAL Error 1: HDF5:"C:\Users\rhijm\Downloads\hd2011010000.scu"://dataset_DXk/image, band 1: IReadBlock failed at X #offset 0, Y offset 0: H5Dread() failed for block.
On windows with this sessionInfo()
R Under development (unstable) (2022-05-14 r82360 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] stars_0.5-5 sf_1.0-7 abind_1.4-5
loaded via a namespace (and not attached):
[1] crayon_1.5.1 vctrs_0.4.1 cli_3.3.0 rlang_1.0.2 DBI_1.1.2 KernSmooth_2.23-20 purrr_0.3.4
[8] generics_0.1.2 assertthat_0.2.1 glue_1.6.2 lwgeom_0.2-8 e1071_1.7-9 fansi_1.0.3 grid_4.3.0
[15] classInt_0.4-3 tibble_3.1.7 ellipsis_0.3.2 lifecycle_1.0.1 compiler_4.3.0 dplyr_1.0.9 Rcpp_1.0.8.3
[22] pkgconfig_2.0.3 R6_2.5.1 class_7.3-20 tidyselect_1.1.2 utf8_1.2.2 parallel_4.3.0 pillar_1.7.0
[29] magrittr_2.0.3 tools_4.3.0 proxy_0.4-26 units_0.8-0
And with the development version as well. So perhaps this was fixed somewhere in GDAL > 3.3.2?
It works for me on Ubuntu with GDAL 3.4.0:
library(stars)
#Loading required package: abind
#Loading required package: sf
#Linking to GEOS 3.10.1, GDAL 3.4.0, PROJ 8.2.0; sf_use_s2() is TRUE
(r = read_stars("hd2011010000.scu"))
#stars object with 2 dimensions and 1 attribute
#attribute(s):
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#hd2011010000.scu -999 -999 -999 -587.3865 0 15
#dimension(s):
# from to offset delta refsys point values x/y
#x 1 536 0 1 NA NA NULL [x]
#y 1 536 536 -1 NA NA NULL [y]
r <- rast("hd2011010000.scu")
NAflag(r) <- -999
r * 1
#class : SpatRaster
#dimensions : 536, 536, 1 (nrow, ncol, nlyr)
#resolution : 0.001865672, 0.001865672 (x, y)
#extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84
#source : memory
#name : hd2011010000
#min value : 0
#max value : 15
Although on another Ubuntu with GDAL 2.2.3
it works with stars but not with terra. So there may be more to it.
It could also possibly be related to how R GDAL HDF5 is built on windows: https://github.com/OSGeo/gdal/issues/1428
So terra puts a raster without georeference on the unit square, where GDAL (and stars) put them on the rectangle from (0,0) to (rcol,nrow). Is that on purpose?
I was about to say it cannot be ruled out I messed up a bit in my last late-night-session considering stars
, but seems like you're already some steps ahead of this.. 😏
Nevertheless, just to add this as a supplement:
library(stars)
#> Lade nötiges Paket: abind
#> Lade nötiges Paket: sf
#> Linking to GEOS 3.9.1, GDAL 3.3.2, PROJ 7.2.1; sf_use_s2() is TRUE
(r <- read_stars("hd2011010000.scu"))
#> Warning in CPL_read_gdal(as.character(x), as.character(options),
#> as.character(driver), : GDAL Error 1: H5Dread() failed for block.
#> Warning in CPL_read_gdal(as.character(x), as.character(options),
#> as.character(driver), : GDAL Error 1: HDF5:"hd2011010000.scu":
#> //dataset_DXk/image, band 1: IReadBlock failed at X offset 0,
#> Y offset 0: H5Dread() failed for block.
#> Error in CPL_read_gdal(as.character(x), as.character(options), as.character(driver), : read failure
sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=German_Germany.utf8 LC_CTYPE=German_Germany.utf8
#> [3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C
#> [5] LC_TIME=German_Germany.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] stars_0.5-5 sf_1.0-7 abind_1.4-5
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.8.3 pillar_1.7.0 compiler_4.2.0 highr_0.9
#> [5] class_7.3-20 R.methodsS3_1.8.2 R.utils_2.11.0 tools_4.2.0
#> [9] digest_0.6.29 evaluate_0.15 lifecycle_1.0.1 tibble_3.1.7
#> [13] R.cache_0.15.0 pkgconfig_2.0.3 rlang_1.0.2 reprex_2.0.1
#> [17] cli_3.3.0 DBI_1.1.3 rstudioapi_0.13 parallel_4.2.0
#> [21] yaml_2.3.5 xfun_0.31 fastmap_1.1.0 e1071_1.7-11
#> [25] dplyr_1.0.9 withr_2.5.0 styler_1.7.0 stringr_1.4.0
#> [29] knitr_1.39 generics_0.1.2 fs_1.5.2 vctrs_0.4.1
#> [33] tidyselect_1.1.2 grid_4.2.0 classInt_0.4-7 glue_1.6.2
#> [37] R6_2.5.1 fansi_1.0.3 rmarkdown_2.14 purrr_0.3.4
#> [41] magrittr_2.0.3 units_0.8-0 ellipsis_0.3.2 htmltools_0.5.2
#> [45] assertthat_0.2.1 KernSmooth_2.23-20 utf8_1.2.2 proxy_0.4-27
#> [49] stringi_1.7.6 lwgeom_0.2-8 crayon_1.5.1 R.oo_1.25.0
Yes, sorry, I forgot to say that I was looking at stars from github, remotes::install_github("r-spatial/stars")
@edzer: terra assigns an extent of (0, ncol, 0, nrow) when creating a SpatRaster from a matrix (#1); but it is (0,1,0,1) when a file does not supply an extent. I cannot say that there is a strong reason for that.
To get the dev version of stars on windows I would instead do install.packages('terra', repos='https://rspatial.r-universe.dev')
--- but either way, that won't change anything in this case.
I assume that this is a GDAL bug that will go away once R-tools gets to GDAL >= 3.4.0; so I am closing the issue.
Ok, thank you very much for your efforts and all the explanations!
terra assigns an extent of (0, ncol, 0, nrow) when creating a SpatRaster from a matrix (https://github.com/rspatial/terra/issues/1); but it is (0,1,0,1) when a file does not supply an extent. I cannot say that there is a strong reason for that.
I tried providing crs
and extent
parameters before opening this issue - presumably I was hoping to be able to override the defaults of the container resp. supply missing values - but this did not really seem to change anything:
library(terra)
#> terra 1.5.34
terra::rast("hd2011010000.scu")
#> Warning: [rast] unknown extent
#> class : SpatRaster
#> dimensions : 536, 536, 1 (nrow, ncol, nlyr)
#> resolution : 0.001865672, 0.001865672 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
#> source : hd2011010000.scu
#> name : hd2011010000
terra::rast("hd2011010000.scu",
crs = sf::st_crs(25832)$proj4string)
#> Error in .local(x, ...): unused argument (crs = "+proj=utm +zone=32 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs")
terra::rast("hd2011010000.scu",
extent = c(195000, 731000, 5407000, 5943000) |> terra::ext())
#> Error in .local(x, ...): unused argument (extent = new("SpatExtent", ptr = new("Rcpp_SpatExtent", .xData = <environment>)))
I assume that this is a GDAL bug that will go away once R-tools gets to GDAL >= 3.4.0; so I am closing the issue.
Being quite a newbie here, let me ask a little naive question: How long would it take from your point of view approximately until Rtools gets to GDAL >= 3.4.0?
In the meantime, I would still be able to make use of rhdf5
- so this should not be that critical. However, since these are my first attempts to work with these containers - and as far as I understand, there are lots of flavors of hdf5 - is there some kind of a standard how spatial information attributes are be named and where to be placed in order to make the file readable for e.g. GDAL?
I'm asking because the information seems to be available per se (crs + origin + ncols/nrows + cellsize should add up to the extent) according to gdalinfo
, but is named quite unconventionally from my personal point of view:
#[9] " dataset_DXk_what_nodata=-999 "
#[37] " where_projdef=UTM_N32"
#[41] " where_UL_ipixel=195 "
#[42] " where_UL_jpixel=5943 "
#[43] " where_xscale_[m]=1000 "
#[44] " where_xsize=536 "
#[45] " where_yscale_[m]=1000 "
#[46] " where_ysize=536 "
Thanks again!
After a bit more digging, I now see that:
So, given that the file can be read on linux and OSX, but not with RTools GDAL on windows, it seems more likely an HDF5/GDAL building issue; perhaps similar to the issue discussed here: https://github.com/OSGeo/gdal/issues/1428. I am adding @jeroen who runs some of this, but I realize that he may not be able to do much with the limited information at hand.
As for the extent and crs. You can set the correct extent after you open the file, by extracting it from the metadata (I do not know how but I suppose that is documented somewhere), just like I show for the NAflag
. It won't be automatic because the way it is stored does not appear to follow a more standard approach.
No idea if this is relevant but I was also able to display the data in QGIS 3.20.1-Odense using GDAL 3.3.1 without any problems - making it seem like this is not an issue being fixed automatically with GDAL >= 3.4.0.
Can you test if the problem appears on both R-4.1 and R-4.2? Note that R-4.2 uses a different GDAL (from Tomas Kaliber) than R-4.1 (from rwinlib).
Yeah, sure.
library(terra)
#> Warning: Paket 'terra' wurde unter R Version 4.1.3 erstellt
#> terra 1.5.34
terra::rast("hd2011010000.scu")
#> Warning: [rast] unknown extent
#> class : SpatRaster
#> dimensions : 536, 536, 1 (nrow, ncol, nlyr)
#> resolution : 0.001865672, 0.001865672 (x, y)
#> extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
#> source : hd2011010000.scu
#> name : hd2011010000
terra::rast("hd2011010000.scu") |> plot()
#> Warning: [rast] unknown extent
#> Warning: H5Dread() failed for block. (GDAL error 1)
#> Warning: HDF5:"hd2011010000.scu":
#> //dataset_DXk/image, band 1: IReadBlock failed at X offset 0,
#> Y offset 0: H5Dread() failed for block. (GDAL error 1)
#> Error: [readValues] cannot read values
library(stars)
#> Warning: Paket 'stars' wurde unter R Version 4.1.3 erstellt
#> Lade nötiges Paket: abind
#> Warning: Paket 'abind' wurde unter R Version 4.1.1 erstellt
#> Lade nötiges Paket: sf
#> Warning: Paket 'sf' wurde unter R Version 4.1.3 erstellt
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1; sf_use_s2() is TRUE
stars::read_stars("hd2011010000.scu")
#> Warning in CPL_read_gdal(as.character(x), as.character(options),
#> as.character(driver), : GDAL Error 1: H5Dread() failed for block.
#> Warning in CPL_read_gdal(as.character(x), as.character(options),
#> as.character(driver), : GDAL Error 1: HDF5:"hd2011010000.scu"://dataset_DXk/
#> image, band 1: IReadBlock failed at X offset 0, Y offset 0: H5Dread() failed for
#> block.
#> Error in CPL_read_gdal(as.character(x), as.character(options), as.character(driver), : read failure
sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 17763)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
#> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
#> [5] LC_TIME=German_Germany.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] stars_0.5-5 sf_1.0-7 abind_1.4-5 terra_1.5-34
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.8.3 compiler_4.1.0 pillar_1.7.0 highr_0.9
#> [5] class_7.3-20 tools_4.1.0 digest_0.6.29 tibble_3.1.7
#> [9] evaluate_0.15 lifecycle_1.0.1 pkgconfig_2.0.3 rlang_1.0.2
#> [13] reprex_2.0.1 DBI_1.1.3 cli_3.3.0 rstudioapi_0.13
#> [17] parallel_4.1.0 yaml_2.3.5 xfun_0.31 fastmap_1.1.0
#> [21] e1071_1.7-11 withr_2.5.0 stringr_1.4.0 dplyr_1.0.9
#> [25] knitr_1.39 generics_0.1.2 fs_1.5.2 vctrs_0.4.1
#> [29] tidyselect_1.1.2 classInt_0.4-7 grid_4.1.0 glue_1.6.2
#> [33] R6_2.5.1 fansi_1.0.3 rmarkdown_2.14 purrr_0.3.4
#> [37] magrittr_2.0.3 codetools_0.2-18 htmltools_0.5.2 ellipsis_0.3.2
#> [41] units_0.8-0 utf8_1.2.2 KernSmooth_2.23-20 stringi_1.7.6
#> [45] proxy_0.4-27 lwgeom_0.2-8 crayon_1.5.1
I also see this in R 4.1 and 4.3 --- both on windows
# R version 4.1.3 (2022-03-10) -- "One Push-Up"
library(terra)
#terra 1.5.48
gdal()
#[1] "3.4.1"
r = rast("hd2011010000.scu")
Warning message:
#[rast] unknown extent
r * 1
#Error: [*] cannot read from hd2011010000.scu
#R Under development (unstable) (2022-06-26 r82524 ucrt)
library(terra)
#terra 1.5.48
gdal()
#[1] "3.4.3"
r = rast("hd2011010000.scu")
#Warning message:
#[rast] unknown extent
r * 1
#Error: [*] cannot read from hd2011010000.scu
Guess I am not contributing anything significantly new to this discussion but since I found some time to mess around with Linux, I can at least confirm that reading the data works on Ubuntu 20.04 with R 4.1.3 as well as R 4.2.1 using both, terra
and stars
, in contrast to the behaviour on windows (s. above).
# R version 4.1.3 (2022-03-10) -- "One Push-Up"
library(terra)
#terra 1.5.34
gdal()
#[1] "3.0.4"
r = rast("hd2011010000.scu")
Warning message:
#[rast] unknown extent
r * 1
#class : SpatRaster
#dimensions : 536, 536, 1 (nrow, ncol, nlyr)
#resolution : 0.001865672, 0.001865672 (x, y)
#extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84
#source : memory
#name : hd2011010000
#min value : -999
#max value : 15
# R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
library(terra)
#terra 1.5.34
gdal()
#[1] "3.0.4"
r = rast("hd2011010000.scu")
Warning message:
#[rast] unknown extent
r * 1
#class : SpatRaster
#dimensions : 536, 536, 1 (nrow, ncol, nlyr)
#resolution : 0.001865672, 0.001865672 (x, y)
#extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84
#source : memory
#name : hd2011010000
#min value : -999
#max value : 15
Since I feel like this is OS-related and not an issue with terra
(or stars
) per se, I do not know if this is really the right place to investigate (is it e.g. possible to move issues to another repo?)... However, if there is a way I might help, please let me know, @jeroen.
Thank you very much in advance!
Perhaps @roualt can comment?
followup from https://github.com/r-spatial/sf/issues/1995#issuecomment-1250199092
Windows HDF5 version seems to be 1.12.0 https://svn.r-project.org/R-dev-web/trunk/WindowsBuilds/winutf8/ucrt3/toolchain_libs/mxe/src/hdf5.mk, confirmed in Rtools42 include/H5pubconf.h.
Fedora 36, HDF5 1.12.1, GDAL 3.5.2:
> library(terra)
terra 1.6.18
> x <- rast("hd2011010000.scu")
Warning message:
[rast] unknown extent
> x
class : SpatRaster
dimensions : 536, 536, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 536, 0, 536 (xmin, xmax, ymin, ymax)
coord. ref. :
source : hd2011010000.scu
name : hd2011010000
> y <- x + 1
> y
class : SpatRaster
dimensions : 536, 536, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 536, 0, 536 (xmin, xmax, ymin, ymax)
coord. ref. :
source : memory
name : hd2011010000
min value : -998
max value : 16
Windows 10, my binary build of terra from the sf issue:
> library(terra)
terra 1.6.18
> gdal()
[1] "3.5.0"
> x <- rast("hd2011010000.scu")
Warning message:
[rast] unknown extent
> x
class : SpatRaster
dimensions : 536, 536, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 536, 0, 536 (xmin, xmax, ymin, ymax)
coord. ref. :
source : hd2011010000.scu
name : hd2011010000
> y <- x * 1
|---------|---------|---------|---------|
=========================================
Warning messages:
1: H5Dread() failed for block. (GDAL error 1)
2: HDF5:"C:/Users/RB/work/hd2011010000.scu"://dataset_DXk/image, band 1: IReadBlock failed at X offset 0, Y offset 0: H5Dread() failed for block. (GDAL error 1)
>
> y
class : SpatRaster
dimensions : 536, 536, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 536, 0, 536 (xmin, xmax, ymin, ymax)
coord. ref. :
>
No ideas from me, never use HDF at all; if HDF5 is not UCRT-ready, or uses std::regex
, it might fail if the problem only occurred R >= 4.2, but is reported above in R 4.1 on Windows.
For completeness, R 4.2.1, HDF5 1.12.0 macOS arm64:
> library(terra)
terra 1.6.7
> gdal()
[1] "3.4.2"
> x <- rast("hd2011010000.scu")
Warning message:
[rast] unknown extent
> #Warning message:
> #[rast] unknown extent
> y <- x * 1
>
> y
class : SpatRaster
dimensions : 536, 536, 1 (nrow, ncol, nlyr)
resolution : 0.001865672, 0.001865672 (x, y)
extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
coord. ref. :
source : memory
name : hd2011010000
min value : -999
max value : 15
Windows OSGeo4W:
>gdalinfo hd2011010000.scu
Driver: netCDF/Network Common Data Format
Files: hd2011010000.scu
Size is 536, 536
Metadata:
/dataset_DXk/image#IMAGE_BACKGROUNDINDEX=0
/dataset_DXk/image#IMAGE_COLORMODEL=RGB
/dataset_DXk/image#IMAGE_MINMAXRANGE={0,0}
/dataset_DXk/image#IMAGE_SUBCLASS=BITMAP
/dataset_DXk/image#IMAGE_VERSION=1.2
/dataset_DXk/what/NC_GLOBAL#elev [deg]=0.5
/dataset_DXk/what/NC_GLOBAL#enddate [YYYYMMDD]=20201101
/dataset_DXk/what/NC_GLOBAL#endtime [HHmmss]=000000
/dataset_DXk/what/NC_GLOBAL#gain=1
/dataset_DXk/what/NC_GLOBAL#nodata=-999
/dataset_DXk/what/NC_GLOBAL#ori_format=DX
/dataset_DXk/what/NC_GLOBAL#ori_name=HD2011010000.scu
/dataset_DXk/what/NC_GLOBAL#product=COMP
/dataset_DXk/what/NC_GLOBAL#quantity=RATE
/dataset_DXk/what/NC_GLOBAL#rad=1
/dataset_DXk/where/NC_GLOBAL#xsize=536
/dataset_DXk/where/NC_GLOBAL#ysize=536
/how/NC_GLOBAL#a1gate=1512001264
/how/NC_GLOBAL#adjustment=1
/how/NC_GLOBAL#angles [deg]=0.5
/how/NC_GLOBAL#doppler=4
/how/NC_GLOBAL#maxlevel [dBZ]=65
/how/NC_GLOBAL#minlevel [dBZ]=0
/how/NC_GLOBAL#nodes=ESS,NHB,FLD,
/how/NC_GLOBAL#nodesn=3
/how/NC_GLOBAL#place=fld
/how/NC_GLOBAL#resolution [dBZ]=0.5
/how/NC_GLOBAL#wavelength [cm]=5
/how/NC_GLOBAL#WMO=121
/how/NC_GLOBAL#zr_a=256
/how/NC_GLOBAL#zr_b=1.42
/what/NC_GLOBAL#date [YYYYMMDD]=20201101\
/what/NC_GLOBAL#object=IMAGE
/what/NC_GLOBAL#sets=1
/what/NC_GLOBAL#time [HHmmss]=000000\
/what/NC_GLOBAL#version [H5Vol 0.2]=SCP_BIG 5.0.11.24
/where/NC_GLOBAL#poserror=
/where/NC_GLOBAL#projdef=UTM_N32
/where/NC_GLOBAL#range=128
/where/NC_GLOBAL#start_lat [deg]=5943
/where/NC_GLOBAL#start_lon [deg]=195
/where/NC_GLOBAL#UL_ipixel=195
/where/NC_GLOBAL#UL_jpixel=5943
/where/NC_GLOBAL#xscale [m]=1000
/where/NC_GLOBAL#xsize=536
/where/NC_GLOBAL#yscale [m]=1000
/where/NC_GLOBAL#ysize=536
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 536.0)
Upper Right ( 536.0, 0.0)
Lower Right ( 536.0, 536.0)
Center ( 268.0, 268.0)
Band 1 Block=64x72 Type=Float32, ColorInterp=Undefined
Metadata:
IMAGE_BACKGROUNDINDEX=0
IMAGE_COLORMODEL=RGB
IMAGE_MINMAXRANGE={0,0}
IMAGE_SUBCLASS=BITMAP
IMAGE_VERSION=1.2
NETCDF_VARNAME=image
There are 2 HDF5 objects open!
>gdal_translate -of GTiff hd2011010000.scu hd2011010000.tif
Input file size is 536, 536
0...10...20...30...40...50...60...70...80...90...100 - done.
There are 2 HDF5 objects open!
Report: open objects on 72057594037927936
Type = File(72057594037927936) name='/'Type = Attribute(432345564227567616) name='CLASS'ERROR 1: netcdf error #-101 : NetCDF: HDF error .
at (D:\src\osgeo4w\src\gdal\gdal-3.5.1\frmts\netcdf\netcdfdataset.cpp,netCDFDataset::~netCDFDataset,2847)
Report: open objects on 72057594037927936
Type = File(72057594037927936) name='/'Type = Attribute(432345564227567616) name='CLASS'ERROR 1: netcdf error #-101 : NetCDF: HDF error .
at (D:\src\osgeo4w\src\gdal\gdal-3.5.1\frmts\netcdf\netcdfdataset.cpp,netCDFDataset::~netCDFDataset,2847)
The file converted with OSGeo4W's gdal_translate
may or may not be complete, gdalinfo
is showing errors of some kind. This is not the GDAL build used by Rtools42.
> library(terra)
terra 1.6.18
> x <- rast("hd2011010000.tif")
Warning message:
[rast] unknown extent
> y <- x * 1
|---------|---------|---------|---------|
=========================================
> y
class : SpatRaster
dimensions : 536, 536, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 536, 0, 536 (xmin, xmax, ymin, ymax)
coord. ref. :
source : memory
name : hd2011010000
min value : -999
max value : 15
>
It appears that the cause of the problem is that the windows HDF5 library was built without the "szip" library. That library is needed to read this file (presumably because it was compressed using szip). See https://github.com/r-spatial/sf/issues/1995#issuecomment-1250364222 by @kalibera.
@rhijmans I think that the problem is that szip has had an unfortunate license: https://support.hdfgroup.org/doc_resource/SZIP/. I'm unsure whether the HDF5 driver should expect to write to a file created with szip. It is an open question whether packagers of hdf5 should build with szip read/write or just default read. Could some check what for example Debian says about szip?
Homebrew has replaced szip with libaec, see: https://gitlab.dkrz.de/k202009/libaec/-/blob/master/README.SZIP
Fedora also distributes libaec
, and HDF5 is built using it, so that explains being able to write to Szip-encoded files. MXE has aec, and the hdf5 MXE build now uses it. Untested with the latest MXE development versions.
With current library binaries in https://www.r-project.org/nosvn/winutf8/ucrt3/gcc10_ucrt3_full_5349.tar.zst, copied into Rtools42, locally manually patched terra/src/Makevars.ucrt
(https://www.r-project.org/nosvn/winutf8/ucrt3/patches/CRAN/terra.diff), and r-devel
5349 :
> library(terra)
terra 1.6.21
> gdal()
[1] "3.5.2"
> x <- rast("hd2011010000.scu")
Warning message:
[rast] unknown extent
> y <- x * 1
> y
class : SpatRaster
dimensions : 536, 536, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 536, 0, 536 (xmin, xmax, ymin, ymax)
coord. ref. :
source : memory
name : hd2011010000
min value : -999
max value : 15
>
So when the next release of Rtools occurs, libael
will be used when building HDF5. Thanks to @kalibera for adding AEC to the Rtools42 MXE build tree.
Thank you very much @kalibera, @jeroen and @rsbivand!
Thank you very much to everyone for digging into this!
Hello! I'm facing some problems with reading HDF5 files using
terra
at the moment but, to be honest, I have to mention I'm not completely sure whether this is the right place to ask for help - ergo, if this is really terra-related - or just a result of my lack of understanding of the HDF5 format or terra functionality respectively.However, I'll give it a try and if this is not the right place, please tell me.
I'm working with precipitation raster data stored in HDF5 containers (data sample: hd2011010000.zip).
Created on 2022-06-23 by the reprex package (v2.0.1)
Reading seems fine at first sight until I attempt to
plot()
the resulting SpatRaster object or perform some basic calculations (like a division by a certain factor to adjust units):Although
*.scu
does not seem to be a common file extension for the HDF5 format, hdf5-tools seems to be able to read this, c.f. output from h5debug and h5dump.Moreover, the file is recognized using HDFView 3.14 in terms of the content being plotable. Making use of Panoply 5.1.0 the
image
layer is not recognized but this could also be a result of the need to change the file extension to*.hdf5
in order to be recognized by the software, no idea tbh. Maybe the structure of the container is not quite up to standards so that reading is failing with some products?Before trying to migrate from
raster
toterra
I was making use of rhdf5 to read these files but I'd like to reduce further dependencies if possible, hence my attempt to solve this usingterra
only.Created on 2022-06-23 by the reprex package (v2.0.1)
I'd really like to understand why reading fails here. Is there something I can do to be able to read these files using
terra
?Thank you very much in advance!