Closed by jonocarroll 4 years ago
Thanks for the report. I've set aside the next couple of days to look at the outstanding rhdf5 issues, hopefully I'll get round to addressing this by the end of the week.
This message was intended to warn someone who had created an HDF5 file outside R that any instance of the "smallest integer" had been replaced by NA in the resulting R object. It looks like there's a typo for 64-bit integers and it should read "integer value -2^63 replaced by NA", which might have made the intention a little clearer.
The side effect of this is that any R object that contains NA and is written to HDF5 will then trigger the warning when read back, because the original NA values are stored as "smallest int" in the file. I guess this probably happens at least as frequently as someone reading a file generated outside R.
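To make the "smallest int" point concrete, here is a minimal base R sketch (no rhdf5 needed) showing that R has no usable integer at -2^31, because it reserves that value for NA_integer_. The variable names are only for illustration.

```r
# Base R only; no rhdf5 required.
int_max <- .Machine$integer.max   # largest representable R integer, 2^31 - 1

# Integer arithmetic that would land on -2^31 overflows to NA (with a warning),
# because R reserves that value to represent NA_integer_.
below_min <- suppressWarnings(-int_max - 1L)
is.na(below_min)
#> [1] TRUE

# The value -2^31 itself only exists in R as a double, not as an integer.
is.integer(-2^31)
#> [1] FALSE
```

So a 32-bit integer dataset written by another tool that really does contain -2^31 cannot be told apart from NA once it is read into an R integer vector, which is what the message is trying to flag.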
The warning is annoying, but if you're writing and reading things containing NA with rhdf5, they should be preserved despite the message.
I propose to add an attribute to anything written with rhdf5 containing NA values and use this to suppress the warning. Then it should only show up for someone encountering the original use case.
As of rhdf5 v. 2.33.3 you shouldn't get this warning if the original file was created with rhdf5.
library(rhdf5)
m <- matrix(c(0L, 1L, NA_integer_, 0L, 1L, NA_integer_), nrow=2)
file <- tempfile(fileext = '.h5')
h5write(m, file, "M1")
h5read(file, "M1")
#>      [,1] [,2] [,3]
#> [1,]    0   NA    1
#> [2,]    1    0   NA
For a dataset not generated with rhdf5 the information is still printed, but downgraded to a message since there's nothing a user can do about R using those values to represent NA.
## This code removes the 'rhdf5-NA.OK' attribute to simulate data not written by rhdf5
fid <- H5Fopen(name = file)
did <- H5Dopen(fid, name = "M1")
H5Adelete(did, "rhdf5-NA.OK")
H5Dclose(did)
H5Fclose(fid)
h5read(file, "M1")
#> The value -2^31 was detected in the dataset.
#> This has been converted to NA within R.
#>      [,1] [,2] [,3]
#> [1,]    0   NA    1
#> [2,]    1    0   NA
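If you want to check up front whether a dataset carries the marker, something like the sketch below should work. It assumes the attribute is named 'rhdf5-NA.OK' and is attached to the dataset itself, as in the snippet above, and it writes a fresh temporary file so that it is self-contained. has_marker is just an illustrative variable name.

```r
library(rhdf5)

# Write an integer matrix containing NA to a fresh temporary file.
m <- matrix(c(0L, 1L, NA_integer_, 0L, 1L, NA_integer_), nrow = 2)
file <- tempfile(fileext = ".h5")
h5write(m, file, "M1")

# Open the dataset and ask whether the marker attribute is present.
# The attribute name is taken from the example in this thread.
fid <- H5Fopen(name = file)
did <- H5Dopen(fid, name = "M1")
has_marker <- H5Aexists(did, "rhdf5-NA.OK")
H5Dclose(did)
H5Fclose(fid)

has_marker
```

Going by the behaviour described above, I'd expect this to be TRUE for data written with rhdf5 2.33.3 or later and FALSE for files produced by other tools or older versions, but I haven't checked every case.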
Confirmed resolved in 2.33.7 - thank you!!!
I'd like to clarify the warning I get when storing NA_integer_ ...
(rhdf5 2.30.1)
I see #58 has this (the issue there is more severe) and another discussion in #42, but is this still an expected warning? I spent quite a while trying to figure out why I apparently had large negative ints in my input data when I really only had some NA values. Is it possible to identify when NA is being used and avoid this warning? I couldn't immediately find where this was documented (if it is). The aforementioned section does not seem to appear in this vignette: https://bioconductor.org/packages/release/bioc/vignettes/rhdf5/inst/doc/rhdf5.html