Closed (hpages closed this issue 4 years ago)
Mike, any thought on this? Thanks!
Sorry, been a bit swamped with other things coming up to release time. I'll bump this up the to-do list
This is actually a little trickier than I expected, as the 8-bit H5T_STD_I8LE gets read into a RAWSXP and even -128 gets converted to 0 at that point. I'll think about a strategy for handling this more tomorrow.
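For anyone following along, here is a rough R-level analogy of why a raw vector loses the sentinel. It uses base R only and is not the C code inside rhdf5; it just shows that raw holds unsigned bytes, so a negative value such as -128 cannot survive the trip.

```r
# Base R analogy only (not rhdf5 internals): raw vectors hold unsigned
# bytes in 0..255, so a negative sentinel cannot be represented.
x <- c(0L, 1L, -128L)

# as.raw() warns that out-of-range values are treated as 0; the warning
# is suppressed here to keep the example quiet.
r <- suppressWarnings(as.raw(x))

back <- as.integer(r)
print(back)  # the -128 comes back as 0, indistinguishable from FALSE
```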
This should be fixed in version 2.33.3:
library(rhdf5)
m <- matrix(c(FALSE, TRUE, NA, FALSE, TRUE, NA), nrow=2)
file <- tempfile(fileext = '.h5')
h5write(m, file, "M1")
h5ls(file, all=TRUE)
#>   group name         ltype corder_valid corder cset       otype num_attrs
#> 0     /   M1 H5L_TYPE_HARD        FALSE      0    0 H5I_DATASET         1
#>    dclass        dtype  stype rank   dim maxdim
#> 0 INTEGER H5T_STD_I8LE SIMPLE    2 2 x 3  2 x 3
h5read(file, "M1")
#>       [,1]  [,2] [,3]
#> [1,] FALSE    NA TRUE
#> [2,]  TRUE FALSE   NA
It no longer prints the warning about converting a number to NA. For logicals you can only end up in the -128 → NA mapping if there's also an attribute with storage.mode == logical, so I feel comfortable this isn't going to catch out many users that didn't create the original file in R.
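The rule described above can be sketched in plain R as follows. The function name decode_int8 and its storage_mode argument are hypothetical, made up for illustration; this is not how rhdf5 implements it internally, just the decision logic as I understand it from the comment.

```r
# Hypothetical helper, not part of rhdf5: decode signed 8-bit values,
# treating -128 as NA only when the dataset was tagged as logical.
decode_int8 <- function(values, storage_mode = NULL) {
  if (!identical(storage_mode, "logical")) {
    # No logical tag: -128 is ordinary integer data and is left alone.
    return(as.integer(values))
  }
  out <- as.logical(values)    # 0 -> FALSE, non-zero -> TRUE
  out[values == -128L] <- NA   # the sentinel becomes a logical NA
  out
}

stored <- c(0L, 1L, -128L)
print(decode_int8(stored, "logical"))  # FALSE TRUE NA
print(decode_int8(stored))             # 0 1 -128
```

Gating on the attribute is what keeps an int8 dataset written by another tool, where -128 is a real value, from being silently turned into NA.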
This is a bug, so I'll give it a few days to make sure nothing funky happens in other packages, and then also patch the release branch.
Yep, works as expected now.
Not sure why you have to read into a RAWSXP first. Wouldn't it be possible to alloc a LGLSXP on detection of the storage.mode == logical attribute and read directly into that?
@grimbough I was wondering if you have any plans to port the "NA handling" stuff to the RELEASE_3_11 branch. In particular, the RELEASE_3_11 branch still uses H5T_STD_U8LE to store logical data so the issue of NAs getting stored as FALSEs remains there. Thanks!
Hi @hpages I think I've pushed the required changes to RELEASE_3_11 (rhdf5 version 2.32.2).
Great. Thanks Mike!
Yep, I can confirm that rhdf5 2.32.2 addresses the original issue. Thanks!
Hi Mike,
This was originally reported here.
In release (rhdf5 2.30.1):
In devel (rhdf5 2.31.6):
I see that in devel you're now using H5 type H5T_STD_U8LE instead of H5T_STD_I32LE to store logical data (probably related to that discussion) and it makes a lot of sense to use an 8-bit type instead of a 32-bit type here. However the fact that H5T_STD_U8LE is an unsigned type is a problem because AFAIK the HDF5 library replaces negative values in the input (the NAs) with zeroes to make them "fit" in the unsigned type. This causes NAs to be stored as FALSEs.
I believe that if H5T_STD_I8LE was used instead of H5T_STD_U8LE then the HDF5 library would replace NAs with the most negative value in the H5T_STD_I8LE range (-128 ?) so I guess that would work. h5read() would still need to make sure to map -128 values back to INT_MIN (the value R uses to represent integer and logical NAs).
Thanks, H.