Open LTLA opened 5 years ago
If nothing else, it's not a helpful error message. Why's a chunk size check failing here? I'll take a look at what's happening and try to come up with a solution for NA strings.
The error was being thrown because it was trying to find the length of the longest string, which was returning `NA`, and then using that to determine the chunk size. That's now fixed.
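For anyone curious how a longest-string calculation can end up as `NA`, here is a minimal base R sketch. It is only an illustration of the failure mode and assumes nothing about how rhdf5 actually computes the length internally.

```r
x <- c(NA_character_, LETTERS)

# With the default type = "chars", nchar() returns NA for a missing string,
# so the maximum over the whole vector is NA too.
longest <- max(nchar(x))
is.na(longest)

# Ignoring missing entries gives a usable length instead.
longest_ok <- max(nchar(x), na.rm = TRUE)
```

Any size derived from `longest` would inherit the `NA`, which is consistent with a downstream chunk size check failing.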
It now writes an `NA_character_` to file as a literal `"NA"` and sets an attribute if this has occurred. It should then coerce them back to `NA_character_` if it's read with `h5read()`.
```r
library(rhdf5)
h5createFile("whee.h5")
h5write(c(NA_character_, LETTERS), "whee.h5", "more_stuff")
h5read("whee.h5", "more_stuff")
#>  [1] NA  "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
```
If you happen to write `"NA"` then that will be preserved, but this will cause an issue if you try `h5write(c(NA_character_, "NA"))`, as they'll both be converted. Hopefully that's not something that occurs too often, and it will throw a warning if it's detected.
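To make the ambiguity concrete, here is a rough base R sketch of the sentinel-plus-attribute idea. `encode_na()` and `decode_na()` are hypothetical helpers written for this comment, not rhdf5 functions, and the attribute name `had_na` is made up.

```r
# Hypothetical helpers, for illustration only; not part of rhdf5.
encode_na <- function(x) {
  had_na <- anyNA(x)
  if (had_na && any(x == "NA", na.rm = TRUE)) {
    warning("input contains both NA_character_ and the literal \"NA\"")
  }
  x[is.na(x)] <- "NA"               # store missing values as the sentinel
  structure(x, had_na = had_na)     # remember whether any NA was replaced
}

decode_na <- function(x) {
  had_na <- isTRUE(attr(x, "had_na"))
  attr(x, "had_na") <- NULL
  if (had_na) x[x == "NA"] <- NA_character_  # cannot tell the two apart here
  x
}

# A literal "NA" survives when no real NA was present.
kept <- decode_na(encode_na(c("NA", "A")))

# Mixed input: both elements come back as NA_character_.
roundtrip <- suppressWarnings(decode_na(encode_na(c(NA_character_, "NA"))))
```

A single per-dataset flag cannot say which elements were originally missing, which is why the mixed case loses information.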
Let me know if I've missed an obvious reader out. This only works in `h5read()` for the moment.
Good enough for the time being, but it seems a bit fragile... better hope no one's working with NAs on neuraminidase.
I don't have a good idea on how to represent an NA string. Maybe if we add a character at the end of the fixed-len array (after the null terminator), the only purpose of which is to tell us if the rest of it is NA or not? Yeah, a bit wasteful, but it's the least of all evils.
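A rough sketch of that flag-byte layout in base R, using raw vectors to stand in for the fixed-length buffers. The function names and the exact layout (content bytes, one null terminator, one flag byte) are assumptions for illustration, not anything HDF5 or rhdf5 defines.

```r
# Hypothetical layout: `width` content bytes, a null terminator,
# then one flag byte (1 = NA, 0 = real string).
encode_fixed <- function(s, width) {
  buf <- raw(width + 2L)            # all bytes start as 0x00
  if (is.na(s)) {
    buf[width + 2L] <- as.raw(1L)   # mark the element as missing
  } else {
    bytes <- charToRaw(s)
    stopifnot(length(bytes) <= width)
    buf[seq_along(bytes)] <- bytes
  }
  buf
}

decode_fixed <- function(buf) {
  width <- length(buf) - 2L
  if (buf[width + 2L] == as.raw(1L)) return(NA_character_)
  content <- buf[seq_len(width)]
  rawToChar(content[content != as.raw(0L)])  # drop the null padding
}

x <- c(NA_character_, "NA", "A")
decoded <- vapply(lapply(x, encode_fixed, width = 2L), decode_fixed, character(1))
```

Here `NA_character_` and the literal `"NA"` round-trip as distinct values, at the cost of one extra byte per element.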
Session information
```
R version 3.6.0 Patched (2019-05-10 r76483)
Platform: x86_64-apple-darwin17.7.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:   /Users/luna/Software/R/R-3-6-branch-dev/lib/libRblas.dylib
LAPACK: /Users/luna/Software/R/R-3-6-branch-dev/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rhdf5_2.29.0

loaded via a namespace (and not attached):
[1] compiler_3.6.0 Rhdf5lib_1.7.4
```