grimbough / rhdf5

Package providing an interface between HDF5 and R
http://bioconductor.org/packages/rhdf5

Replacement method issue with enumeration type #145

Open luigidolcetti opened 2 months ago

luigidolcetti commented 2 months ago

Hi, I might be getting something wrong, but I cannot understand the behaviour of the replacement method with an enumeration type.

So to start, this works just fine:

testFile <- file.path("E:/TESTs", 'total.h5')
totfile <- rhdf5::H5Fcreate(testFile)
dSet <- rhdf5::H5Dcreate(h5loc = totfile,
                         name = 'TOTAL',
                         dtype_id = 'H5T_NATIVE_UCHAR',
                         h5space = rhdf5::H5Screate_simple(c(3,10)))
rhdf5::H5Dwrite(dSet, as.raw(matrix(c(0,1),3,10)))
test <- rhdf5::H5Fopen(testFile, flags = "H5F_ACC_RDWR")
testD <- rhdf5::H5Dopen(h5loc = test, name = 'TOTAL')
print(testD[])
testD[1,1] <- as.raw(1)
rhdf5::h5closeAll()
test <- rhdf5::H5Fopen(testFile, flags = "H5F_ACC_RDONLY")
testD <- rhdf5::H5Dopen(h5loc = test, name = 'TOTAL')
print(testD[])
rhdf5::h5closeAll()

But following the example on how to encode logical values as an enumeration, replacement misbehaves:

testFile <- file.path("E:/TESTs", 'total.h5')
totfile <- rhdf5::H5Fcreate(testFile)
tid <- rhdf5::H5Tenum_create(dtype_id = "H5T_NATIVE_UCHAR")
rhdf5::H5Tenum_insert(tid, name = "TRUE", value = 1L)
rhdf5::H5Tenum_insert(tid, name = "FALSE", value = 0L)
dSet <- rhdf5::H5Dcreate(h5loc = totfile,
                         name = 'TOTAL',
                         dtype_id = tid,
                         h5space = rhdf5::H5Screate_simple(c(3,10)))
rhdf5::H5Dwrite(dSet, as.raw(matrix(c(0,1),3,10)), h5type = tid)
rhdf5::h5closeAll()
test <- rhdf5::H5Fopen(testFile, flags = "H5F_ACC_RDWR")
testD <- rhdf5::H5Dopen(h5loc = test, name = 'TOTAL')
print(testD[])
testD[1,1] <- as.raw(0L)
testD[1,1] <- 'FALSE'
print(testD[])
str(testD[])

In particular, testD[1,1] <- as.raw(0L) results in:

Error in H5Dwrite(h5dataset, obj, h5spaceMem = h5spaceMem, h5spaceFile = h5spaceFile) : HDF5. Dataset. Write failed.

whereas testD[1,1] <- 'FALSE' seems to work, but the following print(testD[]) gives back:

Error in as.character.factor(x) : malformed factor

Indeed, exploring the dataset with str(testD[]) shows that there is something weird:

Factor[1:3, 1:10] w/ 2 levels "TRUE","FALSE": 0 1 2 1 2 1 2 1 2 1 ...
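The 0 among the factor codes above is a plausible cause of the malformed factor error: in R, a factor's integer codes must lie between 1 and the number of levels. The following is a minimal base R sketch of that situation, not a reproduction of what rhdf5 does internally. The objects lv, good and bad are hypothetical names introduced for illustration, and no HDF5 file is involved.

```r
# Hypothetical reconstruction in base R; rhdf5 is not used here.
lv <- c("TRUE", "FALSE")

# A well-formed factor: every integer code is in 1..length(levels).
good <- structure(c(2L, 1L, 2L), levels = lv, class = "factor")
print(as.character(good))

# Same levels, but the first code is 0, as in the str() output above.
bad <- structure(c(0L, 1L, 2L), levels = lv, class = "factor")

# Converting to character fails because code 0 has no matching level.
# The error message is captured instead of stopping the script.
msg <- tryCatch(as.character(bad), error = function(e) conditionMessage(e))
print(msg)
```

If this is what happens here, the value written by testD[1,1] <- 'FALSE' matches neither enumeration member (0 or 1), so it cannot be mapped back to a level when the dataset is read.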

I suppose that loading the entire dataset as an array, making the changes and calling rhdf5::H5Dwrite(...) again would work, but I would prefer not to.

Is there a way to use replacement in this situation?

Thank you in advance for your help,

Luigi

sessionInfo:

R version 4.4.0 (2024-04-24 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.4.0      tools_4.4.0         rhdf5_2.48.0        rhdf5filters_1.16.0
[5] Rhdf5lib_1.26.0