grimbough / rhdf5

Package providing an interface between HDF5 and R
http://bioconductor.org/packages/rhdf5

Read fails with R version 4.0.2 #71

Closed sophiaschaff closed 3 years ago

sophiaschaff commented 3 years ago

First a big thank you to the developers! I've been enthusiastically using your package for years!

Since I updated from R 3.6.2 with rhdf5_2.30.1 to R 4.0.2 with rhdf5_2.32.2, I can no longer h5read the very same .h5 files that work perfectly with the old versions. I get the following error message:

> area_sim <- h5read("Data.h5", "/Raw_Data")$Area
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
In addition: Warning message:
In H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  :
  Cannot coerce multi-dimensional data to data.frame. Data returned as a list.

The warning message also appears in the old version.

I have attached an example file, but I wasn't able to pin down the cause of the error myself. Sorry.

Thanks in advance and I would be very glad for any help. Sophia

grimbough commented 3 years ago

Hi Sophia,

Thanks for the report. I can confirm I see the same issue on my system. I'm not immediately sure what the issue is, so I'll take a look and report back here once I know a bit more about it.

grimbough commented 3 years ago

After some trial and error with old versions, it looks like this issue was introduced in commit 14e39d88087cf1d13a58f1164de2ce67c0b0751f. The next step is to understand exactly what's causing it.

grimbough commented 3 years ago

Hopefully this is solved in the latest version here on GitHub:

library(rhdf5)
url <- "https://github.com/grimbough/rhdf5/files/5282189/Data.zip"
down_file <- tempfile()
download.file(url, destfile = down_file)
h5file <- unzip(zipfile = down_file,
                files = "Data.h5",
                exdir = tempdir())
area_sim <- h5read(h5file, "/Raw_Data", compoundAsDataFrame = FALSE)$Area
head(t(area_sim))
#>          [,1]
#> [1,] 7407.981
#> [2,] 7478.995
#> [3,] 7411.445
#> [4,] 7432.230
#> [5,] 7414.043
#> [6,] 7428.766

You can try it for yourself by installing the version from here:

BiocManager::install('grimbough/rhdf5')

If it complains that the rhdf5filters package is missing, you can install that via

BiocManager::install('grimbough/rhdf5filters')

It'd be great if you could confirm whether this works, and if it looks good I'll merge it into the version on Bioconductor.

sophiaschaff commented 3 years ago

Thank you for looking into the matter so quickly.

Unfortunately, I wasn't able to install your current version from GitHub. I tried on both a Windows and a Linux machine. On Windows I encounter the following error:

> BiocManager::install('grimbough/rhdf5')
Bioconductor version 3.11 (BiocManager 1.30.10), R 4.0.2 (2020-06-22)
Installing github package(s) 'grimbough/rhdf5'
Downloading GitHub repo grimbough/rhdf5@HEAD
√  checking for file 'C:\Users\Sophia\AppData\Local\Temp\Rtmp6pSo0h\remotes133c1eee5d21\grimbough-rhdf5-b3b6c78/DESCRIPTION' (445ms)
-  preparing 'rhdf5': (955ms)
√  checking DESCRIPTION meta-information ... 
-  cleaning src
-  running 'cleanup.win'
-  checking for LF line-endings in source and make files and shell scripts (409ms)
-  checking for empty or unneeded directories
-  building 'rhdf5_2.33.10.tar.gz'
   Warning: file 'rhdf5/cleanup' did not have execute permissions: corrected
   Warning: file 'rhdf5/configure' did not have execute permissions: corrected

* installing *source* package 'rhdf5' ...
** using staged installation
** libs
Error: (converted from warning) this package has a non-empty 'configure.win' file,
so building only the main architecture
* removing 'C:/Users/Sophia/Documents/R/win-library/4.0/rhdf5'
* restoring previous 'C:/Users/Sophia/Documents/R/win-library/4.0/rhdf5'
Error: Failed to install 'rhdf5' from GitHub:
  (converted from warning) installation of package ‘C:/Users/Sophia/AppData/Local/Temp/Rtmp6pSo0h/file133c75cb4a1b/rhdf5_2.33.10.tar.gz’ had non-zero exit status

Sorry I can't report back with better news.

Best, Sophia

grimbough commented 3 years ago

I hadn't appreciated you were using Windows. Building the package from GitHub can be a bit complex on Windows.

I'm fairly comfortable with the changes, so I'll put them directly into Bioconductor. It will take a couple of days before they're available from there, but it should be easier for you to install.

I'll update here when they're ready.

grimbough commented 3 years ago

The latest version (2.32.3) should now be in Bioconductor, and you can install it with BiocManager::install("rhdf5").

This lets me read your example file.

packageVersion('rhdf5')
#> [1] '2.32.3'

library(rhdf5)
url <- "https://github.com/grimbough/rhdf5/files/5282189/Data.zip"
down_file <- tempfile()
download.file(url, destfile = down_file)
h5file <- unzip(zipfile = down_file,
                files = "Data.h5",
                exdir = tempdir())
area_sim <- h5read(h5file, "/Raw_Data", compoundAsDataFrame = FALSE)$Area
head(t(area_sim))
#>          [,1]
#> [1,] 7407.981
#> [2,] 7478.995
#> [3,] 7411.445
#> [4,] 7432.230
#> [5,] 7414.043
#> [6,] 7428.766

Please let me know if you run into any more difficulties.
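
For anyone landing here with the same read failure, a quick way to tell whether an installation already has the fix is to compare version numbers. The snippet below is only a sketch: has_read_fix is a hypothetical helper, not part of rhdf5, and it assumes the 2.32.x release series discussed in this thread (2.32.2 affected, 2.32.3 fixed). It says nothing reliable about development versions such as 2.33.x.

```r
# Hypothetical helper (not part of rhdf5): TRUE when the given rhdf5 version
# is at least the release reported as fixed in this thread (2.32.3).
# Assumption: the comparison is only meaningful within the 2.32.x series.
has_read_fix <- function(installed = utils::packageVersion("rhdf5"),
                         fixed = "2.32.3") {
  # package_version() compares component by component, so "2.32.10"
  # correctly sorts after "2.32.3" (a plain string comparison would not).
  package_version(as.character(installed)) >= package_version(fixed)
}

has_read_fix("2.32.2")  # the version from the original report
has_read_fix("2.32.3")  # the fixed Bioconductor release
```

Called without arguments, has_read_fix() checks the installed copy of rhdf5. If it returns FALSE, updating with BiocManager::install("rhdf5") as described above should bring in the fix.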