dynverse / dyno

Inferring, interpreting and visualising trajectories using a streamlined set of packages 🦕
https://dynverse.github.io/dyno

HDF5 Errors when attempting to run trajectory inference #43

Closed bjreisman closed 5 years ago

bjreisman commented 5 years ago

Fantastic effort on this package, I'm really looking forward to being able to use TI methods from a common interface in R. I've followed the installation instructions and successfully installed the packages and passed the docker installation test, but I'm having difficulty running the code with the example data or my own data.

For example, running ti_comp1() on the example_dataset, I get an error related to HDF5.

> library(dyno)
> model <- infer_trajectory(example_dataset, ti_comp1(), verbose = T)
Executing 'comp1' on 'example'
With parameters: list(dimred = "pca", ndim = 2L, component = 1L),
inputs: expression, and
priors : 
Input saved to C:\Users\reismab\AppData\Local\Temp\RtmpOK0ZZ0\file19089056d27/ti
Running "C:\PROGRA~1\Docker\Docker\RESOUR~1\bin\docker.exe" run -e "TMPDIR=/tmp2" --workdir /ti/workspace -v \
  "/c/Users/reismab/AppData/Local/Temp/RtmpOK0ZZ0/file19089056d27/ti:/ti" -v \
  "/c/Users/reismab/AppData/Local/Temp/RtmpOK0ZZ0/file19083e865acc/tmp:/tmp2" "dynverse/ti_comp1:v0.9.9" --dataset \
  /ti/input.h5 --output /ti/output.h5
Loading required package: dynutils
Error: Error during trajectory inference 
HDF5-API Errors:
    error #000: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5Tget_native_type(): line 119: cannot retrieve native type
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #001: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5T_get_native_type(): line 400: cannot get member value
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #002: C:\pkg\hdf5-1.8.14\src\H5T.c in H5T_convert(): line 4816: data type conversion failed
        class: HDF5
        major: Attribute
        minor: Unable to encode value

    error #003: C:\pkg\hdf5-1.8.14\src\H5Tconv.c in H5T__conv_i_i(): line 3639: can't find property list for ID
        class: HDF5
        major: Object atom
        minor: Unable to find atom information (already closed?)

    error #004: C:\pkg\hdf5-1.8.14\src\H5Pint.c in H5P_object_verify(): line 3381: property list is no

I get similar errors when trying to run ti_PAGA() or ti_wanderlust() with the example datasets as well as the fibroblast_reprogramming_treutlein dataset. I've also tried restarting R, Docker, and Windows without success. I'm not sure if this is a bug or (perhaps more likely) an issue with my installation of the packages/Docker. Unfortunately, I don't know how to begin to debug this, but I've included my session info below.

> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] shiny_1.2.0         dyno_0.9.9          dynwrap_1.0.0       dynplot_1.0.0       dynmethods_1.0.0    dynguidelines_1.0.0
[7] dynfeature_1.0.0   

loaded via a namespace (and not attached):
  [1] colorspace_1.4-1      rprojroot_1.3-2       dynparam_1.0.0        htmlTable_1.13.1      base64enc_0.1-3      
  [6] fs_1.2.7              rstudioapi_0.10       rje_1.9               farver_1.1.0          remotes_2.0.2        
 [11] dynutils_1.0.2        bit64_0.9-7           ggrepel_0.8.0         ranger_0.11.2         codetools_0.2-16     
 [16] splines_3.5.3         knitr_1.22            polyclip_1.10-0       pkgload_1.0.2         Formula_1.2-3        
 [21] jsonlite_1.6          cluster_2.0.7-1       ggforce_0.2.1         readr_1.3.1           compiler_3.5.3       
 [26] backports_1.1.3       assertthat_0.2.1      Matrix_1.2-15         lazyeval_0.2.2        cli_1.1.0            
 [31] later_0.8.0           tweenr_1.0.1          acepack_1.4.1         htmltools_0.3.6       prettyunits_1.0.2    
 [36] tools_3.5.3           igraph_1.2.4          gtable_0.3.0          glue_1.3.1            reshape2_1.4.3       
 [41] dplyr_0.8.0.1         Rcpp_1.0.1            GA_3.2                iterators_1.0.10      ggraph_1.0.2         
 [46] xfun_0.6              stringr_1.4.0         ps_1.3.0              testthat_2.0.1        akima_0.6-2          
 [51] mime_0.6              devtools_2.0.1        MASS_7.3-51.1         scales_1.0.0          tidygraph_1.1.2      
 [56] babelwhale_0.0.0.9000 hms_0.4.2             promises_1.0.1        RColorBrewer_1.1-2    yaml_2.2.0           
 [61] curl_3.3              memoise_1.1.0         gridExtra_2.3         ggplot2_3.1.0         dyndimred_1.0.0      
 [66] rpart_4.1-13          latticeExtra_0.6-28   stringi_1.4.3         desc_1.2.0            foreach_1.4.4        
 [71] checkmate_1.9.1       shades_1.3.1          pkgbuild_1.0.3        rlang_0.3.3           pkgconfig_2.0.2      
 [76] lattice_0.20-38       purrr_0.3.2           patchwork_0.0.1       htmlwidgets_1.3       bit_1.1-14           
 [81] cowplot_0.9.4         pdist_1.2             processx_3.3.0        tidyselect_0.2.5      plyr_1.8.4           
 [86] magrittr_1.5          R6_2.4.0              Hmisc_4.2-0           pillar_1.3.1          foreign_0.8-71       
 [91] carrier_0.1.0         withr_2.1.2           sp_1.3-1              survival_2.43-3       nnet_7.3-12          
 [96] tibble_2.1.1          crayon_1.3.4          hdf5r_1.1.1           shinyWidgets_0.4.8    viridis_0.5.1        
[101] usethis_1.4.0         grid_3.5.3            data.table_1.12.0     callr_3.2.0           digest_0.6.18        
[106] xtable_1.8-3          tidyr_0.8.3           httpuv_1.5.0          munsell_0.5.0         viridisLite_0.3.0    
[111] shinyjs_1.0           sessioninfo_1.1.1  
> dynwrap::test_docker_installation(detailed = TRUE)
✔ Docker is installed
✔ Docker daemon is running
✔ Docker is at correct version (>1.0): 1.39
✔ Docker is in linux mode
✔ Docker can pull images
✔ Docker can run image
✔ Docker can mount temporary volumes
✔ Docker test successful -----------------------------------------------------------------
[1] TRUE

bjreisman commented 5 years ago

Sorry, realized that I forgot to pass the data through wrap_expression() which would have been an easy fix. Unfortunately, still running into the same error.

> library(dyno)
> library(tidyverse)
Registered S3 method overwritten by 'rvest':
  method            from
  read_xml.response xml2
-- Attaching packages --------------------------------------- tidyverse 1.2.1 --
v ggplot2 3.1.0       v purrr   0.3.0  
v tibble  2.0.1       v dplyr   0.8.0.1
v tidyr   0.8.2       v stringr 1.4.0  
v readr   1.3.1       v forcats 0.4.0  
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter()     masks stats::filter()
x purrr::flatten_df() masks hdf5r::flatten_df()
x dplyr::lag()        masks stats::lag()
> dataset <- wrap_expression(
+   counts = example_dataset$counts,
+   expression = example_dataset$expression
+ )
> model <- infer_trajectory(dataset, ti_comp1())
Error: Error during trajectory inference 
HDF5-API Errors:
    error #000: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5Tget_native_type(): line 119: cannot retrieve native type
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #001: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5T_get_native_type(): line 400: cannot get member value
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #002: C:\pkg\hdf5-1.8.14\src\H5T.c in H5T_convert(): line 4816: data type conversion failed
        class: HDF5
        major: Attribute
        minor: Unable to encode value

    error #003: C:\pkg\hdf5-1.8.14\src\H5Tconv.c in H5T__conv_i_i(): line 3639: can't find property list for ID
        class: HDF5
        major: Object atom
        minor: Unable to find atom information (already closed?)

    error #004: C:\pkg\hdf5-1.8.14\src\H5Pint.c in H5P_object_verify(): line 3381: property list is no

rcannood commented 5 years ago

Thanks Benjamin, we're looking into it.

Could you check whether an output.h5 was generated at /c/Users/reismab/AppData/Local/Temp/RtmpOK0ZZ0/file19089056d27/ti/ (or whatever folder is being printed)? The error is probably being generated when calling dynutils::read_h5(file). If you find a file like this, could you send it to us?

bjreisman commented 5 years ago

Thanks for looking into this! The folder was empty once the error was thrown, but I was able to find the files by stopping the script just after it loaded dynutils. See below: input_output_h5.zip

zouter commented 5 years ago

Hi Benjamin, thanks! I can load in your output & input just fine. I have a hunch that this might be caused by an outdated version of HDF5. This will probably be fixed once you update to more recent versions (i.e. 1.10.5 or 1.8.21). I hope this is possible for you?

zouter commented 5 years ago

You might have to reinstall hdf5r if you do.

bjreisman commented 5 years ago

Thanks! I had 1.10.5 installed previously and tried again after uninstalling it and installing 1.8.21, but ran into the same error with both versions.

If it's helpful, I also tried reading in the output.h5 file directly and it seemed to work? I was also able to run the examples from the hdf5r vignette successfully.

> library(hdf5r)
> hdf5r::is_hdf5("output.h5")
[1] TRUE
> output <- H5File$new("output.h5")
> names(output)
[1] "class" "data"  "names"
> data <- output[["data"]]
> data
Class: H5Group
Filename: C:\Users\reismab\OneDrive\Vanderbilt\Bachmann Lab\Notebook\2019\April 2019\Dyno\output.h5
Group: /data
Listing:
                        name  obj_type dataset.dims dataset.type_class
                    cell_ids H5I_GROUP         <NA>               <NA>
                   cell_info H5I_GROUP         <NA>               <NA>
                      dimred H5I_GROUP         <NA>               <NA>
           dimred_milestones H5I_GROUP         <NA>               <NA>
       dimred_segment_points H5I_GROUP         <NA>               <NA>
 dimred_segment_progressions H5I_GROUP         <NA>               <NA>
                    directed H5I_GROUP         <NA>               <NA>
          divergence_regions H5I_GROUP         <NA>               <NA>
                          id H5I_GROUP         <NA>               <NA>
               milestone_ids H5I_GROUP         <NA>               <NA>
< Printed 10, out of 16>
> version
               _                           
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          5.3                         
year           2019                        
month          03                          
day            11                          
svn rev        76217                       
language       R                           
version.string R version 3.5.3 (2019-03-11)
nickname       Great Truth  

bjreisman commented 5 years ago

I noticed that C:\pkg\hdf5-1.8.14\src\H5Tnative.c is not a valid path, there's no pkg directory in C:\, could that be the problem?

bjreisman commented 5 years ago

Ah ha! I was able to reproduce the error here, just as you had suspected:

> dynutils <- dynutils::read_h5("output.h5")
Error in standalone_H5D_get_type(h5d_id = self$id, native = TRUE) : 
  HDF5-API Errors:
    error #000: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5Tget_native_type(): line 119: cannot retrieve native type
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #001: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5T_get_native_type(): line 400: cannot get member value
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #002: C:\pkg\hdf5-1.8.14\src\H5T.c in H5T_convert(): line 4816: data type conversion failed
        class: HDF5
        major: Attribute
        minor: Unable to encode value

    error #003: C:\pkg\hdf5-1.8.14\src\H5Tconv.c in H5T__conv_i_i(): line 3639: can't find property list for ID
        class: HDF5
        major: Object atom
        minor: Unable to find atom information (already closed?)

    error #004: C:\pkg\hdf5-1.8.14\src\H5Pint.c in H5P_object_verify(): line 3381: property list is not a member of the class
        class

When I changed line 4 in dynutils::read_h5

file_h5 <- hdf5r::H5File$new(path, "r")

to

  file_h5 <- hdf5r::H5File$new(path)

it was able to make it past this point.

Unfortunately, it threw the same error once read_h5_ was called, failing at line 76 as follows:

Browse[3]> n
debug: out <- map(colnames, ~.read_h5_vec(data[[.]])) %>% data.frame(check.names = FALSE, 
    stringsAsFactors = FALSE)
Browse[3]> n
Error: Error during trajectory inference 
HDF5-API Errors:
    error #000: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5Tget_native_type(): line 119: cannot retrieve native type
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #001: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5T_get_native_type(): line 400: cannot get member value
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #002: C:\pkg\hdf5-1.8.14\src\H5T.c in H5T_convert(): line 4816: data type conversion failed
        class: HDF5
        major: Attribute
        minor: Unable to encode value

    error #003: C:\pkg\hdf5-1.8.14\src\H5Tconv.c in H5T__conv_i_i(): line 3639: can't find property list for ID
        class: HDF5
        major: Object atom
        minor: Unable to find atom information (already closed?)

    error #004: C:\pkg\hdf5-1.8.14\src\H5Pint.c in H5P_object_verify(): line 3381: property list is no

I might keep working at it if I can, but I thought I'd provide a partial update before it got too late on that side of the world 🌝.

zouter commented 5 years ago

Wow, great debugging so far! 🥇

I'm just wondering why it says C:\pkg\hdf5-1.8.14 even though you have installed a more recent version of HDF5. I think this is the version of HDF5 that is installed by the hdf5r library (which is indeed 1.8.14 -> https://github.com/mannau/h5-libwin ). There is still an issue open for updating the Windows version that is installed with CRAN hdf5r: hhoeflin/hdf5r#60 .

This will probably be solved if you install hdf5r from source (e.g. by doing devtools::install_github("hhoeflin/hdf5r")), because this won't use the pre-built binaries but rather the ones installed on your system. This could be a temporary workaround for you until we find a more permanent solution 😉

PS: It's never too late on this side of the world to answer issues 😃

bjreisman commented 5 years ago

I think you're right.

> hdf5r::h5version()
hdf5r version 1.1.1 with C-library HDF5 Version  1.8.14 
[1] "1.8.14"

While trying to troubleshoot this I noticed there's another package called rhdf5, which does seem to use the newer versions:

> rhdf5::h5version()
This is Bioconductor rhdf5 2.27.15 linking to C-library HDF5 1.10.3

I tried installing the development version of hdf5r via devtools::install_github() as well as by downloading it and trying to install it locally, but it still insisted on connecting to the 1.8.14 version. I suspect there is a way to get it to run a different version, but it looks like it would be very involved (altering the config.win file).
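
A minimal sketch of how the linked HDF5 version could be compared against a target version. The helper name hdf5_is_older_than is hypothetical; the version strings are the ones mentioned in this thread, and hdf5r::h5version() is called the same way as above.

```r
# Hypothetical helper (not part of hdf5r or dynutils): TRUE when the
# linked HDF5 C-library version is older than `minimum`.
hdf5_is_older_than <- function(linked, minimum) {
  utils::compareVersion(linked, minimum) < 0
}

# Version strings taken from this thread.
bundled_is_old <- hdf5_is_older_than("1.8.14", "1.8.21")
system_is_old <- hdf5_is_older_than("1.10.5", "1.8.21")
print(c(bundled = bundled_is_old, system = system_is_old))

# With hdf5r installed, the version it actually links against can be checked.
if (requireNamespace("hdf5r", quietly = TRUE)) {
  linked <- hdf5r::h5version()
  message("hdf5r links against an HDF5 older than 1.8.21: ",
          hdf5_is_older_than(linked, "1.8.21"))
}
```

Running this after each reinstall attempt would show whether hdf5r picked up the system library or is still using the bundled one.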

Do you think this is an issue with Windows installations in general or something about my set-up specifically? I've tried it on two computers, but both were running Windows 10.

Before I started messing with HDF5, I also dug a bit deeper into the read_h5 function. In debug mode hdf5r::H5File$new(path, "r") actually works just fine. The error seems to be thrown at line 60, as follows:

Browse[3]> map(nms, ~read_h5_(subfile[[.]]))
Error in standalone_H5D_get_type(h5d_id = self$id, native = TRUE) : 
  HDF5-API Errors:
    error #000: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5Tget_native_type(): line 119: cannot retrieve native type
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #001: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5T_get_native_type(): line 400: cannot get member value
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #002: C:\pkg\hdf5-1.8.14\src\H5T.c in H5T_convert(): line 4816: data type conversion failed
        class: HDF5
        major: Attribute
        minor: Unable to encode value

    error #003: C:\pkg\hdf5-1.8.14\src\H5Tconv.c in H5T__conv_i_i(): line 3639: can't find property list for ID
        class: HDF5
        major: Object atom
        minor: Unable to find atom information (already closed?)

    error #004: C:\pkg\hdf5-1.8.14\src\H5Pint.c in H5P_object_verify(): line 3381: property list is not a member of the class
        class

If I go through and call read_h5_(subfile[[nms[i]]]) for each i = 1:16, it fails on nms[5], nms[6], and nms[10], corresponding to "milestone_network", "divergence_regions", and "directed".
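
The element-by-element check described above can be sketched with tryCatch, so that every failing name is collected instead of stopping at the first error. The helper find_failing and the stand-in fake_reader are hypothetical and only illustrate the pattern; they are not part of dynutils.

```r
# Hypothetical helper: apply `reader` to each name and return the names
# for which it raises an error.
find_failing <- function(nms, reader) {
  failed <- vapply(nms, function(nm) {
    tryCatch({
      reader(nm)
      FALSE                      # read succeeded
    }, error = function(e) TRUE) # read failed
  }, logical(1))
  nms[unname(failed)]
}

# Stand-in reader for illustration only: it errors on two made-up names.
fake_reader <- function(nm) {
  if (nm %in% c("b", "d")) stop("cannot retrieve native type")
  nm
}

failing <- find_failing(c("a", "b", "c", "d"), fake_reader)
print(failing)
```

Inside the debugger session from this comment, the reader would be something like function(nm) read_h5_(subfile[[nm]]), assuming read_h5_ and subfile are in scope there.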

On a related note, I saw you added a test_h5_installation() function. I tried it out and it passed 😃🙃

> dynutils::test_h5_installation(detailed = TRUE)
✔ HDF5 files can be written
✔ HDF5 files can be read
✔ An R object that is written and read with HDF5 is the same
✔ HDF5 test successful -------------------------------------------------------------------
[1] TRUE

zouter commented 5 years ago

Hmmmm yeah, I think this is a general problem with Windows and hdf5r... Everyone using hdf5r on Windows is going to have this problem. We're trying to find a solution, which will probably involve updating the version in hdf5r. I don't have a lot of time today/tomorrow though, perhaps @rcannood can help.

And thanks for testing the test_h5_installation(), it clearly doesn't test enough 🤣

rcannood commented 5 years ago

Got round to installing Win10 again; at the very least I can already replicate the problem. I'll work on finding a solution.


Error: Error during trajectory inference 
HDF5-API Errors:
    error #000: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5Tget_native_type(): line 119: cannot retrieve native type
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #001: C:\pkg\hdf5-1.8.14\src\H5Tnative.c in H5T_get_native_type(): line 400: cannot get member value
        class: HDF5
        major: Invalid arguments to routine
        minor: Inappropriate type

    error #002: C:\pkg\hdf5-1.8.14\src\H5T.c in H5T_convert(): line 4816: data type conversion failed
        class: HDF5
        major: Attribute
        minor: Unable to encode value

    error #003: C:\pkg\hdf5-1.8.14\src\H5Tconv.c in H5T__conv_i_i(): line 3639: can't find property list for ID
        class: HDF5
        major: Object atom
        minor: Unable to find atom information (already closed?)

    error #004: C:\pkg\hdf5-1.8.14\src\H5Pint.c in H5P_object_verify(): line 3381: property list is no

zouter commented 5 years ago

@rcannood Looking at the excellent debugging of Benjamin, I think this has something to do with booleans being saved by hdf5, given that these are present inside the "milestone_network", "divergence_regions" and "directed"

rcannood commented 5 years ago

👍 I can confirm this as well. I can read/write a trajectory to files without a problem, but when they're produced by HDF5 1.10 I run into problems. A good solution would be to have h5-libwin update itself. For now, I'll try to find a workaround for writing h5 files with 1.10 that 1.8 is able to read. The downside is that all the containers will have to be updated.
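
The thread does not spell out what the workaround looks like. Purely as an illustration, and assuming the boolean hunch above is right, one conceivable approach is to store logical columns as integers before writing and convert them back after reading. The helper names and the small example data frame below are hypothetical and not taken from dynutils.

```r
# Hypothetical helpers: convert logical columns to integer before writing,
# remembering which columns were logical so they can be restored later.
encode_logical_cols <- function(df) {
  is_lgl <- vapply(df, is.logical, logical(1))
  if (any(is_lgl)) {
    df[is_lgl] <- lapply(df[is_lgl], as.integer)
  }
  list(data = df, logical_cols = names(df)[is_lgl])
}

decode_logical_cols <- function(enc) {
  df <- enc$data
  if (length(enc$logical_cols) > 0) {
    df[enc$logical_cols] <- lapply(df[enc$logical_cols], as.logical)
  }
  df
}

# Made-up example data with one boolean column.
network <- data.frame(
  from = c("A", "B"),
  to = c("B", "C"),
  directed = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

encoded <- encode_logical_cols(network)
restored <- decode_logical_cols(encoded)
str(encoded$data)
str(restored)
```

The trade-off of such a scheme is that the list of logical columns has to be stored alongside the data, since the integer column alone no longer records that it was a boolean.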

rcannood commented 5 years ago

@bjreisman I implemented a workaround for now (I hope that h5-libwin will still be updated). All of the containers will need to be rebuilt, however. Before I start the rebuild procedure, could you verify that this version of slingshot works for you as well?

library(dyno)
data("fibroblast_reprogramming_treutlein")
model <- infer_trajectory(fibroblast_reprogramming_treutlein, "dynverse/ti_slingshot:dynwrapv2")
plot_graph(model)

bjreisman commented 5 years ago

@rcannood Thank you for taking the time to troubleshoot this, I can confirm that worked!

> library(dyno)
> data("fibroblast_reprogramming_treutlein")
> model <- infer_trajectory(fibroblast_reprogramming_treutlein, "dynverse/ti_slingshot:dynwrapv2")
> plot_graph(model)
Coloring by milestone
Using milestone_percentages from trajectory

[Image: output of plot_graph(model)]

zouter commented 5 years ago

Great! Feel free to open another issue if you have any other... issues 😁

rcannood commented 5 years ago

Just to follow up on this issue: all of the containers have been rebuilt and the workaround has been merged into the dynmethods and dyno packages. If you reinstall dyno (and make sure to upgrade all the dependencies as well), you should be able to run each of the methods now.

bjreisman commented 5 years ago

Thank you, I can see that took a lot of work! The examples are running great on my machine. 🏆

I'm running into another issue with PAGA which I've posted here, but it looks to be an easier fix😄 https://github.com/dynverse/ti_paga/issues/3