bartongroup / RATS

Relative Abundance of Transcripts: An R package for the detection of Differential Transcript isoform Usage.
MIT License

problem parsing kallisto abundance.h5 #56

Closed: fruce-ki closed this issue 6 years ago

fruce-ki commented 6 years ago

This issue seems to be unrelated to #55.

EDIT: If any users out there experience this error, please let us know. As is, I am unable to pinpoint the cause of it. I only know it is not caused by RATs. /EDIT

h5read('./kallisto_quant/Hs_GRCh37.67.1/abundance.h5', '/aux/ids/') succeeds on my Mac but fails on both the login node (ningal) and the HPC nodes.
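
For anyone who hits the same failure and wants to report it, a minimal sketch of a probe that captures the error text instead of aborting the session may help. The helper name `probe_h5_dataset` is hypothetical and not part of RATs or rhdf5; it only wraps the `h5read()` call shown above and assumes rhdf5 is installed.

```r
# Hypothetical helper (not part of RATs or rhdf5): try to read one dataset
# from an HDF5 file and return either the value or the error message.
probe_h5_dataset <- function(h5_path, dataset = "/aux/ids") {
  if (!requireNamespace("rhdf5", quietly = TRUE)) {
    stop("rhdf5 is not installed")
  }
  if (!file.exists(h5_path)) {
    stop("File not found: ", h5_path)
  }
  tryCatch(
    # On success, return the dataset contents.
    list(ok = TRUE, value = rhdf5::h5read(h5_path, dataset), error = NULL),
    # On failure, keep the message so it can be pasted into a bug report.
    error = function(e) list(ok = FALSE, value = NULL, error = conditionMessage(e))
  )
}
```

Calling `probe_h5_dataset('./kallisto_quant/Hs_GRCh37.67.1/abundance.h5')` on each machine and posting the `error` field together with `sessionInfo()` would give a like-for-like comparison across machines.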

Mac session:

R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rhdf5_2.22.0        data.table_1.10.4-3

loaded via a namespace (and not attached):
[1] zlibbioc_1.24.0 compiler_3.4.1  tools_3.4.1   

Ningal (login node) session:

R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.8 (Final)

Matrix products: default
BLAS: /homes/kfroussios/local_installs/miniconda3/envs/mybasics/lib/R/lib/libRblas.so
LAPACK: /homes/kfroussios/local_installs/miniconda3/envs/mybasics/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rhdf5_2.22.0         data.table_1.10.4-3  RevoUtils_10.0.8    
[4] RevoUtilsMath_10.0.1

loaded via a namespace (and not attached):
[1] zlibbioc_1.24.0 compiler_3.4.3 

HPC session:

R version 3.4.3 (2017-11-30)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /homes/kfroussios/local_installs/miniconda3/envs/nodebasics/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rhdf5_2.22.0      data.table_1.11.0

loaded via a namespace (and not attached):
[1] zlibbioc_1.24.0 compiler_3.4.3 

fruce-ki commented 6 years ago

It does not seem to be an issue with the version of rhdf5 or data.table, as they are identical between the Mac, which works, and ningal, which does not.

The only discernible version difference is R itself: 3.4.1 on the machine that works and 3.4.3 on those that do not.

So I will have to test with the latest R.

fruce-ki commented 6 years ago

The issue is NOT reproducible on Mac OSX with R 3.4.3 and rhdf5 2.22.0. So it could be another mangled configuration on our HPC. But then why does it affect only the kallisto files and not the salmon/wasabi ones?

fruce-ki commented 6 years ago

In the meantime, the workaround is to use kallisto's h5dump subcommand to export the .h5 to plaintext, which can then be parsed manually in a loop to create the lists of tables that RATs requires.
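
A minimal sketch of the parsing step for one sample, in base R. It assumes the dump directory contains one file per bootstrap named `bs_abundance_0.tsv`, `bs_abundance_1.tsv`, and so on, each with `target_id` and `est_counts` columns; check the actual h5dump output before relying on this. The function name `read_kallisto_bootstraps` is hypothetical and is not the RATs implementation.

```r
# Hypothetical helper (not the RATs implementation): gather the plaintext
# bootstraps of one sample into a single table, one column per bootstrap.
read_kallisto_bootstraps <- function(dump_dir, value_col = "est_counts") {
  # Assumed file naming: bs_abundance_<index>.tsv
  files <- list.files(dump_dir, pattern = "^bs_abundance_[0-9]+\\.tsv$",
                      full.names = TRUE)
  if (length(files) == 0) {
    stop("No bootstrap files found in ", dump_dir)
  }
  # Order by the numeric bootstrap index, so that 10 does not sort before 2.
  idx <- as.integer(sub("^bs_abundance_([0-9]+)\\.tsv$", "\\1", basename(files)))
  files <- files[order(idx)]
  # Read every bootstrap file as a tab-separated table.
  tables <- lapply(files, function(f) read.delim(f, stringsAsFactors = FALSE))
  ids <- tables[[1]]$target_id
  # Keep only the requested column, checking that transcript order matches.
  values <- lapply(tables, function(tb) {
    stopifnot(identical(tb$target_id, ids))
    tb[[value_col]]
  })
  names(values) <- paste0("bs", sort(idx))
  data.frame(target_id = ids, values, stringsAsFactors = FALSE)
}
```

Applying this to each sample's dump directory with `lapply()` gives one table per sample, which can then be grouped into one list per condition.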

fruce-ki commented 6 years ago

The workaround has now been implemented as part of fish4rodents(). It is now possible to load the bootstrap data from kallisto's plaintext format instead of extracting from the abundance.h5 file.

This does not fix the problem, but I do not believe the problem is caused by an error in RATs, so I consider this issue closed from our perspective.