r3fang / SnapATAC

Analysis Pipeline for Single Cell ATAC-seq
GNU General Public License v3.0

HDF5 read fails #96

Open dawe opened 5 years ago

dawe commented 5 years ago

I'm trying to use SnapATAC on our (pretty old?) compute platform. First, note that the platform does not support HDF5 file locking because of its Lustre implementation. This has not been a problem in the past, as I usually export the environment variable

$ export HDF5_USE_FILE_LOCKING=FALSE

This works when dealing with HDF5 files from scanpy and even snaptools (i.e. I'm able to read/write snap files in Python). Whenever I try to initialize an object with SnapATAC, however, I get this:

x.sp = createSnap(file = file.list, sample=sample.list)
Epoch: reading the barcode session ...
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in data.frame(barcode, TN, UM, PP, UQ, CM) : 
  arguments imply differing number of rows: 0, 19998

To make SnapATAC work, I have to import the snap files on another system first and copy the saved RDS object back to the cluster. I can process data up to the clustering step, but later steps are not possible (e.g. running MACS doesn't work because it requires access to the original snap file from R). The whole thing may be related to rhdf5 and Rhdf5lib, yet I would like to debug this from SnapATAC.
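One thing that may be worth ruling out first is whether the variable actually reaches the R process: a variable that is set but not exported is invisible to child processes. The following is only a minimal sketch of such a check, not a fix. The helper name `inherited_locking_value` is hypothetical, and the R step runs only if `Rscript` happens to be on the PATH.

```shell
#!/bin/sh
# Print the HDF5_USE_FILE_LOCKING value that a child process would inherit.
# A variable that is set but not exported shows up here as "<unset>".
# (inherited_locking_value is a hypothetical helper name for this sketch.)
inherited_locking_value() {
    sh -c 'printf "%s\n" "${HDF5_USE_FILE_LOCKING:-<unset>}"'
}

export HDF5_USE_FILE_LOCKING=FALSE
echo "child shell sees: $(inherited_locking_value)"

# If Rscript is available, ask R for the same value (Sys.getenv is base R).
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e 'cat("R sees:", Sys.getenv("HDF5_USE_FILE_LOCKING", unset = "<unset>"), "\n")'
fi
```

If both lines report `FALSE`, the environment is being passed through. Whether the HDF5 build that R links against honours the variable is a separate question that this check does not answer.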

In fact, rhdf5 seems able to list the file contents properly:

> h5ls(file.list[1])
       group        name       otype  dclass      dim
0          /          AM   H5I_GROUP                 
1        /AM       10000   H5I_GROUP                 
2  /AM/10000    binChrom H5I_DATASET  STRING   321184
3  /AM/10000    binStart H5I_DATASET INTEGER   321184
4  /AM/10000       count H5I_DATASET INTEGER 54488036
5  /AM/10000         idx H5I_DATASET INTEGER 54488036
6  /AM/10000         idy H5I_DATASET INTEGER 54488036
7        /AM        5000   H5I_GROUP                 
8   /AM/5000    binChrom H5I_DATASET  STRING   642098
9   /AM/5000    binStart H5I_DATASET INTEGER   642098
10  /AM/5000       count H5I_DATASET INTEGER 58147170
11  /AM/5000         idx H5I_DATASET INTEGER 58147170
12  /AM/5000         idy H5I_DATASET INTEGER 58147170
13       /AM binSizeList H5I_DATASET INTEGER        2
14       /AM    nBinSize H5I_DATASET INTEGER    ( 0 )
15         /          BD   H5I_GROUP                 
16       /BD          CM H5I_DATASET INTEGER    19998
17       /BD          PE H5I_DATASET INTEGER    19998
18       /BD          PL H5I_DATASET INTEGER    19998
19       /BD          PP H5I_DATASET INTEGER    19998
20       /BD          SA H5I_DATASET INTEGER    19998
21       /BD          SE H5I_DATASET INTEGER    19998
22       /BD          TN H5I_DATASET INTEGER    19998
23       /BD          UM H5I_DATASET INTEGER    19998
24       /BD          UQ H5I_DATASET INTEGER    19998
25       /BD          US H5I_DATASET INTEGER    19998
26       /BD        name H5I_DATASET  STRING    19998
27         /          FM   H5I_GROUP                 
28       /FM  barcodeLen H5I_DATASET INTEGER    19998
29       /FM  barcodePos H5I_DATASET INTEGER    19998
30       /FM   fragChrom H5I_DATASET  STRING 67423887
31       /FM     fragLen H5I_DATASET INTEGER 67423887
32       /FM   fragStart H5I_DATASET INTEGER 67423887
33         /          HD   H5I_GROUP                 
34       /HD          AL   H5I_GROUP                 
35    /HD/AL          CL H5I_DATASET  STRING    ( 0 )
36    /HD/AL          ID H5I_DATASET  STRING    ( 0 )
37    /HD/AL          PN H5I_DATASET  STRING    ( 0 )
38    /HD/AL          VN H5I_DATASET  STRING    ( 0 )
39       /HD          CL H5I_DATASET  STRING    ( 0 )
40       /HD          CW H5I_DATASET  STRING    ( 0 )
41       /HD          DT H5I_DATASET  STRING    ( 0 )
42       /HD          MG H5I_DATASET  STRING    ( 0 )
43       /HD          SQ   H5I_GROUP                 
44    /HD/SQ          ID H5I_DATASET  STRING    ( 0 )
45    /HD/SQ          SL H5I_DATASET INTEGER      455
46    /HD/SQ          SN H5I_DATASET  STRING      455
47       /HD          VN H5I_DATASET  STRING    ( 0 )

and

> h5f = H5Fopen(file.list[1])
> h5f   
HDF5 FILE 
        name /
    filename 

  name     otype dclass dim
0   AM H5I_GROUP           
1   BD H5I_GROUP           
2   FM H5I_GROUP           
3   HD H5I_GROUP          

but then reading the data fails:

> h5f$AM
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  : 
  HDF5. Dataset. Read failed.
$`10000`
$`10000`$binChrom
NULL

$`10000`$binStart
NULL

$`10000`$count
NULL

$`10000`$idx
NULL

$`10000`$idy
NULL

$`5000`
$`5000`$binChrom
NULL

$`5000`$binStart
NULL

$`5000`$count
NULL

$`5000`$idx
NULL

$`5000`$idy
NULL

$binSizeList
[1]  5000 10000

$nBinSize
[1] 2

r3fang commented 4 years ago

Maybe this closed issue will help: https://github.com/r3fang/SnapATAC/issues/114