koustav-pal / HiCBricks

HiCBricks offers user-friendly and efficient solutions for handling large high-resolution Hi-C datasets. The package provides an R/Bioconductor framework with the bricks to build more complex data analysis pipelines and algorithms.

problem loading mcool file #5

Closed imerelli closed 5 years ago

imerelli commented 5 years ago

Hi, I'm trying to use HiCBricks (I tried both the GitHub version and the Bioconductor version) with an mcool file converted from Juicer output using the suggested script (https://github.com/4dn-dcic/hic2cool). The conversion seems to work properly, but I get an error when running the following:

Output.brick <- CreateBrick_from_mcool(Brick = "/tmp/prova.brick", mcool = "/DATA_NFS/HI-C/monica/juicer_results/132/inter_30.mcool", remove.existing = TRUE)

Can you help me?

Minor issues:
1) tmp.dir() should be tempdir() (at least in my R 3.5.1 version)
2) The function Brick_list_mcool_resolutions does not exist

koustav-pal commented 5 years ago

Hi @imerelli,

Sorry for the delayed response.

Can you provide the complete error message?

Also, can you please provide the version of R and HiCBricks you used to get the error?

There are many branches of HiCBricks, so it would help to know which branch contains the error.

imerelli commented 5 years ago

Hi, I tried both the Bioconductor version of the package and the latest GitHub version (main branch). Here is the full error:

Output.brick <- CreateBrick_from_mcool(Brick = "/tmp/prova.brick", mcool = "/DATA_NFS/HI-C/monica/juicer_results/132/inter_30.mcool", remove.existing = TRUE)
Error in ReturnH5Attribute(Handle = Brick.handler, name = An.attribute,  :
  Attribute format-versionnot found in HDF file.

Here is the information about my R session:
R version 3.5.1 (2018-07-02)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

HiCBricks_1.1.21 R6_2.4.0 rhdf5_2.26.2

If needed, I can provide the file that raises the error.

koustav-pal commented 5 years ago

Hi @imerelli,

Can you please provide the mcool file that is raising the error?

Also, can you please confirm that the version being used is the github release version of HiCBricks?

koustav-pal commented 5 years ago

Hi @imerelli,

I installed the hic2cool utility and tried to convert a .hic file to cool format.

The bug is not in this package, but in the hic2cool converter.

I tried to use this file: https://data.4dnucleome.org/files-processed/4DNFIH3OTR14/

After converting it to cool format, I found that the converter does not write several of the attributes that the cooler schema (versions 2 and 3) defines as required.

format; Length: 1; value: HDF5::Cooler
Error in ReturnH5Attribute(Handle = Brick.handler, name = An.attribute,  :
  Attribute format-versionnot found in HDF file.

format-version; Length: 1; value: Error in ReturnH5Attribute(Handle = Brick.handler, name = An.attribute,  :
  Attribute format-versionnot found in HDF file.

bin-type; Length: 1; value: fixed
bin-size; Length: 1; value: 50000
Error in ReturnH5Attribute(Handle = Brick.handler, name = An.attribute,  :
  Attribute storage-modenot found in HDF file.

storage-mode; Length: 1; value: Error in ReturnH5Attribute(Handle = Brick.handler, name = An.attribute,  :
  Attribute storage-modenot found in HDF file.

Please note that, as it stands, there is no bug in HiCBricks, since we adhere to the cooler specification. This is a bug in the hic2cool package.
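The check that fails here can be sketched in a few lines of Python. This is only an illustration, not the HiCBricks implementation: the function name `missing_cooler_attrs` is hypothetical, and treating the five attribute names seen in the log above as the complete required set is an assumption.

```python
from typing import Any, List, Mapping

# Attribute names taken from the log above. Treating them as the full
# required set is an assumption of this sketch.
REQUIRED_ATTRS = ("format", "format-version", "bin-type", "bin-size", "storage-mode")


def missing_cooler_attrs(attrs: Mapping[str, Any]) -> List[str]:
    """Return the required attribute names that are absent from `attrs`."""
    return [name for name in REQUIRED_ATTRS if name not in attrs]


# Attributes as reported in the log above: three present, two absent.
reported = {"format": "HDF5::Cooler", "bin-type": "fixed", "bin-size": 50000}
print(missing_cooler_attrs(reported))  # ['format-version', 'storage-mode']
```

The function accepts any mapping, so if h5py is installed, the `attrs` object of an opened file or group could presumably be passed in directly instead of a plain dict.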

Based on your feedback and requirements, I will implement a fix on another branch that lets you read cooler files without the cooler version sanity check. I will also open an issue on the hic2cool page myself.

koustav-pal commented 5 years ago

Hi @imerelli,

The issue has been fixed in the hic2cool package. Please uninstall and reinstall version 0.6.0.

This should fix the bug.

Let me know if this works.

imerelli commented 5 years ago

Hi, to be honest, the new version is not working for me. Here is the error (the same as before):

$ hic2cool -v
hic2cool 0.6.0
$ hic2cool convert inter_30.hic inter_30.mcool
##########################
### hic2cool / convert ###
##########################
### Header info from hic
... Chromosomes:  [u'ALL', u'1', u'2', u'3', u'4', u'5', u'6', u'7', u'X', u'8', u'9', u'10', u'11', u'12', u'13', u'14', u'15', u'16', u'17', u'18', u'20', u'Y', u'19', u'22', u'21', u'3020']
... Resolutions:  [2500000, 1000000, 500000, 250000, 100000, 50000, 25000, 10000, 5000]
... Normalizations:  [u'VC', u'VC_SQRT', u'KR']
... Genome:  /gpfs/scratch/userexternal/imerelli/opt/juicer/references/hg19custom3020.chrom.sizes
### Converting
... Resolution 2500000 took: 32.4594581127 seconds.
... Resolution 1000000 took: 35.347235918 seconds.
... Resolution 500000 took: 76.4222400188 seconds.
... Resolution 250000 took: 160.234127045 seconds.
... Resolution 100000 took: 242.761617899 seconds.
... Resolution 50000 took: 341.699584007 seconds.
... Resolution 25000 took: 385.375012875 seconds.
... Resolution 10000 took: 451.892506123 seconds.
... Resolution 5000 took: 566.024144173 seconds.
### Finished! Output written to: inter_30.mcool
... This file is higlass compatible.
$ R

R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

> library("HiCBricks")
Loading required package: curl
Loading required package: rhdf5
Loading required package: R6
Loading required package: grid
> Output.brick <- CreateBrick_from_mcool(Brick = "/tmp/prova.brick",mcool = "/DATA_NFS/HI-C/monica/juicer_results/132/inter_30.mcool", remove.existing = TRUE)
Error in ReturnH5Attribute(Handle = Brick.handler, name = An.attribute,  : 
  Attribute format-versionnot found in HDF file.
>

koustav-pal commented 5 years ago

Hi @imerelli,

On my side, it is working ok. Without a common test dataset, it is hard to diagnose the problem.

Can you please try to run the same steps on the file listed here?

https://data.4dnucleome.org/files-processed/4DNFIH3OTR14/@@download/4DNFIH3OTR14.hic

Furthermore, to cut down on time, please run this bit of code from a Python terminal:

from hic2cool import hic2cool_convert
hic2cool_convert("4DNFIH3OTR14.hic", "4DNFIH3OTR14.cool", 100000, True, False)

imerelli commented 5 years ago

Hi, after some tests, the problem appears only when converting with all resolutions selected. With a single resolution, everything works fine. But if I pass 0 as the resolution in the call above (= all resolutions), the output file can't be loaded in HiCBricks.
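Given that single-resolution conversion works, one possible stopgap is to convert each resolution into its own .cool file. The sketch below only illustrates that idea. The helper `convert_each_resolution` and the output naming scheme are hypothetical, and the converter is passed in as a callable so the loop does not depend on any particular hic2cool signature.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def convert_each_resolution(
    hic_path: str,
    resolutions: Iterable[int],
    convert: Callable[[str, str, int], None],
) -> Dict[int, str]:
    """Call `convert` once per resolution and return {resolution: output path}."""
    stem = Path(hic_path).with_suffix("")  # "inter_30.hic" -> "inter_30"
    outputs: Dict[int, str] = {}
    for res in resolutions:
        out_path = f"{stem}_{res}.cool"  # hypothetical naming scheme
        convert(hic_path, out_path, res)
        outputs[res] = out_path
    return outputs


# With hic2cool installed, `convert` could wrap the call used earlier in
# this thread, keeping its positional arguments unchanged:
#   from hic2cool import hic2cool_convert
#   convert = lambda src, dst, res: hic2cool_convert(src, dst, res, True, False)
```

Each resulting file holds a single resolution, which is the case reported to load correctly in HiCBricks. Whether this is acceptable depends on whether a single multi-resolution file is actually required downstream.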

koustav-pal commented 5 years ago

Hi @imerelli,

Thank you for your patience and collaboration on this issue.

I downloaded an mcool file from the 4DN data portal, and importing that file worked fine. I also went back to the cooler specification and found that my package interprets the schema correctly, which means that the fix implemented in hic2cool v0.6.0 is incomplete.

Therefore, I have notified the hic2cool developers and hope to have a fix for your issue by the end of the day.

koustav-pal commented 5 years ago

Closing due to user inactivity.