I am using HiCExplorer for downstream analysis of our Hi-C data. To view the results, I wanted to use HiGlass, which uses the multicooler format to display the data at different resolutions. HiCExplorer provides a nice utility for converting between the h5 and cooler formats. However, when trying to view the results in HiGlass, I get an import error. In brief, my script does the following:
After investigating this issue, I found that the conversion as implemented in the hicmatrix library contains a bug. In particular, when importing a dataset, HiGlass seems to read the metadata of each cooler container in the multicooler as JSON. While this works for the coarser resolutions generated with cooler zoomify, the metadata of the cool file generated with hicConvertMatrix cannot be read because it contains binary objects. A quick check with cooler attrs gives:
These values cannot be serialized when the JSON is generated, and therefore the import into HiGlass fails.
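The failure mode can be reproduced with the standard library alone. This is a minimal sketch, not the HiGlass import code: the attribute names and values below are made up for illustration, and it assumes that attributes written with numpy.string_ come back from the file as Python bytes.

```python
import json

# Hypothetical attribute dicts (keys and values are illustrative only).
# bytes values mimic attributes that were written with numpy.string_.
attrs_from_converted_file = {"format": b"HDF5::Cooler", "bin-size": 10000}
attrs_from_zoomify = {"format": "HDF5::Cooler", "bin-size": 10000}


def can_serialize(attrs):
    """Return True if the attribute dict can be dumped to JSON."""
    try:
        json.dumps(attrs)
        return True
    except TypeError:
        # json.dumps raises TypeError for values it cannot encode, such as bytes
        return False


print(can_serialize(attrs_from_zoomify))
print(can_serialize(attrs_from_converted_file))
```

The dict with plain strings serializes, while the one holding bytes does not, which matches the behaviour described above.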
A quick look at the cool.py file of the hicmatrix library reveals the source of this. On lines 364-397, the info dictionary of the new cooler file is generated, and string conversion is explicitly handled by numpy.string_. However, the hdf5 library seems unable to understand this datatype and converts it to a binary object. Replacing numpy.string_ with the native Python str function resolves this problem, and a quick check with cooler attrs gives:
I therefore propose to replace numpy.string_ with str to ensure compatibility with HiGlass.
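For files that were already written with binary attributes, a defensive alternative is to decode the values before they are serialized. The sketch below is not the hicmatrix code: the helper names and the example info dictionary are hypothetical, and it assumes the affected values are UTF-8 encoded bytes.

```python
import json


def to_native_str(value):
    """Decode bytes values to str; leave every other type untouched."""
    if isinstance(value, bytes):
        return value.decode("utf-8")  # assumption: UTF-8 encoded text
    return value


def clean_info(info):
    """Return a copy of an info dict whose bytes values are decoded to str."""
    return {key: to_native_str(val) for key, val in info.items()}


# Hypothetical info dictionary mixing bytes and native Python values.
info = {"format": b"HDF5::Cooler", "bin-type": b"fixed", "bin-size": 10000}

cleaned = clean_info(info)
print(json.dumps(cleaned))
```

Note that calling str() directly on a value that is already bytes yields its repr (including the b prefix), so decoding is the safer choice when cleaning existing metadata; using str instead of numpy.string_ at write time avoids the problem in the first place.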