Closed cgirardot closed 2 years ago
I am unable to replicate this error with the test.mcool in open2c_examples, using a conda environment built from the open2c environment.yml. Can you provide some more details on how your cooler was created, etc.?
Hi, I used HiCExplorer (3.6) to build the Hi-C matrices for multiple technical replicates (i.e. lanes) and 2 biological replicates. Here, all replicates (biological and technical) are merged into a single matrix (summed up). I then kept only the main chromosomes and removed the inter-chromosomal contacts using hicAdjustMatrix. Finally, the conversion to cool uses hicConvertFormat.
I can share a matrix if it helps
thx
can you double check the dtypes of the converted matrix and print what cooler.info() gives? I'm wondering if hicConvertFormat casts to floats.
I am not sure how to check the dtypes. How can I check?
Also, I think I was wrong about the steps: due to this bug I had to prune the matrix myself using this piece of code:
hicConvertFormat MY_H5 -> MY_COOL
cooler dump --join MY_COOL | awk '$1==$4 {print $0}' | gzip > MY_BEDPE
cooler load --assembly dm6 -f bg2 chr_file:5000 MY_BEDPE MY_TRIMMED_COOL
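For readers who prefer to do the cis-only filtering in Python, here is a minimal sketch of the same `$1==$4` test with pandas. It assumes the seven-column layout that `cooler dump --join` writes (chrom1, start1, end1, chrom2, start2, end2, count); the column names, the `keep_cis_contacts` helper and the two example rows are made up for illustration and are not taken from the actual data.

```python
import io

import pandas as pd

# Assumed column layout of `cooler dump --join` output (bg2-style).
BG2_COLUMNS = ["chrom1", "start1", "end1", "chrom2", "start2", "end2", "count"]


def keep_cis_contacts(bg2: pd.DataFrame) -> pd.DataFrame:
    """Keep rows whose two ends lie on the same chromosome (awk's $1==$4)."""
    return bg2[bg2["chrom1"] == bg2["chrom2"]].reset_index(drop=True)


# Tiny made-up table standing in for the dumped pixels.
example = io.StringIO(
    "chr2L\t0\t5000\tchr2L\t5000\t10000\t4\n"
    "chr2L\t0\t5000\tchr2R\t0\t5000\t1\n"
)
pixels = pd.read_csv(example, sep="\t", names=BG2_COLUMNS)
cis = keep_cis_contacts(pixels)
print(cis)
print(cis.dtypes)
```

Printing the dtypes at this stage is a cheap way to see whether the counts are already floats before they are loaded back into a cooler.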
cooler info:
{
"bin-size": null,
"bin-type": "variable",
"creation-date": "2022-03-24T12:14:39.597384",
"format": "HDF5::Cooler",
"format-url": "https://github.com/mirnylab/cooler",
"format-version": 3,
"generated-by": "HiCMatrix-16",
"generated-by-cooler-lib": "cooler-0.8.11",
"genome-assembly": "unknown",
"nbins": 25148,
"nchroms": 6,
"nnz": 26246396,
"storage-mode": "symmetric-upper",
"sum": 377180269.0,
"tool-url": "https://github.com/deeptools/HiCMatrix"
}
you can inspect the individual pixels in the cooler if you load it interactively
import cooler
clr = cooler.Cooler('mycooler.cool')
then
(clr.pixels()[:3]).dtypes
thx for the piece of code. The count slot is indeed float64, which I guess comes from the previous manipulation steps.
I am pretty sure these numbers are int. Is there a way I can check that my numbers are all integers and then re-save this cool file with the right type?
Thank you
The typical way to cast a column with pandas DataFrames is astype: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.astype.html
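A minimal sketch of the check-then-cast step asked about above, assuming the pixel table has a `count` column as in this thread. The helper names (`counts_are_integral`, `cast_counts_to_int`, `resave_with_int_counts`) are hypothetical. The re-save step assumes `cooler.create_cooler` accepts a `dtypes` mapping and that all pixels fit in memory; it has not been tested on the file discussed here.

```python
import numpy as np
import pandas as pd


def counts_are_integral(pixels: pd.DataFrame, column: str = "count") -> bool:
    """Return True if every value in `column` is finite with no fractional part."""
    values = pixels[column].to_numpy(dtype="float64")
    return bool(np.all(np.isfinite(values)) and np.all(values == np.round(values)))


def cast_counts_to_int(pixels: pd.DataFrame, column: str = "count") -> pd.DataFrame:
    """Return a copy with `column` cast to int32, refusing to truncate real floats."""
    if not counts_are_integral(pixels, column):
        raise ValueError(f"{column!r} holds non-integer values; not casting")
    return pixels.astype({column: "int32"})


def resave_with_int_counts(in_uri: str, out_uri: str) -> None:
    """Hypothetical helper: write a copy of a cooler with integer counts."""
    import cooler  # imported lazily so the helpers above work without it

    clr = cooler.Cooler(in_uri)
    bins = clr.bins()[:]  # full bin table as a DataFrame
    pixels = cast_counts_to_int(clr.pixels()[:])  # full pixel table, counts as int32
    cooler.create_cooler(out_uri, bins=bins, pixels=pixels, dtypes={"count": "int32"})
```

The cast is refused when any count has a fractional part, so a matrix that was normalized or corrected upstream would raise an error instead of being silently truncated.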
Closing this for now, but feel free to re-open (though this may be better addressed in the HiCExplorer issues, as this looks like an issue with hicConvertFormat).
I am trying to :
but I am getting:
I am not sure why my cool file has float counts (it is raw data, so these numbers should be integers).
Thank you