Just making a comment, as of hic-straw 1.1.0 there is a new format to access HiC file data which is still being used in Mustache. To make it future proof you can switch to this newer version which just slightly modifies a few lines of code here: https://github.com/ay-lab/mustache/blob/master/mustache/mustache.py
Lines 307-329 are currently:
if not CHRM_SIZE:
hic = hicstraw.HiCFile(f)
chromosomes = hic.getChromosomes()
chrSize_in_bp = {}
for i in range(1, len(chromosomes)):
chrSize_in_bp["chr" + str(chromosomes[i].name).replace("chr", '')] = chromosomes[i].length
CHRM_SIZE = chrSize_in_bp["chr" + chr1.replace("chr", '')]
CHUNK_SIZE = max(2 * distance_in_bp / res, 2000)
start = 0
end = min(CHRM_SIZE, CHUNK_SIZE * res) # CHUNK_SIZE*res
result = []
val = []
while start < CHRM_SIZE:
print(int(start), int(end))
if not norm_method:
temp = hicstraw.straw("observed", "KR", f, str(chr1) + ":" + str(int(start)) + ":" + str(int(end)),
str(chr2) + ":" + str(int(start)) + ":" + str(int(end)), "BP", res)
else:
temp = hicstraw.straw("observed", str(norm_method), f,
str(chr1) + ":" + str(int(start)) + ":" + str(int(end)),
str(chr2) + ":" + str(int(start)) + ":" + str(int(end)), "BP", res)
And they can now be switched to the new hic-straw version using:
# create a hicstraw object linked to file f
hic = hicstraw.HiCFile(f)
# get chrom data if needed
if not CHRM_SIZE:
chromosomes = hic.getChromosomes()
chrSize_in_bp = {}
for i in range(1, len(chromosomes)):
chrSize_in_bp["chr" + str(chromosomes[i].name).replace("chr", '')] = chromosomes[i].length
CHRM_SIZE = chrSize_in_bp["chr" + chr1.replace("chr", '')]
CHUNK_SIZE = max(2 * distance_in_bp / res, 2000)
start = 0
end = min(CHRM_SIZE, CHUNK_SIZE * res) # CHUNK_SIZE*res
result = []
val = []
# extract count data
while start < CHRM_SIZE:
print(int(start), int(end))
if not norm_method:
mzd = hic.getMatrixZoomData(str(chr1), str(chr2), "observed", "KR", "BP", res)
temp = mzd.getRecordsAsMatrix(int(start), int(end), int(start), int(end))
else:
mzd = hic.getMatrixZoomData(str(chr1), str(chr2), "observed", str(norm_method), "BP", res)
temp = mzd.getRecordsAsMatrix(int(start), int(end), int(start), int(end))
Just making a comment, as of hic-straw 1.1.0 there is a new format to access HiC file data which is still being used in Mustache. To make it future proof you can switch to this newer version which just slightly modifies a few lines of code here: https://github.com/ay-lab/mustache/blob/master/mustache/mustache.py
Lines 307-329 are currently:
And they can now be switched to the new hic-straw version using:
For more info checkout: https://pypi.org/project/hic-straw/