aidenlab / straw

Extract data quickly from Juicebox via straw
MIT License

Segmentation fault with plot_hic_map #131

Open AlcaArctica opened 8 months ago

AlcaArctica commented 8 months ago

Hi, I have been trying out straw to plot my Hi-C map. I would like to get an image similar to the one I see when I load the .hic file (obtained with yahs + juicer pre) into Juicebox.

My Python script is adapted from the example provided in the Colab notebook: https://colab.research.google.com/drive/1-GG-n-p9nZ7Be82UVJG7n3Q_wQ9IeuFN?usp=sharing#scrollTo=Fg82rZuafiz_

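import hicstraw
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

# load the .hic file into an object
hic = hicstraw.HiCFile("./out_JBAT.hic")

# get the matrix object for the whole assembly: raw observed counts, no normalization, 5000 bp bins
matrix_object = hic.getMatrixZoomData('assembly', 'assembly', "observed", "NONE", "BP", 5000)
# fetch the first 1 Mb x 1 Mb region as a dense numpy matrix
numpy_matrix_object = matrix_object.getRecordsAsMatrix(0, 1000000, 0, 1000000)

# plot the matrix with a white-to-red colormap; counts at or above maxcolor are drawn fully red
REDMAP = LinearSegmentedColormap.from_list("bright_red", [(1,1,1),(1,0,0)])
def plot_hic_map(dense_matrix, maxcolor):
    plt.matshow(dense_matrix, cmap=REDMAP, vmin=0, vmax=maxcolor)
    plt.show()
plot_hic_map(numpy_matrix_object, 30)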

This works fine when I only plot a small section, but I get the error Segmentation fault (core dumped) whenever I try to use the complete length of my assembly (1237179085 bp) in the matrix_object.getRecordsAsMatrix function. (I want the zoomed-out all-by-all view that Juicebox uses for visualization.) I have a large computer cluster available, so computational resources should be sufficient. I cannot reduce the resolution, as I am already supplying the hic.getMatrixZoomData function with the lowest resolution available, which is 5000. Any ideas?
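
For scale, here is a rough estimate of the memory the full-length request implies. It is only a sketch under two assumptions that are not confirmed from the straw source: that getRecordsAsMatrix builds a dense array with one cell per bin pair, and that each cell takes 4 bytes. The helper name dense_matrix_bytes is hypothetical, and the coarser bin sizes are shown for comparison only; they may not exist in this particular .hic file.

```python
import math

def dense_matrix_bytes(span_bp, bin_size_bp, bytes_per_cell):
    """Estimate the size of a dense square contact matrix.

    Assumes one cell per (row bin, column bin) pair; bytes_per_cell is an
    assumption about the element type, not a value taken from straw.
    """
    bins = math.ceil(span_bp / bin_size_bp)
    return bins, bins * bins * bytes_per_cell

assembly_length = 1_237_179_085  # bp, from the report above

# 5000 bp is the bin size used above; the larger sizes are hypothetical.
for bin_size in (5_000, 50_000, 500_000):
    bins, nbytes = dense_matrix_bytes(assembly_length, bin_size, 4)
    print(f"{bin_size:>7} bp bins: {bins} x {bins} cells, about {nbytes / 1e9:,.1f} GB")
```

At 5000 bp bins the full assembly spans 247436 bins per axis, so a dense all-by-all matrix would need roughly 245 GB at 4 bytes per cell (about twice that at 8 bytes). If those assumptions hold, a failed allocation of that size would be one plausible cause of the crash, and the request would need either a coarser bin size or region-by-region fetching.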