Closed BenxiaHu closed 3 years ago
Hi, it doesn't seem to be very different from the .matrix file format generated by HiC-Pro. A simple awk/Python script should do the job. Best
Here is the result from the 10kb normalized .matrix file:
So the first 2 columns are multiplied by 10kb, right? However, how do I determine which rows come from which chromosomes? Best,
Ah sorry, I just understood. No. This file is in a triplet sparse format with three columns, i, j and k: i and j are the indices in the matrix, and k is the count.
To map i and j to genome coordinates, use the BED file that comes with the matrix. Usually the BED file is stored only with the raw data; the normalized and raw matrices share the same coordinates.
Thanks. Here is the BED file, and a screenshot of the matrix file:
The number of rows in the BED file is different from that in the matrix file. I am a little confused about how to define the genome coordinates for the matrix file based on the BED file generated by HiC-Pro.
In the BED file you have: chr / start / end / BIN_ID
In the matrix file you have: BIN_ID / BIN_ID / counts
Is that better ?
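The lookup described above can be sketched in a few lines of Python. This is only an illustration, not code from HiC-Pro: the function names `load_bins` and `annotate_matrix` are hypothetical, the toy data is made up, and both files are assumed to be whitespace-separated with the column layout given above. It also shows why the row counts differ: the BED file has one row per bin, while the matrix file has one row per bin pair with a recorded count.

```python
import io

def load_bins(bed_handle):
    """Map BIN_ID -> (chrom, start, end) from a 4-column BED file."""
    bins = {}
    for line in bed_handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom, start, end, bin_id = fields[:4]
        bins[int(bin_id)] = (chrom, int(start), int(end))
    return bins

def annotate_matrix(matrix_handle, bins):
    """Yield (chrom1, start1, end1, chrom2, start2, end2, count) per matrix row."""
    for line in matrix_handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        i, j, count = fields[:3]
        # Each BIN_ID is looked up in the BED table; no multiplication needed.
        yield (*bins[int(i)], *bins[int(j)], float(count))

# Made-up toy data: three 10kb bins, three bin pairs with counts.
bed_text = "chr1\t0\t10000\t1\nchr1\t10000\t20000\t2\nchr2\t0\t10000\t3\n"
matrix_text = "1\t1\t12.5\n1\t3\t4.0\n2\t3\t1.25\n"

bins = load_bins(io.StringIO(bed_text))
rows = list(annotate_matrix(io.StringIO(matrix_text), bins))
for row in rows:
    print(*row, sep="\t")
```

With real data, the two `io.StringIO` objects would be replaced by `open(...)` calls on the BED and matrix files.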
got it. thanks.
Hi, I have run HiC-Pro to obtain an ICED normalized matrix. Now I want to convert the matrix generated by HiC-Pro to the following format:
Do you have any suggestions? Best,