nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

Question about .matrix file format #365

Closed rachelzhu123 closed 3 years ago

rachelzhu123 commented 4 years ago

Hello HiC-Pro folks, I ran this command to generate the raw matrix file:

cat hic_results/data/mysample/mysample.allValidPairs | mypath/hicpro/HiC-Pro_2.11.4/scripts/build_matrix --matrix-format upper --binsize 1000000 --chrsizes mypath/Homo_sapiens.GRCh38.dna.primary_assembly.fa.fai --ifile /dev/stdin --oprefix hicresults/matrix/mysample/raw/1000000/mysample${bsize}

I got two files: a .matrix file and a .bed file. The .matrix file format is bin_i / bin_j / counts_ij, but I do not know which chromosome each bin_i / bin_j belongs to. How can I generate a separate interaction matrix file for each chromosome?

Thanks !

nservant commented 3 years ago

Hi, the bin coordinates are in the .bed file. You can use the split_sparse.py utility to split the matrix per chromosome. Best
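
For anyone who wants to see how the two files relate, here is a minimal Python sketch of the lookup. It is not the code of split_sparse.py. It assumes the .bed file has four whitespace-separated columns (chromosome, start, end, bin index) and the .matrix file has three (bin_i, bin_j, count). The function names `load_bin_chromosomes` and `split_intra_chromosomal` are made up for this illustration, and only intra-chromosomal contacts are kept.

```python
from collections import defaultdict


def load_bin_chromosomes(bed_lines):
    """Map each bin index to its chromosome.

    Assumes each .bed line is: chrom, start, end, bin_index.
    """
    bin_to_chrom = {}
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, _start, _end, bin_id = fields[:4]
        bin_to_chrom[int(bin_id)] = chrom
    return bin_to_chrom


def split_intra_chromosomal(matrix_lines, bin_to_chrom):
    """Group sparse matrix entries by chromosome.

    Assumes each .matrix line is: bin_i, bin_j, count.
    Entries whose two bins lie on different chromosomes are dropped.
    Counts are kept as strings so their formatting is preserved.
    """
    per_chrom = defaultdict(list)
    for line in matrix_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        bin_i, bin_j, count = int(fields[0]), int(fields[1]), fields[2]
        chrom_i = bin_to_chrom[bin_i]
        if chrom_i == bin_to_chrom[bin_j]:
            per_chrom[chrom_i].append((bin_i, bin_j, count))
    return dict(per_chrom)
```

Both functions accept any iterable of lines, so open file handles for the .bed and .matrix files can be passed in directly. Each list in the returned dictionary can then be written out as its own three-column file. Bin indices stay genome-wide in this sketch, so the original .bed file is still needed to recover coordinates.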