yezhengSTAT / mHiC


Question about output file #10

Open aakashsur opened 5 years ago

aakashsur commented 5 years ago

So after step 6, is there a way to go from the output file:

HWI-ST279:283:D1ACDACXX:8:1101:1345:16298       chr11   1955000 chr4    1155000 0.04448364631169624
HWI-ST279:283:D1ACDACXX:8:1101:1345:16298       chr11   45000   chr11   1955000 0.5951475510998886
HWI-ST279:283:D1ACDACXX:8:1101:1345:16298       chr7    35000   chr11   1955000 0.36036880258841525
HWI-ST279:283:D1ACDACXX:8:1101:1345:54161       chr8    455000  chr4    945000  0.2559876018501595
HWI-ST279:283:D1ACDACXX:8:1101:1345:54161       chr8    455000  chr4    965000  0.7440123981498405

To a count format where the multi-mapping reads have been assigned to the highest probability location?

yezhengSTAT commented 5 years ago

Glad that you have finished all the steps in mHi-C. Yes, after mHi-C processing, you will need to filter the output based on your own needs. We recommend filtering on the probability column (the 6th column) with a threshold of >0.5, so that each multi-read has at most one selected alignment position. You can also use >0.6 or >0.9 for more stringent filtering. Then merge the result with your uni-reads bin-pair count file to get the interaction counts for each bin-pair. You will need to write your own code, but it can be done with a few shell commands. I can add a few recommended commands to the manual later.
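For illustration, the filtering and counting described above could look roughly like the Python sketch below. It is not part of mHi-C: the function names `count_multi_reads` and `merge_counts` are hypothetical, the input is assumed to be the six whitespace-separated columns shown in the question (read ID, chromosome 1, bin 1, chromosome 2, bin 2, probability), and the uni-reads counts are assumed to be already loaded as a mapping from bin-pair to count, with bin-pairs written in the same orientation as in the multi-read output.

```python
from collections import Counter


def count_multi_reads(lines, threshold=0.5):
    """Count bin-pairs from multi-read alignments with probability > threshold.

    Assumes six whitespace-separated columns per line:
    read ID, chr1, bin1, chr2, bin2, probability.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        _read_id, chr1, bin1, chr2, bin2, prob = fields
        # With a threshold of at least 0.5, at most one alignment per read
        # can pass, because the probabilities for one read sum to 1.
        if float(prob) > threshold:
            counts[(chr1, int(bin1), chr2, int(bin2))] += 1
    return counts


def merge_counts(uni_counts, multi_counts):
    """Add multi-read bin-pair counts to uni-read bin-pair counts.

    Assumes both mappings use the same (chr1, bin1, chr2, bin2) key layout.
    """
    merged = Counter(uni_counts)
    merged.update(multi_counts)  # Counter.update adds counts per key
    return merged


# The five lines from the question, used as example input.
EXAMPLE = """\
HWI-ST279:283:D1ACDACXX:8:1101:1345:16298 chr11 1955000 chr4 1155000 0.04448364631169624
HWI-ST279:283:D1ACDACXX:8:1101:1345:16298 chr11 45000 chr11 1955000 0.5951475510998886
HWI-ST279:283:D1ACDACXX:8:1101:1345:16298 chr7 35000 chr11 1955000 0.36036880258841525
HWI-ST279:283:D1ACDACXX:8:1101:1345:54161 chr8 455000 chr4 945000 0.2559876018501595
HWI-ST279:283:D1ACDACXX:8:1101:1345:54161 chr8 455000 chr4 965000 0.7440123981498405
"""

multi = count_multi_reads(EXAMPLE.splitlines())
```

On the example lines, this keeps one alignment per read: chr11:45000 with chr11:1955000, and chr8:455000 with chr4:965000. Note that a >0.5 filter is not quite the same as assigning every read to its highest-probability location: a multi-read whose best alignment has probability 0.5 or lower is dropped entirely rather than assigned.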