phasegenomics / matlock

Simple tools for working with Hi-C data
GNU Affero General Public License v3.0

[matlock bamfilt] --over filtered? #9

Open zhaotao1987 opened 2 years ago

zhaotao1987 commented 2 years ago

Hello, I mapped and sorted my reads with the commands below, then used matlock for filtering.

## 1. make index file for the assembly
        bwa index -a bwtsw $assembly
## 2. align Hi-C reads
        bwa mem -t $cpu -5SP $assembly $hic_1 $hic_2 | samblaster | samtools view -@10 -S -h -b -F 2316 > hic_reads.aligned.bam
## 3. sort reads by name
        samtools sort -@10 -n hic_reads.aligned.bam -o hic_reads.aligned.sorted.bam
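
For reference, the `-F 2316` mask in step 2 can be broken down into its individual SAM flag bits. The following is only a minimal sketch in plain bash arithmetic; the function name `decode_flag` and the short bit labels are illustrative, not part of samtools or matlock.

```shell
#!/usr/bin/env bash
# Decompose a SAM FLAG mask into named bits (bit values follow the SAM specification).
# decode_flag and the label strings are hypothetical helpers for illustration only.
decode_flag() {
    local flag=$1
    local i
    local bits=(1 2 4 8 16 32 64 128 256 512 1024 2048)
    local names=(paired proper_pair unmapped mate_unmapped reverse mate_reverse
                 read1 read2 secondary qc_fail duplicate supplementary)
    local out=()
    for i in "${!bits[@]}"; do
        # Keep the label when the corresponding bit is set in the mask.
        if (( flag & bits[i] )); then
            out+=("${names[i]}")
        fi
    done
    echo "${out[*]}"
}

decode_flag 2316
```

So `-F 2316` drops records that are unmapped, have an unmapped mate, or are secondary or supplementary alignments. The mask does not include the duplicate bit (1024), so reads marked by samblaster are still present in the BAM at this stage.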

filtering:

matlock bamfilt \
-i hic_reads.aligned.sorted.bam \
-o hic_reads.aligned.sorted.cleaned.bam

After filtering, my raw 88G BAM was reduced to ~0.9G. I was very happy at first, and according to hic_qc.py the percentages also look better. But when I used the filtered BAM with 3ddna or allhic, it did not work well, probably because the signal is very low. What could be the reason? Could you give some advice? Thank you very much!
