aryeelab / hichipper

A preprocessing and QC pipeline for HiChIP data
MIT License

Inflated alignment stats #15

Open caleblareau opened 7 years ago

caleblareau commented 7 years ago

The current bowtie2 parse can report a mapping rate above 100%.

smza commented 6 years ago

Hi, we ran hichipper on the HiC-Pro output files and found that the reported number of Mapped_unique_quality reads is higher than Total_PETs. How is that possible? Are we missing something here? I am attaching one of the stat files as an example: example.stat.txt

Thanks

caleblareau commented 6 years ago

The # of mapped unique quality interactions is derived from this:

cat "${WK_DIR}/${HICPRO_OUT}/hic_results/data/${SAMPLE}/"*Pairs | wc -l | awk '{print $1}'

What version of HiC-Pro are you using?
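To see which files contribute to that number, the same glob can be expanded one file at a time. The following is a minimal sketch, not part of hichipper: `demo_dir` is a hypothetical stand-in for `${WK_DIR}/${HICPRO_OUT}/hic_results/data/${SAMPLE}`, and the toy files and their contents are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for ${WK_DIR}/${HICPRO_OUT}/hic_results/data/${SAMPLE}
demo_dir="$(mktemp -d)"
trap 'rm -rf "${demo_dir}"' EXIT

# Toy files with made-up contents; the names are assumptions chosen
# only so that both end in "Pairs" and match the glob.
printf 'pair1\npair2\npair2\n' > "${demo_dir}/sample1_trim_genome.bwt2pairs.validPairs"
printf 'pair1\npair2\n' > "${demo_dir}/sample1_allValidPairs"

# Print the line count of every file matched by the *Pairs glob
for f in "${demo_dir}/"*Pairs; do
    n="$(wc -l < "$f" | awk '{print $1}')"
    printf '%s\t%s\n' "$(basename "$f")" "$n"
done

# The combined count, computed the same way as the command above
glob_total="$(cat "${demo_dir}/"*Pairs | wc -l | awk '{print $1}')"
echo "glob total: ${glob_total}"
```

If more than one file is listed, the combined total is the sum over all of them rather than the count for any single file.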

smza commented 6 years ago

Thanks for your prompt response. We are using HiC-Pro 2.10.0

The problem was that hichipper counts the valid pairs twice. The hic_results folder contains the following two files, where file 2 is a subset of file 1:

  1. sample1_trim_genome.bwt2pairs.validPairs

  2. sample1_allValidPairs

So ideally it should use only file 2 (sample1_allValidPairs), the deduplicated interaction file, for counting, shouldn't it?
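One way to restrict the count to the deduplicated file is to narrow the glob. This is only a sketch under assumptions: the `*allValidPairs` pattern is inferred from the file names listed above and is not taken from the hichipper source, `demo_dir` is a hypothetical directory, and the file contents are toy data.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical directory standing in for hic_results/data/${SAMPLE}
demo_dir="$(mktemp -d)"
trap 'rm -rf "${demo_dir}"' EXIT

# Toy data: the deduplicated file holds a subset of the raw valid pairs
printf 'a\nb\nb\nc\n' > "${demo_dir}/sample1_trim_genome.bwt2pairs.validPairs"
printf 'a\nb\nc\n' > "${demo_dir}/sample1_allValidPairs"

# Broad glob: both files end in "Pairs", so their lines are added together
inflated="$(cat "${demo_dir}/"*Pairs | wc -l | awk '{print $1}')"

# Narrower glob (an assumption, not hichipper's actual fix):
# matches only the deduplicated file
dedup_only="$(cat "${demo_dir}/"*allValidPairs | wc -l | awk '{print $1}')"

echo "broad glob:  ${inflated}"
echo "narrow glob: ${dedup_only}"
```

With the broad glob the total exceeds the number of pairs in either file, which is how a derived count can end up larger than Total_PETs.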

I would also like to ask a couple more questions:

  1. We would like to know what threshold for the "% of fraction of total reads that are in loops" should indicate a good-quality ChIP. In your H3K27ac data, the value is ~1%. Would this be considered good or bad ChIP efficiency, or does it depend on antibody efficiency and other experimental factors?

  2. How do you evaluate consistency between biological and/or technical replicates? Do you have any QC metrics for doing so?

Many thanks.

Munazah
