kundajelab / chip-nexus-pipeline

ChIP-nexus pipeline
MIT License

Incorrect number of reads #2

Open alexandari opened 3 years ago

alexandari commented 3 years ago

The branch cherry-pick-from-atac reports an incorrect number of reads.

Running the example data from examples/klab/test_klab.json produces the following QC report, in which the total number of reads for rep1 is 6121: http://mitra.stanford.edu/kundaje/amr1/pho4/qc_reports/oct4_nexus.html

However, zcat /oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_1.fastq.gz | wc -l returns 225348672 lines, which at four lines per FASTQ record is 56,337,168 reads.
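Note that wc -l counts lines rather than reads. The conversion can be sketched in Python as follows, assuming the standard four-line FASTQ layout with unwrapped sequence lines. The function names are hypothetical and are not part of the pipeline.

```python
import gzip

# Assumption: each record is exactly 4 lines (header, sequence, '+', qualities).
LINES_PER_FASTQ_RECORD = 4


def reads_from_line_count(n_lines):
    """Convert a `wc -l` line count into a FASTQ read count."""
    if n_lines % LINES_PER_FASTQ_RECORD != 0:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return n_lines // LINES_PER_FASTQ_RECORD


def count_fastq_reads(path):
    """Count reads in a gzipped FASTQ file by counting its lines."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return reads_from_line_count(n_lines)


# The line count quoted above corresponds to this many reads:
print(reads_from_line_count(225348672))  # 56337168
```

Either way, the FASTQ read count is far larger than the 6121 shown in the QC report.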

leepc12 commented 3 years ago

That read count is computed from the BAM, not the FASTQ, so please count the reads in the BAM. Run find YOUR_OUTPUT_FOLDER -name "*.bam" to locate the BAM files in your output folder, then run samtools flagstat BAM on each one.
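The two steps above can be combined in a short Python sketch. It assumes samtools is installed and on PATH, and that the first line of the flagstat output has the usual "N + M in total" form. The helper names are hypothetical and not part of the pipeline. Keep in mind that the flagstat total counts alignment records, so it can exceed the number of distinct reads if secondary or supplementary alignments are present.

```python
import re
import subprocess
from pathlib import Path


def find_bams(output_folder):
    """Recursively list *.bam files, like `find YOUR_OUTPUT_FOLDER -name "*.bam"`."""
    return sorted(Path(output_folder).rglob("*.bam"))


def parse_flagstat_total(flagstat_text):
    """Extract the QC-passed count from the 'in total' line of flagstat output."""
    match = re.search(r"^(\d+) \+ (\d+) in total", flagstat_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("could not find the 'in total' line")
    return int(match.group(1))


def flagstat_total(bam_path):
    """Run `samtools flagstat` on one BAM and return its total record count."""
    result = subprocess.run(
        ["samtools", "flagstat", str(bam_path)],
        capture_output=True, text=True, check=True,
    )
    return parse_flagstat_total(result.stdout)


def report(output_folder):
    """Print the flagstat total for every BAM under the output folder."""
    for bam in find_bams(output_folder):
        print(bam, flagstat_total(bam))
```

Calling report("YOUR_OUTPUT_FOLDER") would print one line per BAM, which makes it easy to compare the raw and filtered BAMs for each replicate.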

It may be a low-quality sample with a very poor mapping rate. Did you run it with the correct genome TSV? Please post your input JSON.

alexandari commented 3 years ago

The BAM for rep1 contains 34,726,285 reads and the filtered BAM for rep1 contains 31,040,339.
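To put the rep1 numbers from this thread side by side, here is a small Python sketch. It assumes four lines per FASTQ record. The BAM-to-FASTQ ratio is only a rough proxy for the mapping rate, since the pipeline may discard reads before alignment (for example during barcode or adapter handling).

```python
# Counts quoted in this thread for rep1.
fastq_lines = 225348672          # from `zcat ... | wc -l`
fastq_reads = fastq_lines // 4   # assumes 4 lines per FASTQ record
bam_reads = 34726285             # reads in the rep1 BAM
filtered_bam_reads = 31040339    # reads in the filtered rep1 BAM
qc_report_total = 6121           # total reads shown in the QC report


def fraction(part, whole):
    """Return part / whole as a float."""
    return part / whole


print(f"FASTQ reads:           {fastq_reads:,}")
print(f"BAM / FASTQ:           {fraction(bam_reads, fastq_reads):.1%}")
print(f"filtered BAM / BAM:    {fraction(filtered_bam_reads, bam_reads):.1%}")
print(f"QC report / filtered:  {fraction(qc_report_total, filtered_bam_reads):.4%}")
```

Whichever file is used as the reference, the 6121 in the QC report is orders of magnitude smaller than the counts in the FASTQ, the BAM, and the filtered BAM.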

The processed results are available here: /oak/stanford/groups/akundaje/amr1/nexus_pipeline/oct4_nexus/

The JSON is in the repo itself at examples/klab/test_klab.json; I am posting it here for completeness:

{
    "chip_nexus.genome_tsv" : "/mnt/data/pipeline_genome_data/genome_tsv/v3/mm10.tsv",
    "chip_nexus.bowtie_idx_tar" : "/oak/stanford/groups/akundaje/avsec/anno/encode/mm10/bowtie_index/mm10_no_alt_analysis_set_ENCODE.tar",
    "chip_nexus.adapter" : "AGATCGGAAGAGCACACGTCTGGATCCACGACGCTCTTCC",
    "chip_nexus.barcodes" : "CTGA,TGAC,GACT,ACTG",
    "chip_nexus.fastqs_rep1_R1" : ["/oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_1.fastq.gz"],
    "chip_nexus.fastqs_rep2_R1" : ["/oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_2.fastq.gz"],
    "chip_nexus.fastqs_rep3_R1" : ["/oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_3.fastq.gz"],
    "chip_nexus.fastqs_rep4_R1" : ["/oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_4.fastq.gz"],
    "chip_nexus.enable_count_signal_track" : true,
    "chip_nexus.title" : "test",
    "chip_nexus.description" : "test"
}