Open alexandari opened 3 years ago
The read count in the QC report is taken from the BAM, not from the FASTQ.
Please count the reads in the BAM instead.
Run find YOUR_OUTPUT_FOLDER -name "*.bam"
on your output folder, then run samtools flagstat BAM
on each BAM file it lists.
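The two commands above can be combined into one pass. This is only a sketch: `flagstat_all` and the `SAMTOOLS` override variable are hypothetical names introduced here, not part of the pipeline, and it assumes `samtools` is on your PATH.

```shell
# Run `samtools flagstat` on every BAM found under an output folder.
# SAMTOOLS is a hypothetical override, useful if samtools is not on PATH.
flagstat_all() {
    # find prints each BAM path as a header, then runs flagstat on it.
    find "$1" -name "*.bam" \
        -exec printf '== %s\n' {} \; \
        -exec "${SAMTOOLS:-samtools}" flagstat {} \;
}

# Usage: flagstat_all YOUR_OUTPUT_FOLDER
```

Each BAM path is printed before its flagstat summary, so the raw and filtered BAMs of a replicate can be compared side by side.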
I suspect it is a low-quality sample with a very poor mapping rate. Did you run it with the correct genome TSV? Please post your input JSON.
The BAM for rep1 contains 34,726,285 reads and the filtered BAM for rep1 contains 31,040,339 reads.
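To put those two counts in proportion, a small helper can report the share of reads that survive filtering. `retained_pct` is a hypothetical name used only for this sketch; it takes the two counts as plain integers, for example as reported by `samtools view -c` on each BAM.

```shell
# Percentage of reads kept after filtering.
# $1 = reads in the unfiltered BAM, $2 = reads in the filtered BAM.
retained_pct() {
    awk -v raw="$1" -v kept="$2" 'BEGIN { printf "%.2f\n", 100 * kept / raw }'
}

# Usage with the rep1 counts quoted above:
#   retained_pct 34726285 31040339
```

For the rep1 numbers this comes out to roughly 89%, so the filtering step itself does not account for a report of only a few thousand reads.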
The processed results are available here: /oak/stanford/groups/akundaje/amr1/nexus_pipeline/oct4_nexus/
The JSON is in the repo itself, in examples/klab/test_klab.json, and I am posting it here for completeness:
{
"chip_nexus.genome_tsv" : "/mnt/data/pipeline_genome_data/genome_tsv/v3/mm10.tsv",
"chip_nexus.bowtie_idx_tar" : "/oak/stanford/groups/akundaje/avsec/anno/encode/mm10/bowtie_index/mm10_no_alt_analysis_set_ENCODE.tar",
"chip_nexus.adapter" : "AGATCGGAAGAGCACACGTCTGGATCCACGACGCTCTTCC",
"chip_nexus.barcodes" : "CTGA,TGAC,GACT,ACTG",
"chip_nexus.fastqs_rep1_R1" : ["/oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_1.fastq.gz"],
"chip_nexus.fastqs_rep2_R1" : ["/oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_2.fastq.gz"],
"chip_nexus.fastqs_rep3_R1" : ["/oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_3.fastq.gz"],
"chip_nexus.fastqs_rep4_R1" : ["/oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_4.fastq.gz"],
"chip_nexus.enable_count_signal_track" : true,
"chip_nexus.title" : "test",
"chip_nexus.description" : "test"
}
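Since the question was whether the correct genome TSV was used, one quick sanity check is to confirm that every absolute path named in the input JSON exists on the machine running the pipeline. The sketch below is not a JSON parser: `check_json_paths` is a hypothetical helper that simply greps for quoted strings beginning with `/`, which is enough for a flat file like the one above.

```shell
# Report whether each absolute path mentioned in an input JSON exists.
check_json_paths() {
    # Extract every double-quoted string starting with "/" and strip the quotes.
    grep -o '"/[^"]*"' "$1" | tr -d '"' | while IFS= read -r path; do
        if [ -e "$path" ]; then
            echo "ok      $path"
        else
            echo "MISSING $path"
        fi
    done
}

# Usage: check_json_paths examples/klab/test_klab.json
```

A MISSING line for chip_nexus.genome_tsv or chip_nexus.bowtie_idx_tar would point to a configuration problem rather than a read-counting problem.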
The branch cherry-pick-from-atac reports an incorrect number of reads.
The QC report produced from the example data in examples/klab/test_klab.json (http://mitra.stanford.edu/kundaje/amr1/pho4/qc_reports/oct4_nexus.html) gives the total number of reads for rep1 as 6121.
However, zcat /oak/stanford/groups/akundaje/amr1/bpnet/julia-lab-chipnexus/mesc_oct4_nexus_1.fastq.gz | wc -l returns 225348672 lines, which at four lines per FASTQ record is 56,337,168 reads.
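Note that `wc -l` counts lines, not reads. A minimal sketch for turning that into a read count, assuming a standard four-line-per-record FASTQ with no wrapped sequence lines (`count_fastq_reads` is a hypothetical helper name):

```shell
# Count reads in a gzipped FASTQ by dividing the line count by four.
count_fastq_reads() {
    # Each record is four lines: header, sequence, '+', qualities.
    lines=$(gzip -cd "$1" | wc -l)
    echo $((lines / 4))
}

# Usage: count_fastq_reads mesc_oct4_nexus_1.fastq.gz
```

`gzip -cd` is used in place of `zcat` only because it behaves the same way across platforms.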