Open ld9866 opened 3 years ago
I've had this issue before, but it's hard to say exactly what's happening without more details. Make sure that
pbcstat *_mapping.paf.gz
is generating a file that looks a bit like a histogram with coverage numbers from 1 to 500. When pbcstat wasn't reading my alignment files it just had 0s in the second column and couldn't calculate cutoffs anyway, yielding 0 peaks
.
How did you map the reads to your assembly?
Hi, reviving this old thread as I have the same issue. I am running this using a primary assembly from Hifiasm (*p_ctg.fa) and a single file of PacBio Hifi reads (fastq). clacuts.log gives "[M::calcuts] Find 0 peaks" cutoffs is empty PB.stat has two columns, [1:500] and [all zeros] No hist plot is saved Step1-3 pipeline runs to the end and gives purge_dups.log showing:
[M::main] finish parsing params
[M::main] finish reading hits
OVLP: ptg000001l:1-5547361 4611277 4717776 0 ptg000010l:1-3002607 569202 623925 0
OVLP: ptg000001l:1-5547361 4640442 4766662 0 ptg000009l:1-17533427 13951853 14145375 0
OVLP: ptg000005l:1-12831892 7381227 7414793 0 ptg000009l:1-17533427 6550726 6569439 0
hap.fa is small but not zero, around 414 KB (leaving purged.fa to be the remainder at ~92MB as expected).
Here's a summary of what I'm running:
# Step1, align
minimap2 -t 16 -x map-hifi -a "assemblyPATH" "readsPATH" | gzip -c - > align.paf.gz ## seems to work, align.paf.gz is ~2GB
### I have also tested -x asm20, to match how the purge_dups docs suggest handling CCS reads
### minimap2 v. 2.24 used
# running PBstats
"MYPATH/bin/pbcstat" align.paf.gz
# output:
Program starts
[M::aa_pb] collecting positions from paf file
[M::aa_pb] calculating coverage for each base on genome
[M::aa_pb] print coverage histogram for the contigs
[M::aa_pb] print coverage for each base of the contigs
[M::aa_pb] print average coverage for each 1024 base of the contigs
[M::aa_pb] release coverage vector space
[M::aa_pb] release sequence vector space
Program finished successfully
# running Calcuts
"MYPATH/bin/calcuts" PB.stat > cutoffs 2>calcults.log
#### not showing the rest of pipeline ####
Any idea of why PBstat is struggling to handle my paf.gz?
Thanks!
Answer: I was dumb and included option "-a" with minimap which makes a .sam. Calling up minimap without this option as in the docs solves it.
Hello! How can l solve this problem? calcuts PB.stat > cutoffs 2>calcults.log [M::calcuts] Find 0 peaks