chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

het peak -1 and second peak on kmer distribution plot missing #697

Open desa-la opened 2 weeks ago

desa-la commented 2 weeks ago

Hi, I have an issue similar to #388, #49, and #563, but it differs in one respect. I am working with a diploid plant genome and have several individuals sequenced with PacBio HiFi. When assembled with hifiasm (hifiasm -o $RES/F32.bam2fq.asm -t 32 $DIR/F32.bam2fq.fastq), most of them produce a correct-looking k-mer distribution (see log.file.correct.looking.F32.txt), but for a few individuals the heterozygous peak is reported at -1. The difference from the previous issues with a het peak of -1 is that I also don't see a second peak in the distribution plot (see log.file.wrong.looking.H15.txt). I considered --min-hist-cnt, but since its default is 5 and the highest peak in log.file.wrong.looking.H15.txt is at 2, I don't think specifying it would help in my case, if I understand the option correctly.

These samples are not special in any way. Coverage ranges from 60x to 110x, but I don't see a pattern linking coverage to the correct- or wrong-looking distributions: both categories contain samples from across the coverage range.

What could cause this, and how should I go about troubleshooting it? Do you have any ideas? Also, how does this actually affect the assembly, given that hifiasm still finishes without an error and produces output? In one thread you wrote that the het peak often does not affect the final results, but is that also the case when there is no second peak at all in the distribution, as in my case?

log.file.correct.looking.F32.txt log.file.wrong.looking.H15.txt

Samyuktha9624 commented 2 weeks ago

Hello, this is totally unrelated to your question, sorry about that. I am using hifiasm for the first time. Where do I find these hifiasm log files?

desa-la commented 1 week ago

Hi @Samyuktha9624, this log goes to stdout, the program's default output stream, so it is what gets printed on the screen while the job is running. I submit hifiasm as a Slurm job, so the log is redirected into my output file. My header looks like this:

#!/bin/bash
#SBATCH -o outfile_hifiasm-%J
#SBATCH -e errfile_hifiasm-%J
#SBATCH -n 20
#SBATCH --mem=100GB
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=xxx

With this header, the log goes into the outfile_hifiasm-<jobid> file. Hope that helps, Desa
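
For anyone who would rather not depend on which stream a program writes its log to, here is a minimal sketch that sends both stdout and stderr to a single file. The helper name run_logged and the log file name in the usage comment are assumptions made for illustration; they are not part of hifiasm or Slurm.

```shell
#!/bin/bash
# Hypothetical helper (not part of hifiasm or Slurm): run a command and
# capture both stdout and stderr in one log file.
# Usage: run_logged <logfile> <command> [args...]
run_logged() {
    local log="$1"
    shift
    # "> file 2>&1" first points stdout at the file, then sends stderr
    # to the same place, so nothing is lost whichever stream is used.
    "$@" > "$log" 2>&1
}

# Example with the command from the first post; the log file name is an
# assumed placeholder:
# run_logged "$RES/F32.hifiasm.log" \
#     hifiasm -o "$RES/F32.bam2fq.asm" -t 32 "$DIR/F32.bam2fq.fastq"
```

Inside a Slurm job this works alongside the #SBATCH -o and -e lines: the scheduler's own files then only receive whatever the wrapped command did not write.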

Samyuktha9624 commented 1 week ago

Thank you so much, Desa!