Nextomics / NextPolish2

Repeat-aware polishing genomes assembled using HiFi long reads
Other
63 stars 3 forks source link

nextpolish2 runs too slow #14

Closed Han-Cao closed 5 months ago

Han-Cao commented 6 months ago

Hi,

I am polishing human genome assembly with ~30X HiFi reads and ~30X paired end illumina reads, using 30 CPU cores with 192GB RAM. The job has run ~15 hours. However, according to your paper, it only takes ~90 to 100 min with 5 CPU cores and 256GB RAM. Is it normal to run the job for such a long time?

Below is the code I used to prepare the input and run nextpolish2. The 15 hour is for nextpolish2 alone.

# minimap2 align to assembly (2.26-r1175)
minimap2 -ax map-hifi -Q -t $threads \
${asmpath}/${sample}.asm.bp.hap1.p_ctg.inspector.fa \
${lrspath}/${sample}.filt.fq.gz | samtools sort -o ${output_prefix}.hap1.bam - 
samtools index -@ $threads ${output_prefix}.hap1.bam

# prepare k-mer count (0.1-r69-dirty)
yak count -o ${output_prefix}.k21.yak -k 21 -b 37 -t $threads \
<(zcat ${srspath}/${sample}_1_paired.fq.gz) <(zcat ${srspath}/${sample}_2_paired.fq.gz) 
yak count -o ${output_prefix}.k31.yak -k 31 -b 37 -t $threads \
<(zcat ${srspath}/${sample}_1_paired.fq.gz) <(zcat ${srspath}/${sample}_2_paired.fq.gz) 

# run nextPolish2 (nextPolish2 0.2.0)
nextPolish2 -t $threads \
${output_prefix}.hap1.bam \
${asmpath}/${sample}.asm.bp.hap1.p_ctg.inspector.fa \
${output_prefix}.k21.yak \
${output_prefix}.k31.yak > ${output_prefix}.asm.bp.hap1.p_ctg.inspector.np2.fa
moold commented 6 months ago

Not normal. Can you use top to check if the cpu time is increasing and use free to check if 192GB is enough?

Han-Cao commented 6 months ago

I just re-run the job with the same code, the output from top and free are:

image

image

moold commented 5 months ago

The screenshot was taken after only 2 minutes of running. It has not yet reached the bottleneck stage, so it is impossible to determine what the problem is. However, you can try to reduce the $threads to below 5 and rerun.

Han-Cao commented 5 months ago

Actually, that was taken after running more than 1 hour. The CPU usage was very low, only consuming 1 single core. I will try reducing to a lower number of threads.

Han-Cao commented 5 months ago

I have tried to used only 5 cores, and it is still very slow. The log from some jobs are:

Real time: 80856 sec; CPU: 204147 sec; Peak RSS: 55.330 GB
Real time: 85194 sec; CPU: 220292 sec; Peak RSS: 56.423 GB
Real time: 67118 sec; CPU: 172853 sec; Peak RSS: 48.021 GB
moold commented 5 months ago

Here is my 4 logs when test to polish the CHM13 genome.

hostname
+ hostname
cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work1
+ cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work1
time /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak > /path/02_np2/05_eva/chm13/asm.fa.polish/asm.k21.31.winnowmap.all.fasta
+ /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
CMD: /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
Real time: 2555 sec; CPU: 11543 sec; Peak RSS: 71.768 GB

real    42m35.498s
user    121m48.883s
sys 70m35.746s
touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work1/subtask.sh.done
+ touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work1/subtask.sh.done

hostname
+ hostname
cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work2
+ cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work2
time /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak > /path/02_np2/05_eva/chm13/asm.fa.polish/asm.k21.31.winnowmap.fasta
+ /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
CMD: /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
Real time: 3822 sec; CPU: 12472 sec; Peak RSS: 73.847 GB

real    63m43.374s
user    128m31.153s
sys 79m21.651s
touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work2/subtask.sh.done
+ touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work2/subtask.sh.done

hostname
+ hostname
cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work3
+ cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work3
time /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak > /path/02_np2/05_eva/chm13/asm.fa.polish/asm.k21.31.minimap2.all.fasta
+ /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
CMD: /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
Real time: 3694 sec; CPU: 12315 sec; Peak RSS: 69.862 GB

real    61m35.498s
user    125m48.392s
sys 79m28.054s
touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work3/subtask.sh.done
+ touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work3/subtask.sh.done

hostname
+ hostname
cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work4
+ cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work4
time /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak > /path/02_np2/05_eva/chm13/asm.fa.polish/asm.k21.31.minimap2.fasta
+ /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
CMD: /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
Real time: 2465 sec; CPU: 11204 sec; Peak RSS: 74.113 GB

real    41m6.182s
user    119m19.423s
sys 67m25.821s
touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work4/subtask.sh.done
+ touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work4/subtask.sh.done

and here are the input files size:

2.7 GB 23-Oct-30 16:24:09 /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa
23 GB 23-Nov-05 20:08:59 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam
23 GB 23-Nov-06 05:29:38 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam
38 GB 23-Nov-01 23:05:00 /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak
35 GB 23-Nov-05 07:19:30 /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak

So you can compare this information with your input and figure out what the problem is? My guess is that your system has a serious IO bottleneck or your input genomes contain many short contigs (<1Mb).

Han-Cao commented 5 months ago

I see, if the benchmark is on CHM13, then it should be due to short contigs. The statistics of my assembly is like:

Statics of contigs:
Number of contigs   865
Number of contigs > 10000 bp    865
Number of contigs >1000000 bp   149
Total length    3024706003
Total length of contigs > 10000 bp  3024706003
Total length of contigs >1000000bp  2914144160
Longest contig  137403523
Second longest contig length    131182353
N50 50325863
N50 of contigs >1Mbp    50325863

Then my question is if the tool is suitable for non-T2T assemblies? Are there any results for polishing hifi-only assemblies? It is really hard and expensive to generate genome assemblies with high quality like HG002 or CHM13

Besides, why your CHM13 is only 2.7G? This is much less than human genome and the CHM13 fasta I have.

moold commented 5 months ago
  1. It is suitable for any assemblies.
  2. The CHM13 file size is 2.7Gb however the genome size is 2951058554.