marbl / Winnowmap

Long read / genome alignment software
Other
235 stars 22 forks source link

Winnowmap takes much long than Minimap2 - is this expected? #1

Closed jelber2 closed 4 years ago

jelber2 commented 4 years ago

Hi,

I am using winnowmap commit d547331 and it seems like it is taking a very long time (my guess will be about 24 hours) and a lot of RAM (peak 265 GB) for mapping 1000000 simulated PacBio HiFi reads compared to minimap2-2.17 which took about 7 minutes and maybe ~9 GB RAM.

/genetics/elbers/Winnowmap/computeHighFreqKmers 19 1 1024 peregrine.fasta bad_Hk19_mers_peregrine.fasta.txt
minimap2 -t 75 -H -W bad_Hk19_mers_peregrine.fasta.txt -ax asm20 peregrine.fasta random_reads_ccs.fastq.gz > random_reads_ccs.fastq-against-peregrine.fasta.sam
/genetics/elbers/Winnowmap/winnowmap -t 75 -H -W bad_Hk19_mers_peregrine.fasta.txt -ax asm20 peregrine.fasta random_reads_ccs.fastq.gz > random_reads_ccs.fastq-against-peregrine.fasta.sam

The assembly peregrine.fasta is about 450 Mbp, and I was trying to compare the results of one round of Racon polishing with alignments made by minimap2 and winnowmap, but winnowmap seems to be taking quite a long time.

# minimap2 log

[M::mm_idx_gen::15.476*1.84] collected minimizers
[M::mm_idx_gen::17.823*2.88] sorted minimizers
[M::main::17.824*2.88] loaded/built the index for 1100 target sequence(s)
[M::mm_mapopt_update::19.182*2.75] mid_occ = 175
[M::mm_idx_stat] kmer size: 19; skip: 10; is_hpc: 1; #seq: 1100
[M::mm_idx_stat::20.765*2.62] distinct minimizers: 26605237 (71.77% are singletons); average occurrences: 1.995; average spacing: 8.319
[M::worker_pipeline::51.114*15.06] mapped 41606 sequences
[M::worker_pipeline::76.782*19.10] mapped 41608 sequences
[M::worker_pipeline::98.991*22.49] mapped 41605 sequences
[M::worker_pipeline::118.571*26.10] mapped 41594 sequences
[M::worker_pipeline::132.990*28.63] mapped 41595 sequences
[M::worker_pipeline::149.909*30.83] mapped 41612 sequences
[M::worker_pipeline::165.980*33.80] mapped 41612 sequences
[M::worker_pipeline::178.448*35.00] mapped 41601 sequences
[M::worker_pipeline::194.673*35.87] mapped 41605 sequences
[M::worker_pipeline::212.706*37.92] mapped 41615 sequences
[M::worker_pipeline::223.435*38.45] mapped 41612 sequences
[M::worker_pipeline::242.166*40.52] mapped 41603 sequences
[M::worker_pipeline::251.134*40.52] mapped 41601 sequences
[M::worker_pipeline::270.681*42.06] mapped 41598 sequences
[M::worker_pipeline::279.854*42.06] mapped 41611 sequences
[M::worker_pipeline::298.250*43.10] mapped 41604 sequences
[M::worker_pipeline::309.071*43.28] mapped 41589 sequences
[M::worker_pipeline::327.166*44.73] mapped 41618 sequences
[M::worker_pipeline::337.098*44.50] mapped 41618 sequences
[M::worker_pipeline::354.763*45.61] mapped 41602 sequences
[M::worker_pipeline::364.568*45.55] mapped 41601 sequences
[M::worker_pipeline::384.337*46.38] mapped 41624 sequences
[M::worker_pipeline::393.632*46.24] mapped 41599 sequences
[M::worker_pipeline::410.199*46.36] mapped 41604 sequences
[M::worker_pipeline::410.352*46.34] mapped 1463 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -t 75 -H -ax asm20 peregrine.fasta random_reads_ccs.fastq.gz
[M::main] Real time: 410.768 sec; CPU: 19018.148 sec; Peak RSS: 8.641 GB

real    7m0.411s
user    308m36.932s
sys     8m22.508s

# winnowmap log so far

[M::mm_idx_gen::0.000*35196.57] reading downweighted kmers
[M::mm_idx_gen::0.011*460.17] collected downweighted kmers, no. of kmers read=2052
[M::mm_idx_gen::0.011*459.31] saved the kmers in a bloom filter: hash functions=13 and size=39344
[M::mm_idx_gen::38.766*1.34] collected minimizers
[M::mm_idx_gen::41.233*1.80] sorted minimizers
[M::main::41.243*1.80] loaded/built the index for 1100 target sequence(s)
[M::mm_mapopt_update::41.243*1.80] mid_occ = 2147483647
[M::mm_idx_stat] kmer size: 19; skip: 10; is_hpc: 1; #seq: 1100
[M::mm_idx_stat::42.744*1.78] distinct minimizers: 26613163 (71.74% are singletons); average occurrences: 1.956; average spacing: 8.482
[M::worker_pipeline::2989.633*39.61] mapped 41606 sequences
...still going after 1.25 hours (I predict it might take at least 24 hours)

Is the much longer run time expected (obviously, run times would depend on the k-mers, assembly, and reads being used)?

cjain7 commented 4 years ago

Hi,

The asm20 option is not configured with winnowmap as of yet (work in progress). Can you try -ax map-pb option?

jelber2 commented 4 years ago

Yes, I tried the -ax map-pb option, and it is only about twice as long run time

# winnowmap with -ax asm20 setting

[M::mm_idx_gen::0.000*35196.57] reading downweighted kmers
[M::mm_idx_gen::0.011*460.17] collected downweighted kmers, no. of kmers read=2052
[M::mm_idx_gen::0.011*459.31] saved the kmers in a bloom filter: hash functions=13 and size=39344
[M::mm_idx_gen::38.766*1.34] collected minimizers
[M::mm_idx_gen::41.233*1.80] sorted minimizers
[M::main::41.243*1.80] loaded/built the index for 1100 target sequence(s)
[M::mm_mapopt_update::41.243*1.80] mid_occ = 2147483647
[M::mm_idx_stat] kmer size: 19; skip: 10; is_hpc: 1; #seq: 1100
[M::mm_idx_stat::42.744*1.78] distinct minimizers: 26613163 (71.74% are singletons); average occurrences: 1.956; average spacing: 8.482
[M::worker_pipeline::2989.633*39.61] mapped 41606 sequences
[M::worker_pipeline::5960.186*38.36] mapped 41608 sequences
[M::worker_pipeline::8919.402*38.85] mapped 41605 sequences
[M::worker_pipeline::12131.488*37.96] mapped 41594 sequences
[M::worker_pipeline::15060.295*38.38] mapped 41595 sequences
[M::worker_pipeline::18182.216*37.93] mapped 41612 sequences
[M::worker_pipeline::20653.108*38.66] mapped 41612 sequences
[M::worker_pipeline::23789.089*38.45] mapped 41601 sequences
[M::worker_pipeline::26951.418*38.49] mapped 41605 sequences
[M::worker_pipeline::29914.825*38.49] mapped 41615 sequences
[M::worker_pipeline::32652.492*38.75] mapped 41612 sequences
[M::worker_pipeline::35701.262*38.96] mapped 41603 sequences
[M::worker_pipeline::38502.260*39.01] mapped 41601 sequences
[M::worker_pipeline::41522.480*38.90] mapped 41598 sequences
[M::worker_pipeline::44345.592*38.75] mapped 41611 sequences
[M::worker_pipeline::47242.143*38.83] mapped 41604 sequences
[M::worker_pipeline::49809.499*39.17] mapped 41589 sequences
[M::worker_pipeline::52915.746*39.08] mapped 41618 sequences
[M::worker_pipeline::56579.053*38.57] mapped 41618 sequences
[M::worker_pipeline::59157.913*38.81] mapped 41602 sequences
[M::worker_pipeline::61374.922*39.32] mapped 41601 sequences
[M::worker_pipeline::64334.709*39.29] mapped 41624 sequences
[M::worker_pipeline::67499.641*39.17] mapped 41599 sequences
[M::worker_pipeline::70380.777*39.15] mapped 41604 sequences
[M::worker_pipeline::70488.787*39.11] mapped 1463 sequences
[M::main] Version: 1.0
[M::main] CMD: /genetics/elbers/Winnowmap/winnowmap -t 75 -H -W bad_Hk19_mers_peregrine.fasta.txt -ax asm20 peregrine.fasta random_reads_ccs.fastq.gz
[M::main] Real time: 70489.272 sec; CPU: 2757145.812 sec; Peak RSS: 310.024 GB

real    1174m55.381s
user    45012m15.796s
sys 940m10.956s

# minimap2 with -ax map-pb preset instead of -ax asm20

[M::mm_idx_gen::15.607*1.56] collected minimizers
[M::mm_idx_gen::18.018*2.64] sorted minimizers
[M::main::18.019*2.64] loaded/built the index for 1100 target sequence(s)
[M::mm_mapopt_update::18.952*2.56] mid_occ = 175
[M::mm_idx_stat] kmer size: 19; skip: 10; is_hpc: 1; #seq: 1100
[M::mm_idx_stat::19.615*2.51] distinct minimizers: 26605237 (71.77% are singletons); average occurrences: 1.995; average spacing: 8.319
[M::worker_pipeline::46.379*16.42] mapped 41606 sequences
[M::worker_pipeline::67.743*21.20] mapped 41608 sequences
[M::worker_pipeline::87.438*24.76] mapped 41605 sequences
[M::worker_pipeline::104.838*28.35] mapped 41594 sequences
[M::worker_pipeline::120.227*30.26] mapped 41595 sequences
[M::worker_pipeline::137.584*32.58] mapped 41612 sequences
[M::worker_pipeline::151.957*35.20] mapped 41612 sequences
[M::worker_pipeline::163.811*36.32] mapped 41601 sequences
[M::worker_pipeline::180.690*36.94] mapped 41605 sequences
[M::worker_pipeline::198.201*38.00] mapped 41615 sequences
[M::worker_pipeline::211.680*38.99] mapped 41612 sequences
[M::worker_pipeline::226.701*40.66] mapped 41603 sequences
[M::worker_pipeline::238.456*40.91] mapped 41601 sequences
[M::worker_pipeline::257.109*42.73] mapped 41598 sequences
[M::worker_pipeline::265.445*42.56] mapped 41611 sequences
[M::worker_pipeline::284.811*43.21] mapped 41604 sequences
[M::worker_pipeline::296.548*43.28] mapped 41589 sequences
[M::worker_pipeline::319.236*44.32] mapped 41618 sequences
[M::worker_pipeline::326.389*43.94] mapped 41618 sequences
[M::worker_pipeline::348.654*44.57] mapped 41602 sequences
[M::worker_pipeline::357.071*44.47] mapped 41601 sequences
[M::worker_pipeline::378.184*45.38] mapped 41624 sequences
[M::worker_pipeline::385.524*45.24] mapped 41599 sequences
[M::worker_pipeline::406.795*44.81] mapped 41604 sequences
[M::worker_pipeline::406.952*44.80] mapped 1463 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -t 75 -H -ax map-pb peregrine.fasta random_reads_ccs.fastq.gz
[M::main] Real time: 407.409 sec; CPU: 18230.212 sec; Peak RSS: 8.474 GB

real    6m48.695s
user    295m54.068s
sys 7m56.748s

# winnowmap with -ax map-pb preset instead of -ax asm20

[M::mm_idx_gen::0.000*44845.67] reading downweighted kmers
[M::mm_idx_gen::0.002*2623.30] collected downweighted kmers, no. of kmers read=2052
[M::mm_idx_gen::0.002*2608.78] saved the kmers in a bloom filter: hash functions=13 and size=39344 
[M::mm_idx_gen::35.847*1.29] collected minimizers
[M::mm_idx_gen::36.419*1.36] sorted minimizers
[M::main::36.419*1.36] loaded/built the index for 1100 target sequence(s)
[M::mm_mapopt_update::36.419*1.36] mid_occ = 2147483647
[M::mm_idx_stat] kmer size: 19; skip: 50; is_hpc: 1; #seq: 1100
[M::mm_idx_stat::36.532*1.36] distinct minimizers: 5987044 (72.90% are singletons); average occurrences: 1.882; average spacing: 39.197
[M::worker_pipeline::90.264*22.64] mapped 41606 sequences
[M::worker_pipeline::118.692*28.46] mapped 41608 sequences
[M::worker_pipeline::146.177*32.43] mapped 41605 sequences
[M::worker_pipeline::169.797*35.69] mapped 41594 sequences
[M::worker_pipeline::201.795*36.95] mapped 41595 sequences
[M::worker_pipeline::235.177*39.00] mapped 41612 sequences
[M::worker_pipeline::255.899*39.42] mapped 41612 sequences
[M::worker_pipeline::282.837*40.66] mapped 41601 sequences
[M::worker_pipeline::320.715*41.04] mapped 41605 sequences
[M::worker_pipeline::352.374*41.33] mapped 41615 sequences
[M::worker_pipeline::398.158*41.56] mapped 41612 sequences
[M::worker_pipeline::419.895*41.48] mapped 41603 sequences
[M::worker_pipeline::442.267*41.81] mapped 41601 sequences
[M::worker_pipeline::468.907*42.44] mapped 41598 sequences
[M::worker_pipeline::504.165*41.78] mapped 41611 sequences
[M::worker_pipeline::530.671*42.27] mapped 41604 sequences
[M::worker_pipeline::553.943*43.12] mapped 41589 sequences
[M::worker_pipeline::583.634*43.36] mapped 41618 sequences
[M::worker_pipeline::616.610*43.01] mapped 41618 sequences
[M::worker_pipeline::642.092*43.44] mapped 41602 sequences
[M::worker_pipeline::665.499*44.15] mapped 41601 sequences
[M::worker_pipeline::692.943*44.35] mapped 41624 sequences
[M::worker_pipeline::718.479*44.64] mapped 41599 sequences
[M::worker_pipeline::745.373*44.27] mapped 41604 sequences
[M::worker_pipeline::746.044*44.23] mapped 1463 sequences
[M::main] Version: 1.0
[M::main] CMD: /genetics/elbers/Winnowmap/winnowmap -t 75 -H -W bad_Hk19_mers_peregrine.fasta.txt -ax map-pb peregrine.fasta random_reads_ccs.fastq.gz
[M::main] Real time: 746.350 sec; CPU: 33001.520 sec; Peak RSS: 15.715 GB

real    12m33.958s
user    538m8.556s
sys 11m53.684s

I'll run QUAST-LG and post the differences in another post once that finishes running.

jelber2 commented 4 years ago
# minimap2 2.17-r941
# winnowmap 1.0 (Github commit d547331 or ee6bca7 - indicated below)
# samtools 1.9
# racon 1.4.11 (Github commit 6ca733a)
# quast 5.0.2 (conda install [Name:quast Version:5.0.2 Build:py36pl526ha92aebf_0 Channel:bioconda])

time minimap2 -t 75 -H -ax asm20 peregrine.fasta random_reads_ccs.fastq.gz > random_reads_ccs.fastq-against-peregrine.fasta.sam
samtools view -@75 -F 1796 -q 20 random_reads_ccs.fastq-against-peregrine.fasta.sam > random_reads_ccs.fastq-against-peregrine.fasta.filtered.sam
racon -u -t 75 random_reads_ccs.fastq.gz random_reads_ccs.fastq-against-peregrine.fasta.filtered.sam peregrine.fasta 2> peregrine.fasta_racon_polished.fasta.log > peregrine.fasta_racon_polished_minimap2_asm20.alignments.fasta

time /genetics/elbers/Winnowmap/winnowmap -t 75 -H -W bad_Hk19_mers_peregrine.fasta.txt -ax asm20 peregrine.fasta random_reads_ccs.fastq.gz > random_reads_ccs.fastq-against-peregrine.fasta.sam
samtools view -@75 -F 1796 -q 20 random_reads_ccs.fastq-against-peregrine.fasta.sam > random_reads_ccs.fastq-against-peregrine.fasta.filtered.sam
racon -u -t 75 random_reads_ccs.fastq.gz random_reads_ccs.fastq-against-peregrine.fasta.filtered.sam peregrine.fasta 2> peregrine.fasta_racon_polished.fasta.log > peregrine.fasta_racon_polished_winnowmap_asm20.alignments.fasta

time minimap2 -t 75 -H -ax map-pb peregrine.fasta random_reads_ccs.fastq.gz > random_reads_ccs.fastq-against-peregrine.fasta.sam
samtools view -@75 -F 1796 -q 20 random_reads_ccs.fastq-against-peregrine.fasta.sam > random_reads_ccs.fastq-against-peregrine.fasta.filtered.sam
racon -u -t 75 random_reads_ccs.fastq.gz random_reads_ccs.fastq-against-peregrine.fasta.filtered.sam peregrine.fasta 2> peregrine.fasta_racon_polished.fasta.log > peregrine.fasta_racon_polished_minimap2_map-pb.alignments.fasta

time /genetics/elbers/Winnowmap/winnowmap -t 75 -H -W bad_Hk19_mers_peregrine.fasta.txt -ax map-pb peregrine.fasta random_reads_ccs.fastq.gz > random_reads_ccs.fastq-against-peregrine.fasta.sam
samtools view -@75 -F 1796 -q 20 random_reads_ccs.fastq-against-peregrine.fasta.sam > random_reads_ccs.fastq-against-peregrine.fasta.filtered.sam
racon -u -t 75 random_reads_ccs.fastq.gz random_reads_ccs.fastq-against-peregrine.fasta.filtered.sam peregrine.fasta 2> peregrine.fasta_racon_polished.fasta.log > peregrine.fasta_racon_polished_winnowmap_map-pb.alignments.fasta

quast.py --fragmented --fast --large --threads 75 -r curated.fasta --eukaryote -o ./quast_ccs peregrine.fasta peregrine.fasta_racon_polished_winnowmap_*.alignments.fasta peregrine.fasta_racon_polished_minimap2_*.alignments.fasta > ./quast_ccs.log 2>&1

Note the purpose of this "test" was to see how alignment with Minimap2 or Winnowmap influenced polishing with RACON based on substitution and indel errors compared to the reference with QUAST-LG

#QUAST-LG results
Assembly                                                            # contigs (>= 0 bp)  # contigs (>= 1000 bp)  # contigs (>= 5000 bp)  # contigs (>= 10000 bp)  # contigs (>= 25000 bp)  # contigs (>= 50000 bp)  Total length (>= 0 bp)  Total length (>= 1000 bp)  Total length (>= 5000 bp)  Total length (>= 10000 bp)  Total length (>= 25000 bp)  Total length (>= 50000 bp)  # contigs  Largest contig  Total length  Reference length  Reference GC (%)  N50     NG50    N75     NG75    L50  LG50  L75  LG75  # misassemblies  # misassembled contigs  Misassembled contigs length  # local misassemblies  # scaffold gap ext. mis.  # scaffold gap loc. mis.  # possible TEs  # unaligned mis. contigs  # unaligned contigs  Unaligned length  Genome fraction (%)  Duplication ratio  # N's per 100 kbp  # mismatches per 100 kbp  # indels per 100 kbp  Largest alignment  Total aligned length  NA50    NGA50   NA75    NGA75   LA50  LGA50  LA75  LGA75
peregrine.fasta                                                     1100                 1100                    1100                    1100                     970                      839                      441579675               441579675                  441579675                  441579675                   439306196                   434374226                   1100       5873009         441579675     445568504         27.74             822917  817504  443020  432130  145  148   324  331   206              164                     114872736                    104                    0                         0                         0               4                         3 + 98 part          284887            98.681               1.004              0.00               8.30                      8.60                  5462309            440705701             741321  737582  383244  379607  166   169    367   375
peregrine.fasta_racon_polished_minimap2_asm20.alignments.fasta      1100                 1100                    1100                    1100                     971                      839                      441602853               441602853                  441602853                  441602853                   439352792                   434395087                   1100       5873659         441602853     445568504         27.74             822925  817547  443009  432131  145  148   324  331   205              163                     117139196                    73                     0                         0                         0               2                         2 + 41 part          135249            98.727               1.004              0.00               3.52                      5.97                  5462789            440969958             740285  734066  383626  379627  167   169    368   376
peregrine.fasta_racon_polished_winnowmap_asm20.alignments.fasta     1100                 1100                    1100                    1100                     971                      839                      441684119               441684119                  441684119                  441684119                   439428392                   434468379                   1100       5873685         441684119     445568504         27.74             822925  817549  443009  432346  145  148   324  331   201              160                     113316841                    100                    0                         0                         2               5                         4 + 59 part          277256            98.714               1.004              0.00               4.06                      6.50                  5463573            440896765             740285  734066  383626  379641  167   169    368   376
peregrine.fasta_racon_polished_winnowmap_ee6bca7_asm20.alignments   1100                 1100                    1100                    1100                     971                      839                      441602294               441602294                  441602294                  441602294                   439351947                   434393955                   1100       5873683         441602294     445568504         27.74             822925  817547  442995  432131  145  148   324  331   203              161                     113250940                    65                     0                         0                         0               6                         2 + 40 part          186994            98.722               1.004              0.00               3.46                      5.80                  5462763            440925028             740288  734066  383633  379630  167   169    368   376
peregrine.fasta_racon_polished_minimap2_map_pb.alignments.fasta     1100                 1100                    1100                    1100                     971                      839                      441604293               441604293                  441604293                  441604293                   439354082                   434396590                   1100       5873655         441604293     445568504         27.74             822925  817546  442986  432137  145  148   324  331   204              162                     116822205                    82                     0                         0                         0               3                         2 + 44 part          158859            98.722               1.004              0.00               3.57                      6.02                  5462788            440923550             740293  734066  383625  379628  167   169    368   376
peregrine.fasta_racon_polished_winnowmap_map_pb.alignments.fasta    1100                 1100                    1100                    1100                     971                      839                      441606529               441606529                  441606529                  441606529                   439355783                   434397824                   1100       5873669         441606529     445568504         27.74             822925  817546  442986  432137  145  148   324  331   202              161                     114327891                    80                     0                         0                         0               8                         3 + 48 part          243962            98.720               1.003              0.00               3.58                      5.84                  5462776            440852929             740296  734066  383625  379633  167   169    368   376
peregrine.fasta_racon_polished_winnowmap_ee6bca7_map_pb.alignments  1100                 1100                    1100                    1100                     971                      839                      441606475               441606475                  441606475                  441606475                   439354893                   434396979                   1100       5873665         441606475     445568504         27.74             822925  817546  442986  432137  145  148   324  331   201              159                     113224872                    74                     0                         0                         0               5                         4 + 47 part          226386            98.719               1.003              0.00               3.53                      5.83                  5462623            440875557             740296  734066  383625  379633  167   169    368   376

Okay, so the results are actually quite similar. Regardless of the mapper (minimap2 or winnowmap) or preset (asm20 or map-pb), each combination of mapper and preset seems to have its pros and cons. It will be interesting to see the results again once the asm20 setting for winnowmap is configured properly.

EDIT: So with ee6bca7 (table updated above), winnowmap with asm20 option seems to be better in terms of indels and mismatches per 100 kbp.

cjain7 commented 4 years ago

Thanks for sharing these results.. I expect Winnowmap to deliver better mappings if there are long repetitive regions or low-complexity regions in the reference. Otherwise, both should give similar results.

jelber2 commented 4 years ago

You are welcome. I get similar results in terms of "quality values" (#mismatches per 100 kbp converted to Phred scores) when I use yak with simulated short-reads on the original reference as the k-mer database against the different assemblies I used with the QUAST-LG analysis.

jelber2 commented 4 years ago

Have the asm20 asm10 or asm5 settings been enabled/optimized with the new commits?

cjain7 commented 4 years ago

I've made a few changes, and enabled settings for asm* presets.

But for CCS reads, I recommend you to stick with map-pb preset. It uses homopolymer compressed k-mers (k=19). See PacBio / HiFi settings listed in the updated README file.

Please give it a try at your end. Again, I don't expect performance diffs unless the ref has long complex repeats.

jelber2 commented 4 years ago

I updated https://github.com/marbl/Winnowmap/issues/1#issuecomment-594388270 to include updated results from commit ee6bca7

jelber2 commented 4 years ago

Results from yak when I use yak with simulated short-reads from the original reference using to simulate the CCS/HiFi reads

/genetics/elbers/yak/yak count -b37 -t75 -o sr.yak \
<(bgzip -@75 -cd small_insert_trim1.fq.gz) \
<(bgzip -@75 -cd small_insert_trim2.fq.gz) > yak.log 2>&1 &

for ref in `ls -1 *.fasta.gz`;do
/genetics/elbers/yak/yak qv -t75 sr.yak ${ref} > ${ref}-sr.qv.txt &
done

## explanation of values shown in tail command below
==> assembly-sr.qv.txt <==
QV  raw_quality_value   adjusted_quality_value

tail -n 1 *-sr.qv.txt

==> peregrine.fasta.gz-sr.qv.txt <==
QV  43.759  43.806

==> peregrine.fasta_racon_polished_minimap2_asm20.alignments.fasta.gz-sr.qv.txt <==
QV  43.901  43.936

==> peregrine.fasta_racon_polished_minimap2_map-pb.alignments.fasta.gz-sr.qv.txt <==
QV  43.812  43.848

==> peregrine.fasta_racon_polished_winnowmap_asm20.alignments.fasta.gz-sr.qv.txt <==
QV  43.497  43.543

==> peregrine.fasta_racon_polished_winnowmap-ee6bca7_asm20.alignments.fasta.gz-sr.qv.txt <==
QV  43.827  43.867

==> peregrine.fasta_racon_polished_winnowmap-ee6bca7_map-pb.alignments.fasta.gz-sr.qv.txt <==
QV  43.698  43.740

==> peregrine.fasta_racon_polished_winnowmap_map-pb.alignments.fasta.gz-sr.qv.txt <==
QV  43.701  43.747

Remember that the QUAST error rates are based on alignment whilst the yak method is alignment-free/k-mer based.

I must say BRAVO!!!