Comparison against an overlap alignment ground truth (usearch -global)?

Hi Graphmap2 team,

Although both graphmap2 and Minimap2 are widely used for overlap detection, I have not seen a benchmark against a ground truth, only against other approximate tools (e.g., the most recent one, loadFast, was compared with Minimap2 and graphmap). By ground truth I mean that overlap detection is essentially semi-global alignment, as implemented in usearch/vsearch, for example (open sourced recently). My understanding is that in usearch/vsearch, gaps extended at both ends are also penalized, but with a much smaller penalty, so that the best alignment can still be found when searching a database (an all-versus-all comparison without heuristics can be chosen).

My question is about overlap detection. Assume a long read has many matches in the same FASTA file, with semi-global identity ranging from 80% to 100%, and that the alignment ratio (overlap length divided by the length of this read) also varies (e.g., >50%). If the best overlap hits found by graphmap2/minimap2 are ranked by identity to this read, is that ranking consistent with the one from usearch/vsearch, which I believe are the standards for semi-global alignment? What are the Pearson r and Spearman rank rho across many such long reads in a sample (say, a natural metagenomic long-read sequencing experiment with 100 million long reads)?
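To make the comparison I have in mind concrete, here is a minimal sketch of the per-read statistic, not a finished benchmark. It assumes the hits for one query read have already been parsed into two dictionaries mapping target read ID to percent identity, one from the heuristic tool and one from usearch/vsearch. The function names, the read IDs, and the numbers in the example are hypothetical, and parsing of the tools' output is left out. Note that the tools may define identity differently, so the values would need to be put on a common definition first.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def compare_hits(heuristic, reference, min_shared=3):
    """Compare identities for targets reported by both tools for one read.

    Both arguments are hypothetical dicts: target read ID -> percent identity.
    Returns (pearson_r, spearman_rho, n_shared), or None if too few shared hits.
    """
    shared = sorted(set(heuristic) & set(reference))
    if len(shared) < min_shared:
        return None
    h = [heuristic[t] for t in shared]
    r = [reference[t] for t in shared]
    return pearson(h, r), spearman(h, r), len(shared)


if __name__ == "__main__":
    # Made-up identities for one query read, for illustration only.
    heuristic_hits = {"read_a": 97.1, "read_b": 91.4, "read_c": 88.0, "read_d": 84.2}
    reference_hits = {"read_a": 98.0, "read_b": 90.2, "read_c": 89.5, "read_e": 81.0}
    print(compare_hits(heuristic_hits, reference_hits))
```

Only targets reported by both tools enter the correlation, so hits that the heuristic tool misses entirely would have to be counted separately (for example as recall against the usearch/vsearch hit list). The per-read values could then be summarized over all reads in the sample.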

Thanks,

Jianshu