NVIDIA-Genomics-Research / GenomeWorks

SDK for GPU accelerated genome assembly and analysis
https://clara-parabricks.github.io/GenomeWorks/
Apache License 2.0

[cudamapper] Do not do self-mapping of reads #531

Open mimaric opened 4 years ago

mimaric commented 4 years ago

When doing all-vs-all mapping, do not look for anchors between a read and itself, i.e. anchors where both sides have the same read_id. Currently we do this for the sake of code simplicity, but it introduces additional overlaps, which probably affect accuracy. It definitely affects performance, because the additional anchors take up more space and take time to process.

There are two options:
1) Make Matcher not look for anchors with the same read_id.
2) If 1) turns out to be too complicated, simply skip matching an index against itself. This would cause all read pairs within that index to be skipped, not only the self-pairs, so option 1) is preferred.
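To illustrate the difference between the two options, here is a minimal host-side sketch. It is not the actual Matcher code: `ReadRange`, `pairs_skipping_self_reads` and `pairs_skipping_same_index` are hypothetical names, and an index is assumed to cover a contiguous, half-open range of read_ids. It only enumerates which read pairs each option would still consider.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not part of the cudamapper API.
using ReadId   = std::uint32_t;
using ReadPair = std::pair<ReadId, ReadId>;

// Assumption: an index covers the half-open read_id range [first, past_last).
struct ReadRange
{
    ReadId first;
    ReadId past_last;
};

// Option 1: consider every query/target pair except a read against itself.
std::vector<ReadPair> pairs_skipping_self_reads(const ReadRange& query, const ReadRange& target)
{
    std::vector<ReadPair> pairs;
    for (ReadId q = query.first; q < query.past_last; ++q)
    {
        for (ReadId t = target.first; t < target.past_last; ++t)
        {
            if (q != t) // only the self-pair is dropped
            {
                pairs.emplace_back(q, t);
            }
        }
    }
    return pairs;
}

// Option 2: skip the whole index pair when query and target are the same index.
std::vector<ReadPair> pairs_skipping_same_index(const ReadRange& query, const ReadRange& target)
{
    std::vector<ReadPair> pairs;
    if (query.first == target.first && query.past_last == target.past_last)
    {
        return pairs; // every pair inside this index is lost, not only self-pairs
    }
    for (ReadId q = query.first; q < query.past_last; ++q)
    {
        for (ReadId t = target.first; t < target.past_last; ++t)
        {
            pairs.emplace_back(q, t);
        }
    }
    return pairs;
}
```

For an index holding reads 0, 1 and 2 matched against itself, the first function keeps the six ordered pairs of distinct reads, while the second returns nothing. That is the loss option 2 would accept.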

mimaric commented 4 years ago

@tijyojwad @vellamike FYI