NVIDIA-Genomics-Research / GenomeWorks

SDK for GPU accelerated genome assembly and analysis
https://clara-parabricks.github.io/GenomeWorks/
Apache License 2.0
286 stars, 76 forks

[cudapoa] improve estimation of maximum graph length #442

Closed r-mafi closed 3 years ago

r-mafi commented 4 years ago

The length of a POA graph depends on how much the sequences in a window differ from one another. Because this difference can grow as the number of sequences in a window increases, we can modify the heuristic that estimates the maximum graph length in the BatchSize constructor to take the number of sequences into account. A fixed formula for the maximum graph length can be wasteful in some cases and insufficient in others.

r-mafi commented 3 years ago

Since this size depends on the input (the differences between sequences as well as the number of sequences), it is not possible to find a heuristic that works in all cases. However, the maximum graph length can now be controlled through the BatchConfig() constructor, so it is no longer a fixed formula.
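
For illustration only, here is a minimal sketch of how a caller might compute its own maximum graph length before configuring a batch. The function `estimate_max_graph_nodes` and its `divergence` parameter are hypothetical and not part of GenomeWorks; the model (each additional sequence adds at most a fixed fraction of its length as new nodes) is an assumption, not the heuristic used in cudapoa.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper, NOT part of the GenomeWorks API.
// Estimates an upper bound on the number of nodes in a POA graph for one window.
//   max_sequence_length : length of the longest sequence in the window
//   num_sequences       : number of sequences in the window
//   divergence          : assumed fraction (0..1) of each additional sequence
//                         that does not align to existing nodes
inline std::int64_t estimate_max_graph_nodes(std::int64_t max_sequence_length,
                                             std::int64_t num_sequences,
                                             double divergence)
{
    // Clamp the assumed divergence to the valid range [0, 1].
    const double d = std::min(1.0, std::max(0.0, divergence));

    // The first sequence contributes all of its bases as nodes.
    const std::int64_t additional = std::max<std::int64_t>(num_sequences - 1, 0);

    // Each additional sequence contributes at most d * length new nodes.
    const double extra = d * static_cast<double>(max_sequence_length) *
                         static_cast<double>(additional);
    const std::int64_t estimate =
        max_sequence_length + static_cast<std::int64_t>(std::ceil(extra));

    // Hard ceiling: no base aligns, so every base becomes its own node.
    const std::int64_t worst_case =
        max_sequence_length * std::max<std::int64_t>(num_sequences, 1);

    return std::min(estimate, worst_case);
}
```

Under these assumptions, a window of 5 sequences of length 1000 with an assumed divergence of 0.25 gives an estimate of 2000 nodes, compared with 5000 in the worst case. The value would still need tuning per dataset, which is consistent with the point above that no single heuristic fits every input.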