bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Move the extend algorithm into the C++ extension #178

Closed: johnlees closed this 3 years ago

johnlees commented 3 years ago

This moves LineageFit.extend into the poppunk_refine extension. The algorithm is similar to the one used by sparsifyDists in pp_sketchlib, but works specifically with a combination of sparse and dense matrices. The highest rank is computed first and then lowered to obtain the smaller ranks. Parallelisation is supported.
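
For readers unfamiliar with the idea, here is a minimal Python sketch of the approach described above. It is not the code in poppunk_refine. The names `extend_sparse` and `lower_rank`, the list-of-tuples row layout, and the omission of query-query distances are all assumptions made for illustration. Existing reference rows (sparse) are merged with a dense query-vs-reference matrix at the highest rank, and smaller ranks are then obtained by truncating each sorted row.

```python
import heapq


def extend_sparse(ref_rows, query_ref, k_max):
    """Merge sparse reference rows with dense query-vs-reference distances.

    ref_rows[r]     : list of (distance, neighbour_index) already kept for reference r
    query_ref[q][r] : dense distance from query q to reference r
    Queries are given indices n_ref + q. Query-query distances are ignored
    here (an assumption made to keep the sketch short).
    Returns one sorted row of at most k_max neighbours per sample.
    """
    n_ref = len(ref_rows)
    n_query = len(query_ref)
    extended = []

    # Reference rows: the old sparse neighbours compete with every query.
    for r in range(n_ref):
        candidates = list(ref_rows[r])
        candidates += [(query_ref[q][r], n_ref + q) for q in range(n_query)]
        extended.append(heapq.nsmallest(k_max, candidates))

    # Query rows: all candidates come from the dense matrix.
    for q in range(n_query):
        candidates = [(query_ref[q][r], r) for r in range(n_ref)]
        extended.append(heapq.nsmallest(k_max, candidates))

    return extended


def lower_rank(rows, k):
    """Derive a smaller rank by truncating rows that are sorted by distance."""
    return [row[:k] for row in rows]


# Toy data: three references with two neighbours each, plus one query.
ref_rows = [
    [(0.1, 1), (0.4, 2)],
    [(0.1, 0), (0.3, 2)],
    [(0.3, 1), (0.4, 0)],
]
query_ref = [[0.2, 0.5, 0.05]]

rank2 = extend_sparse(ref_rows, query_ref, k_max=2)
rank1 = lower_rank(rank2, 1)
```

Because each row is already sorted by distance, the rank-1 result is a prefix of the rank-2 result, so the expensive merge only needs to run once, at the highest rank. Each row is also independent of the others, which is what makes the loop straightforward to parallelise.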

johnlees commented 3 years ago

Timing of extend on 10k ref vs 10k query: 1 thread: 19s; 2 threads: 13s; 4 threads: 8s; 8 threads: 4s.
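
To put the scaling in perspective, the short Python snippet below derives speedup and parallel efficiency from the timings reported in this comment. The timings are the ones quoted above; the helper name `scaling` is only illustrative.

```python
# Wall-clock seconds reported in this thread, keyed by thread count.
timings = {1: 19, 2: 13, 4: 8, 8: 4}


def scaling(timings):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run."""
    baseline = timings[1]
    result = {}
    for threads, seconds in sorted(timings.items()):
        speedup = baseline / seconds
        result[threads] = (speedup, speedup / threads)
    return result


for threads, (speedup, efficiency) in scaling(timings).items():
    print(f"{threads} thread(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

On these numbers, 8 threads give a speedup of 19/4 = 4.75 over a single thread.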

johnlees commented 3 years ago

> Only one small comment - might be slightly clearer to have new_val be true when there is a substantial increment in the distance, but I appreciate it is of limited importance to the mechanics of the functions where it appears.

This is a good point. Although the code works correctly, I've already confused myself once because the name suggests the opposite of what it actually means.
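
As a hypothetical illustration of the suggested naming (not the extension's actual code), the Python sketch below uses a flag that is true exactly when the distance increases by more than a tolerance. The function name `keep_k_distinct`, the flag name `is_new_distance`, the `epsilon` default, and the assumption that tied distances share a rank are all mine and are not stated in this thread.

```python
def keep_k_distinct(sorted_dists, k, epsilon=1e-10):
    """Keep distances until k distinct values have been seen.

    sorted_dists must be in ascending order. Distances within epsilon of
    the previous distinct value are treated as ties and kept without
    using up a rank (an assumption for this sketch).
    """
    kept = []
    ranks_seen = 0
    prev = None
    for dist in sorted_dists:
        # True only when the distance increases substantially.
        is_new_distance = prev is None or (dist - prev) > epsilon
        if is_new_distance:
            if ranks_seen == k:
                break  # a (k+1)-th distinct distance would exceed the rank
            ranks_seen += 1
            prev = dist
        kept.append(dist)
    return kept


kept = keep_k_distinct([0.1, 0.1, 0.2, 0.3], k=2)
```

With the flag named for the condition it tests, the branch reads the same way as the prose description: a new distance consumes a rank, a tie does not.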