bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Parallelise assignment #154

Closed (johnlees closed this issue 3 years ago)

johnlees commented 3 years ago

With large DBs (>10k samples), the assignment step becomes very slow for the BGMM and DBSCAN models. It would make sense to hold the dist matrix in a shared memory (shm) object and run the assignment on blocks of it in parallel.
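A minimal sketch of the idea, not PopPUNK's actual implementation. It assumes the distance matrix is a 2D float array with one row per comparison, and that the `fork` start method is available (Unix-like platforms). `toy_assign`, `_worker` and `assign_parallel` are hypothetical names; `toy_assign` is a placeholder for whatever a fitted BGMM or DBSCAN model does to label one block of rows.

```python
import multiprocessing as mp
from multiprocessing import shared_memory

import numpy as np


def toy_assign(block):
    """Hypothetical stand-in for a fitted model's per-row assignment."""
    # Label a row 1 when the sum of its distances is below 0.5, else 0.
    return (block.sum(axis=1) < 0.5).astype(np.int64)


def _worker(dist_name, out_name, shape, dtype, start, end):
    # Attach to the existing shared blocks by name; the matrix is not copied.
    dist_shm = shared_memory.SharedMemory(name=dist_name)
    out_shm = shared_memory.SharedMemory(name=out_name)
    dists = np.ndarray(shape, dtype=dtype, buffer=dist_shm.buf)
    out = np.ndarray((shape[0],), dtype=np.int64, buffer=out_shm.buf)
    # Each worker writes only its own slice, so no locking is needed.
    out[start:end] = toy_assign(dists[start:end])
    del dists, out  # release the buffer views before closing
    dist_shm.close()
    out_shm.close()


def assign_parallel(dists, n_blocks=4):
    """Assign every row of `dists` using one process per block of rows."""
    dists = np.ascontiguousarray(dists, dtype=np.float64)
    n_rows = dists.shape[0]
    out_bytes = n_rows * np.dtype(np.int64).itemsize
    dist_shm = shared_memory.SharedMemory(create=True, size=dists.nbytes)
    out_shm = shared_memory.SharedMemory(create=True, size=out_bytes)
    shared = out = None
    try:
        # Copy the matrix into shared memory once, up front.
        shared = np.ndarray(dists.shape, dtype=dists.dtype, buffer=dist_shm.buf)
        shared[:] = dists
        ctx = mp.get_context("fork")  # assumption: Unix-like platform
        bounds = np.linspace(0, n_rows, n_blocks + 1, dtype=int)
        procs = [
            ctx.Process(
                target=_worker,
                args=(dist_shm.name, out_shm.name, dists.shape,
                      dists.dtype, int(s), int(e)),
            )
            for s, e in zip(bounds[:-1], bounds[1:])
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        if any(p.exitcode != 0 for p in procs):
            raise RuntimeError("an assignment worker failed")
        out = np.ndarray((n_rows,), dtype=np.int64, buffer=out_shm.buf)
        labels = out.copy()  # copy out before the shared block is freed
    finally:
        del shared, out
        for shm in (dist_shm, out_shm):
            shm.close()
            shm.unlink()
    return labels
```

Workers receive only the shared memory names and row bounds, so the cost of starting a block does not grow with the size of the matrix. A real version would also need the fitted model to be available in each worker.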