wwood / galah

More scalable dereplication for metagenome assembled genomes
GNU General Public License v3.0
48 stars 11 forks

skani is not parallelized in `galah cluster` #36

Closed apcamargo closed 1 year ago

apcamargo commented 1 year ago

I'm trying out the latest Galah commit and I noticed that it is using a single thread to cluster genomes, even though I set `--threads 48`. Is this by design?

wwood commented 1 year ago

Hi,

Good point - and a reminder to fix this - @AroneyS reported this offline a few weeks back.

I have implemented a fix, but it could do with some more real-world testing. Does it work for you? Thanks.

wwood commented 1 year ago

Looks like it might be a bit slow at the moment because it repeatedly recalculates sketches.
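
To illustrate the point about recalculated sketches, here is a minimal Python sketch of the general idea: compute each genome's sketch once, keep it in a lookup table, and reuse it for every pairwise comparison. This is not Galah's or skani's code. The `sketch` function is a toy stand-in (a plain set of k-mers), and `pairwise_similarity`, `SKETCH_LOG` and `K` are hypothetical names made up for this example. The thread pool only shows the structure of the work; in CPython it will not give a real CPU speedup for code like this.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

K = 4            # toy k-mer length (assumption, not a skani parameter)
SKETCH_LOG = []  # records every sketch computation so reuse can be checked


def sketch(item):
    """Toy stand-in for a genome sketch: the set of k-mers in the sequence."""
    name, seq = item
    SKETCH_LOG.append(name)  # list.append is thread-safe in CPython
    return name, frozenset(seq[i:i + K] for i in range(len(seq) - K + 1))


def jaccard(a, b):
    """Jaccard similarity between two k-mer sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(genomes, threads=4):
    """Sketch every genome exactly once, then compare all pairs.

    `genomes` maps a genome name to its sequence string.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Step 1: one sketch per genome, stored for later reuse.
        sketches = dict(pool.map(sketch, genomes.items()))

        # Step 2: each comparison looks sketches up instead of rebuilding them.
        pairs = list(combinations(genomes, 2))
        sims = pool.map(
            lambda pair: jaccard(sketches[pair[0]], sketches[pair[1]]), pairs
        )
        return dict(zip(pairs, sims))
```

With `n` genomes this does `n` sketch computations rather than two per comparison, which is the kind of saving the comment above is hinting at. Whether the actual fix in Galah takes this shape is not stated in the thread.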