wwood / galah

More scalable dereplication for metagenome assembled genomes
GNU General Public License v3.0

Skani preclusterer speed-up #37

Closed by AroneyS 12 months ago

AroneyS commented 1 year ago
wwood commented 1 year ago

Hey, so this is good to go, you reckon?

AroneyS commented 1 year ago

Preclustering the 36k genomes from before takes 15 minutes now (down from 6 hours). Yay!

But currently skani+skani will run skani twice. We just need to add a check for when preclusterer=clusterer. Also, the triangle implementation lives in the skani preclusterer, so it runs with the more permissive preclustering ANI cutoff. How should we handle that? If preclusterer=clusterer, pass the final ANI cutoff to the preclusterer and skip the clusterer?
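A minimal sketch of that check, just to make the proposal concrete. Every name here (`Method`, `Plan`, `plan`) is hypothetical and not taken from galah's code, and the 90/95 cutoffs are only illustrative values. The idea is that when both stages use the same method, the preclusterer gets the final ANI cutoff and the separate clustering pass is skipped.

```rust
// Hypothetical names throughout; this is not galah's actual API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Method {
    Skani,
    Finch,
}

#[derive(Debug, PartialEq)]
struct Plan {
    /// ANI cutoff handed to the preclusterer.
    precluster_ani: f32,
    /// Whether the separate clustering pass still needs to run.
    run_clusterer: bool,
}

/// Decide which cutoff the preclusterer gets and whether the clusterer runs.
fn plan(preclusterer: Method, clusterer: Method, precluster_ani: f32, final_ani: f32) -> Plan {
    if preclusterer == clusterer {
        // Same method for both stages: do the work once, at the final
        // (stricter) cutoff, and skip the second pass.
        Plan { precluster_ani: final_ani, run_clusterer: false }
    } else {
        // Different methods: keep the permissive precluster cutoff and
        // let the clusterer refine the result afterwards.
        Plan { precluster_ani, run_clusterer: true }
    }
}

fn main() {
    // Illustrative cutoffs only.
    println!("{:?}", plan(Method::Skani, Method::Skani, 90.0, 95.0));
    println!("{:?}", plan(Method::Finch, Method::Skani, 90.0, 95.0));
}
```

With skani+skani this would run skani once at the final cutoff; any mixed pairing keeps the existing two-stage behaviour.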

wwood commented 1 year ago

Great. That sounds reasonable.

-------------- Ben Woodcroft Group leader, Centre for Microbiome Research, QUT



wwood commented 12 months ago

ok if you are?

AroneyS commented 12 months ago

[like] Sam Aroney reacted to your message:


