dib-lab / sourmash-slainte

Project template for sourmash-based characterization of genomes and metagenomes
BSD 3-Clause "New" or "Revised" License

EXP: switch to using calc-full-gather.py #18

Open ctb opened 7 months ago

ctb commented 7 months ago

This PR switches slainte over to using calc-full-gather.py from https://github.com/ctb/2024-calc-full-gather (see also https://github.com/sourmash-bio/sourmash_plugin_branchwater/issues/187). Instead of running a whole new gather with a picklist, calc-full-gather.py calculates the gather columns starting from the fastgather output.
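
To illustrate the idea of deriving gather columns from an already-ranked match list, here is a minimal sketch. It is not the code in calc-full-gather.py: the function name `recompute_gather_columns`, the plain Python sets standing in for sketches, and the toy data are all assumptions made for illustration, and abundance weighting is ignored. Only the column names `f_orig_query` and `f_unique_to_query` are taken from sourmash gather output.

```python
# Hypothetical sketch: recompute two gather-style columns from a ranked
# match list, without re-running the search. Plain sets of hashes stand
# in for real sketches; this is NOT the calc-full-gather.py implementation.

def recompute_gather_columns(query_hashes, ranked_matches):
    """Walk matches in their existing rank order and compute overlaps.

    query_hashes: set of hash values for the query.
    ranked_matches: list of (name, set_of_hashes), already ordered
        (e.g. as reported by a previous fastgather run).
    """
    total = len(query_hashes)
    remaining = set(query_hashes)  # hashes not yet claimed by a match
    rows = []
    for rank, (name, match_hashes) in enumerate(ranked_matches):
        overlap_orig = query_hashes & match_hashes   # vs. original query
        overlap_unique = remaining & match_hashes    # vs. unclaimed hashes
        rows.append({
            "rank": rank,
            "name": name,
            "f_orig_query": len(overlap_orig) / total,
            "f_unique_to_query": len(overlap_unique) / total,
        })
        remaining -= match_hashes  # later matches cannot reuse these hashes
    return rows


# Toy example with made-up hashes.
query = set(range(1, 11))
matches = [("A", {1, 2, 3, 4, 5, 99}), ("B", {4, 5, 6, 7})]
for row in recompute_gather_columns(query, matches):
    print(row)
```

Because the rank order is taken as given, each match needs only one pass of set arithmetic, which is the intuition for why this approach can be cheaper than a second full gather.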

This uses less memory and runs faster, per https://github.com/sourmash-bio/sourmash/issues/2950. The difference is especially noticeable for large nasty rumen samples, ugh.

Before this can be merged, calc-full-gather needs to be fixed to work with multiple databases, and perhaps other things as well.

ctb commented 7 months ago

This PR also triggered https://github.com/sourmash-bio/sourmash/pull/2952 :)