rhysnewell / aviary

A hybrid assembly and MAG recovery pipeline (and more!)
GNU General Public License v3.0

independent singlem runs #181

Closed wwood closed 4 months ago

wwood commented 7 months ago

Hey,

SingleM currently seems to have an issue where, if it runs out of RAM at a particular place, it hangs rather than dying properly (yay, Python multithreading...).

In aviary, there's a single singlem rule which analyses all input datasets at once, though this can easily be parallelised by running each dataset individually and then combining the OTU tables with singlem summarise. So I propose doing that, but wasn't sure if there was a template to work from in terms of the snakemake rules.
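A minimal sketch of the per-dataset approach, assuming each dataset is one or two read files. The function name `singlem_commands`, the `samples` mapping and the output paths are hypothetical; the flags shown (`--forward`, `--reverse`, `--otu-table`, `--input-otu-tables`, `--output-otu-table`) should be checked against the installed SingleM version.

```python
def singlem_commands(samples, out_dir="singlem"):
    """Build one `singlem pipe` command per dataset plus a final `singlem summarise`.

    `samples` maps a (hypothetical) dataset name to a list of one or two read files.
    Returns (list of pipe commands, summarise command), each as an argv list.
    """
    pipe_cmds = []
    otu_tables = []
    for name, reads in samples.items():
        table = f"{out_dir}/{name}.otu_table.tsv"
        cmd = ["singlem", "pipe", "--forward", reads[0]]
        if len(reads) == 2:
            # Paired short reads are analysed together.
            cmd += ["--reverse", reads[1]]
        cmd += ["--otu-table", table]
        pipe_cmds.append(cmd)
        otu_tables.append(table)

    # Combine the per-dataset OTU tables into one.
    summarise_cmd = [
        "singlem", "summarise",
        "--input-otu-tables", *otu_tables,
        "--output-otu-table", f"{out_dir}/combined.otu_table.tsv",
    ]
    return pipe_cmds, summarise_cmd
```

Because each `pipe` command is independent, a hang or out-of-memory failure in one dataset would not block the others, and only the failed dataset would need rerunning.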

The current script uses these as input:

    long_reads = snakemake.config['long_reads']
    short_reads_1 = snakemake.config['short_reads_1']
    short_reads_2 = snakemake.config['short_reads_2']

And we'd want to run short reads in pairs where possible, and long reads separately. Is there some way of converting those configs into wildcards so the singlem pipe commands can be run as independent snakemake rules?
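One possible way to do that conversion, sketched in plain Python so it could sit at the top of a Snakefile. The helper names (`as_list`, `build_units`), the `short_N`/`long_N` naming scheme and the handling of a `"none"` placeholder are all assumptions, not aviary's actual behaviour.

```python
def as_list(value):
    """Normalise a config entry to a list (assumes 'none' or None means no input)."""
    if value is None or value == "none":
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def build_units(config):
    """Map a wildcard-friendly unit name to the read file(s) for that unit."""
    units = {}
    forward = as_list(config.get("short_reads_1"))
    reverse = as_list(config.get("short_reads_2"))
    for i, fwd in enumerate(forward):
        reads = [fwd]
        if i < len(reverse):
            # Pair forward and reverse reads by position where a mate exists.
            reads.append(reverse[i])
        units[f"short_{i}"] = reads
    for i, long_read in enumerate(as_list(config.get("long_reads"))):
        # Long reads are always run on their own.
        units[f"long_{i}"] = [long_read]
    return units


# In a Snakefile this could then be used roughly as (hypothetical paths):
#   UNITS = build_units(config)
#   per-unit rule input:  lambda wildcards: UNITS[wildcards.unit]
#   combining rule input: expand("singlem/{unit}.otu_table.tsv", unit=UNITS)
```

The unit names become the values of a `{unit}` wildcard, so Snakemake can schedule one job per unit and a final combining rule that depends on all of them.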

ta