
beelabhmc / ant_tracker

Track ant movement in a lab setting using ML

http://hmcbee.blogspot.com/


snakemake #8

Closed aryarm closed 5 years ago

aryarm commented 6 years ago

Once we have a working pipeline, it might be nice to convert it to a snakemake pipeline so that it can be run in parallel on a cluster machine.

aryarm commented 5 years ago

Before we do this, we should make sure to split the pipeline up into smaller steps, since snakemake likes to have control over every small step in the pipeline so that it can manage job control. This will probably require breaking up each of the python scripts.

Some of the python code simply calls terminal commands. So we might consider converting some of the python code to bash scripts, which will be easier to terminate and control from snakemake.
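As a rough illustration of what a "smaller step" could look like, here is a minimal sketch of a script that does one thing and takes its input and output paths explicitly, which is the shape a workflow manager can schedule and track. The function names and the filtering logic are hypothetical placeholders, not code from this repository.

```python
import argparse
from pathlib import Path


def run_step(input_path: Path, output_path: Path) -> None:
    """Hypothetical step: keep only the non-empty lines of the input file."""
    lines = [ln for ln in input_path.read_text().splitlines() if ln.strip()]
    output_path.write_text("\n".join(lines) + "\n")


def main(argv=None) -> int:
    """Parse one input path and one output path, then run the step."""
    parser = argparse.ArgumentParser(description="One small pipeline step.")
    parser.add_argument("input", type=Path, help="file produced by the previous step")
    parser.add_argument("output", type=Path, help="file this step must create")
    args = parser.parse_args(argv)
    run_step(args.input, args.output)
    return 0
```

Calling `main(["in.txt", "out.txt"])` runs the step once. Because each script reads one declared input and writes one declared output, a failed or cancelled step can be rerun on its own.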

JarredAllen commented 5 years ago

I think I created a snakemake pipeline for this with commit 085775ef6a470b7bc4b694ca541eaf9013916780. However, Snakemake requires using python3, so this is on hold until I can get the code onto a newer linux server.

aryarm commented 5 years ago

Nice! It looks good. You'll probably need a config file eventually, so you can fill the wildcards in the Snakefile (otherwise, snakemake won't know which wildcards to use when it calls your rules). I usually use the config file to specify the paths to inputs to the pipeline. Additionally, config variables can be overridden from the command line, so it's super easy to switch out inputs (or use a subset of them if you write some code in your Snakefile to do it). Here's an example Snakemake pipeline I've worked with, if that helps.
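To make the idea concrete, here is a plain-Python sketch of the two mechanisms described above: applying `key=value` overrides on top of config values, and filling a wildcard pattern with one value per input. It only mimics the behaviour for illustration and is not how snakemake implements it; the config keys, paths, and function names are all made up.

```python
from typing import Dict, List


def apply_overrides(config: Dict[str, str], overrides: List[str]) -> Dict[str, str]:
    """Return a copy of config with 'key=value' strings applied on top."""
    merged = dict(config)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        merged[key] = value
    return merged


def fill_wildcard(pattern: str, name: str, values: List[str]) -> List[str]:
    """Produce one concrete path per value for a single wildcard."""
    return [pattern.format(**{name: v}) for v in values]


# Hypothetical config: these keys and paths are assumptions for illustration.
config = {"video_dir": "input", "videos": "colony1,colony2"}
# Pretend the user overrode 'videos' on the command line to run a subset.
config = apply_overrides(config, ["videos=colony1"])

targets = fill_wildcard("output/{video}/tracks.csv", "video", config["videos"].split(","))
print(targets)  # ['output/colony1/tracks.csv']
```

In a real Snakefile the list of targets would typically be requested by a top-level rule, so the wildcards in the other rules get resolved from it.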

One nice thing about snakemake is that you can run separate parts of the pipeline in different conda environments (i.e., you can call the entire pipeline from a python3 environment but have it execute its rules in a python2 environment). I started trying to create an environment file in f73028b. Also see issue #6.

JarredAllen commented 5 years ago

The steps which are completed have been written into a Snakefile and the pipeline is working.

Future steps to do:

Extract the command-line options to a config file so that they can be changed more easily.
Automatically detect the number of ROIs in the Snakefile, instead of requiring the user to check.
Automate more steps (recombine tracks at the end, detect ROIs, etc.).
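For the ROI-counting item, one possible approach is sketched below. It assumes a hypothetical ROI file with one ROI per non-empty line and `#` for comments; the actual file format used by ant_tracker is not described in this thread, and the output path pattern is likewise invented.

```python
from pathlib import Path
from typing import List


def count_rois(roi_file: Path) -> int:
    """Count ROIs, assuming one ROI per non-empty, non-comment line."""
    with roi_file.open() as handle:
        return sum(
            1
            for line in handle
            if line.strip() and not line.lstrip().startswith("#")
        )


def roi_outputs(roi_file: Path, pattern: str = "tracks/roi_{index}.csv") -> List[str]:
    """List one (hypothetical) output path per detected ROI."""
    return [pattern.format(index=i) for i in range(count_rois(roi_file))]
```

A helper like this could feed the list of expected outputs to the pipeline, so the user no longer has to count the ROIs by hand.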

JarredAllen commented 5 years ago

The last two steps have been completed, so the only thing left to do with the pipeline is to extract the command-line options, and other things which may be tweaked, into a separate configuration file. In the current setup, one is a global constant defined in the pipeline and the others are forced to be the default values.

JarredAllen commented 5 years ago

I moved that last step into a new issue because it's a separate thing from making a pipeline.

The issue is here: https://github.com/beelabhmc/ant_tracker/issues/20