charlesreid1 opened this issue:
Proposal for work moving forward:
- In the documentation, we should separate ((the installation of taco)) from ((the installation of things needed to run the workflows that taco wraps)) - think of taco as a tool that is assumed to run separately from the compute node
- We need to develop a cloud/URL model for rules - use the taco-simple workflow for that, since it does not require any extras and can easily run taco and the compute task on the same node
- We need to make taco cluster-capable - the assembly/read filtering workflows are a good test for submitting jobs to clusters using taco (see the sketch after this list)
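To make the cluster-capable item concrete, here is a minimal sketch, assuming taco does nothing cluster-specific itself and simply passes Snakemake's own cluster submission flags through (the function and parameter names are made up for illustration, not taco's actual interface):

```python
import subprocess

def run_workflow(snakefile, configfile, cluster_cmd=None, jobs=1):
    """Hypothetical taco helper: assemble and run a snakemake command.

    If cluster_cmd is given (e.g. "sbatch" or "qsub"), snakemake itself
    handles submitting each rule as a cluster job; taco adds nothing
    cluster-specific beyond forwarding the flag.
    """
    cmd = ["snakemake", "--snakefile", snakefile,
           "--configfile", configfile, "--jobs", str(jobs)]
    if cluster_cmd:
        cmd += ["--cluster", cluster_cmd]
    subprocess.run(cmd, check=True)

# taco-simple: everything on one node
# run_workflow("Snakefile", "config.yaml")

# assembly/read filtering: submit rules to a SLURM cluster
# run_workflow("Snakefile", "config.yaml", cluster_cmd="sbatch", jobs=16)
```

The point of the sketch is that the taco-simple case and the cluster case differ only in which flags get forwarded.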
Conceptual point of clarification: taco is not intended to be the be-all, end-all workflow runner tool. It relies heavily on Snakemake's functionality. It is intended to do exactly what dahak-metagenomics/dahak does, which is run workflows with user-provided parameters, but with a cleaner, simpler user interface (and hopefully simpler Snakemake rules on the back end).
Tying in #6 (overlay model: workflows + metadata) here, since it seems relevant. This widens the scope toward making workflows "importable".
This is related to the cloud/URL model for rules, but it would shift the focus away from ((the user translates their workflows into rules for taco)) and toward ((taco does all the hard work of importing the already-written workflow)).
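As a rough sketch of that shift (the helper name and URL below are hypothetical, not an agreed-on design): taco fetches the already-written workflow by URL and hands it straight to Snakemake, so the user never translates anything into taco rules or keeps a local copy:

```python
import subprocess
import tempfile
import urllib.request

def run_remote_workflow(snakefile_url, configfile, extra_args=None):
    # Hypothetical helper: download an already-written Snakefile and let
    # snakemake run it as-is; the user only supplies their parameters.
    with tempfile.NamedTemporaryFile(suffix=".snakefile", delete=False) as tmp:
        tmp.write(urllib.request.urlopen(snakefile_url).read())
        local_snakefile = tmp.name
    cmd = ["snakemake", "--snakefile", local_snakefile, "--configfile", configfile]
    subprocess.run(cmd + (extra_args or []), check=True)

# Hypothetical usage - the URL points at a workflow someone else already wrote:
# run_remote_workflow("https://example.com/workflows/assembly/Snakefile", "params.yaml")
```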
We're thinking through where we are running taco, what workflows it submits, and where those workflows run.
Starting with the documentation's first page, installation, we give instructions for installing Snakemake and Singularity on the same node that will run taco. This seems to limit us to a single-node model. What if we are submitting jobs to clusters?
We have to think about taco as just a thin wrapper around Snakemake, so whatever model we're currently using for Snakemake, we use for taco. The answer to the question "where does taco run?" is the same as the answer to the question "where does Snakemake run?"
Therefore we want to use the following abstraction: taco runs wherever Snakemake runs, and taco's only job is to assemble the Snakemake invocation from the user's parameters and hand it off.
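A minimal sketch of that abstraction (hypothetical code, not taco's actual entry point): taco translates its simpler interface into a Snakemake command line and then becomes that Snakemake process, so by construction it runs wherever Snakemake runs:

```python
import os
import sys

def translate(taco_args):
    # Hypothetical, deliberately trivial translation step: whatever simpler
    # interface taco exposes, the output is just a snakemake command line.
    return taco_args

def main():
    # taco never decides where the work runs; it assembles a snakemake
    # command and replaces itself with it, so "where does taco run?" has
    # the same answer as "where does snakemake run?"
    os.execvp("snakemake", ["snakemake"] + translate(sys.argv[1:]))

if __name__ == "__main__":
    main()
```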
We also have to think about #8 (using a cloud/URL model for getting the taco rules/workflow instructions), which is intended to remove the need for the user to have a local copy of the workflow they want to run.