kbseah closed 8 months ago
Rules for inclusion in the Snakemake workflow catalog: https://snakemake.github.io/snakemake-workflow-catalog/?rules=true
Observed with Snakemake 8:
Environment variables are dumped in a "wall of text". It is unclear what triggers this, but it seems to be what is reported here: https://github.com/snakemake/snakemake/issues/2624
Snakemake recommended folder structure: https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html#distribution-and-reproducibility
I think that the workflow (rules etc.) should be in a dedicated subfolder. The pipeline should simply be forked and modified for each new dataset (alternatively, it could be uploaded to WorkflowHub in the future).
Detailed explanation:
Current usage scenario of the pipeline: User clones a single copy of the workflow, and uses the same workflow to analyze multiple datasets by writing individual config files for each. The input and output paths for each dataset are specified in the config files and are independent of the workflow, i.e. the input/output folders are not necessarily subfolders of the workflow. In order to accommodate this usage pattern, the workdir is manually specified and also not necessarily the path at which the Snakemake command is run.
The original motivation AFAIK was:
However, workdir is specified separately from the actual path where Snakemake is run, which in turn may be different from the pipeline workflow path. Snakemake implicitly creates a hidden .snakemake folder in the folder where Snakemake is invoked (for logs of the Snakemake runs), but it also creates a hidden .snakemake folder in the manually specified workdir for the Conda envs and rule-specific logs.
workdir and Snakefile paths are specified. (Bug disappears with Snakemake v7+).
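To make the three separate paths in the explanation above easier to keep apart (workflow location, folder where Snakemake is invoked, and workdir), here is a minimal Python sketch. It only builds a command line and lists the expected .snakemake locations; it does not run or query Snakemake. The function names, the example paths, and the workflow/Snakefile layout are assumptions for illustration. It also assumes the workdir is passed with --directory, whereas the pipeline may instead set it from its config file.

```python
from pathlib import PurePosixPath


def snakemake_command(workflow_dir, config_file, workdir, cores=1):
    """Build (but do not run) a Snakemake command line for one dataset."""
    # Assumption: the Snakefile sits in a dedicated workflow/ subfolder,
    # as in the recommended folder structure linked above.
    snakefile = PurePosixPath(workflow_dir) / "workflow" / "Snakefile"
    return [
        "snakemake",
        "--snakefile", str(snakefile),      # shared copy of the workflow
        "--configfile", str(config_file),   # one config file per dataset
        "--directory", str(workdir),        # workdir, independent of the workflow
        "--cores", str(cores),
    ]


def hidden_snakemake_dirs(invocation_dir, workdir):
    """Hidden .snakemake folders as described in this issue (not queried from Snakemake)."""
    return {
        "run_logs": PurePosixPath(invocation_dir) / ".snakemake",
        "conda_envs_and_rule_logs": PurePosixPath(workdir) / ".snakemake",
    }


if __name__ == "__main__":
    # Hypothetical paths: workflow, dataset, and shell location all differ.
    cmd = snakemake_command(
        "/opt/pipeline", "/data/set_a/config.yaml", "/data/set_a/run"
    )
    print(" ".join(cmd))
    dirs = hidden_snakemake_dirs("/home/user", "/data/set_a/run")
    for purpose, path in dirs.items():
        print(purpose, "->", path)
```

When the invocation folder and the workdir differ, the two entries returned by hidden_snakemake_dirs point to different places, which is the split between run logs and Conda envs described above.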