WayScience / CytoSnake

Orchestrating high-dimensional cell morphology data processing pipelines
https://cytosnake.readthedocs.io
Creative Commons Attribution 4.0 International

Create a `get_output()` helper function #62

Open axiomcura opened 1 year ago

axiomcura commented 1 year ago

Currently, the code that generates output paths is repeated across the Snakemake files.

Here's an example:

# cp_process example `rule all`
rule all:
    input:
        get_data_path(input_type="aggregated", tolist=True),
        get_data_path(input_type="cell_counts", tolist=True),
        get_data_path(input_type="annotated", tolist=True),
        get_data_path(input_type="normalized", tolist=True),
        get_data_path(input_type="feature_select", tolist=True),
        get_data_path(input_type="consensus", tolist=True),

That is a lot of nearly identical function calls.

A solution would be to create a wrapper helper function, `get_output`, that takes a `workflow` parameter. The `workflow` parameter expects a path to the workflow config YAML file (example: cp_process.yaml). Since that config already describes which outputs the workflow generates, the wrapper function can determine which paths to produce from it.

It would look something like this (using the same example as above):

# cp_process example `rule all`
rule all:
    input:
        get_output(workflow="cp_process")

This would remove the redundancy and make the `rule all` blocks easier to read.
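
To make the proposal concrete, here is a rough sketch of what such a wrapper could look like. It is only an illustration under assumptions: the `get_data_path` below is a stand-in with a made-up path layout (not CytoSnake's actual implementation), and `WORKFLOW_OUTPUTS` is a hypothetical hard-coded mapping standing in for whatever would be read from the workflow config file.

```python
from __future__ import annotations


# Hypothetical stand-in for the existing get_data_path() helper.
# The path layout here is invented purely for illustration.
def get_data_path(input_type: str, tolist: bool = False):
    path = f"results/{input_type}.csv"
    return [path] if tolist else path


# Assumed mapping from workflow name to the output types it produces.
# In the real helper this would come from the workflow config
# (e.g. cp_process.yaml) rather than being hard-coded.
WORKFLOW_OUTPUTS = {
    "cp_process": [
        "aggregated",
        "cell_counts",
        "annotated",
        "normalized",
        "feature_select",
        "consensus",
    ],
}


def get_output(workflow: str) -> list[str]:
    """Collect every output path a workflow produces into one flat list."""
    try:
        output_types = WORKFLOW_OUTPUTS[workflow]
    except KeyError:
        raise ValueError(f"Unknown workflow: {workflow!r}") from None

    paths: list[str] = []
    for output_type in output_types:
        # Each call returns a list, so extend keeps the result flat.
        paths.extend(get_data_path(input_type=output_type, tolist=True))
    return paths
```

Returning a single flat list keeps the `rule all` input to one call, and raising on an unknown workflow name surfaces typos early instead of silently producing an empty input list.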