snakemake / snakemake

This is the development home of the workflow management system Snakemake. For general information, see
https://snakemake.github.io
MIT License
2.28k stars 554 forks source link

Bring --draft-notebook to basic python scripts #2742

Open moritzschaefer opened 8 months ago

moritzschaefer commented 8 months ago

Is your feature request related to a problem? Please describe. I like to write scripts interactively (as is nicely supported via --edit-notebook), but I dislike the jupyter notebook ecosystem.

Describe the solution you'd like It would be great to have an option --draft-script or --print-snakemake-object-python that prints the python code to populate the snakemake object, as in the first cell of the notebook launched by --edit-notebook.

Describe alternatives you've considered

Additional context

vandalt commented 2 months ago

Hi,

I would also be interested in this feature and took some time to look into it.

As a temporary workaround, I saw that there is a mock_snakemake() function in the PyPSA/pypsa-eur repository that does something along those line (link to original PR. I'm attaching below a simplified version of their helpers.py file with only that function and a small wrapper I created to do a try/except so that scripts would work both with snakemake and with a regular IPython or Python session.

@johanneskoester would there be interest for a feature like this? If so I would be happy to try to contribute. I guess that within Snakemake the mock_snakemake() would not be an optimal implementation? I can see at least two paths to achieve this:

Let me know if I should explore one of the following options. I'm new to Snakemake so there might very well be something I overlooked that makes this impossible or more complicated than I thought.

Thank you!

(Fixing this might also help with #247 and #2932 re debugging snakemake script with any editor's debugger)

The modified helpers.py with only mock_snakemake ```python # -*- coding: utf-8 -*- # SPDX-FileCopyrightText: : 2017-2024 The PyPSA-Eur Authors # # SPDX-License-Identifier: MIT import logging from pathlib import Path logger = logging.getLogger(__name__) def mockwrap( rulename, root_dir=None, configfiles=None, submodule_dir="workflow/submodules/pypsa-eur", **wildcards, ): try: from snakemake.script import snakemake except ImportError: from helpers import mock_snakemake snakemake = mock_snakemake( rulename, root_dir=root_dir, configfiles=configfiles, submodule_dir=submodule_dir, **wildcards, ) return snakemake def mock_snakemake( rulename, root_dir=None, configfiles=None, submodule_dir="workflow/submodules/pypsa-eur", **wildcards, ): """ This function is expected to be executed from the 'scripts'-directory of ' the snakemake project. It returns a snakemake.script.Snakemake object, based on the Snakefile. If a rule has wildcards, you have to specify them in **wildcards. Parameters ---------- rulename: str name of the rule for which the snakemake object should be generated root_dir: str/path-like path to the root directory of the snakemake project configfiles: list, str list of configfiles to be used to update the config submodule_dir: str, Path in case PyPSA-Eur is used as a submodule, submodule_dir is the path of pypsa-eur relative to the project directory. **wildcards: keyword arguments fixing the wildcards. Only necessary if wildcards are needed. """ import os import snakemake as sm from snakemake.api import Workflow from snakemake.common import SNAKEFILE_CHOICES from snakemake.script import Snakemake from snakemake.settings.types import ( ConfigSettings, DAGSettings, ResourceSettings, StorageSettings, WorkflowSettings, ) script_dir = Path(__file__).parent.resolve() if root_dir is None: root_dir = script_dir.parent else: root_dir = Path(root_dir).resolve() user_in_script_dir = Path.cwd().resolve() == script_dir if str(submodule_dir) in __file__: # the submodule_dir path is only need to locate the project dir os.chdir(Path(__file__[: __file__.find(str(submodule_dir))])) elif user_in_script_dir: os.chdir(root_dir) elif Path.cwd().resolve() != root_dir: raise RuntimeError( "mock_snakemake has to be run from the repository root" f" {root_dir} or scripts directory {script_dir}" ) try: for p in SNAKEFILE_CHOICES: if os.path.exists(p): snakefile = p break if configfiles is None: configfiles = [] elif isinstance(configfiles, str): configfiles = [configfiles] resource_settings = ResourceSettings() config_settings = ConfigSettings(configfiles=map(Path, configfiles)) workflow_settings = WorkflowSettings() storage_settings = StorageSettings() dag_settings = DAGSettings(rerun_triggers=[]) workflow = Workflow( config_settings, resource_settings, workflow_settings, storage_settings, dag_settings, storage_provider_settings=dict(), ) workflow.include(snakefile) if configfiles: for f in configfiles: if not os.path.exists(f): raise FileNotFoundError(f"Config file {f} does not exist.") workflow.configfile(f) workflow.global_resources = {} rule = workflow.get_rule(rulename) dag = sm.dag.DAG(workflow, rules=[rule]) wc = dict(wildcards) job = sm.jobs.Job(rule, dag, wc) def make_accessable(*ios): for io in ios: for i, _ in enumerate(io): io[i] = os.path.abspath(io[i]) make_accessable(job.input, job.output, job.log) snakemake = Snakemake( job.input, job.output, job.params, job.wildcards, job.threads, job.resources, job.log, job.dag.workflow.config, job.rule.name, None, ) # create log and output dir if not existent for path in list(snakemake.log) + list(snakemake.output): Path(path).parent.mkdir(parents=True, exist_ok=True) finally: if user_in_script_dir: os.chdir(script_dir) return snakemake ```