nextstrain / pathogen-repo-guide


Add standard logging of config values #18

Open joverlee521 opened 10 months ago

joverlee521 commented 10 months ago

Context

With many layers of Snakemake configs provided via default configs and/or CLI options (--configfile/--config), it is helpful to have a standard way of logging the config values used for a workflow run.
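To illustrate why the effective config can be hard to predict, here is a minimal plain-Python sketch of layered overrides. It assumes later layers are merged recursively over earlier ones, which is my understanding of how Snakemake combines configs; the real merge logic may differ in details. The `merge_config` helper and all config keys are hypothetical.

```python
from collections.abc import Mapping
from copy import deepcopy


def merge_config(base: dict, override: dict) -> dict:
    """Return a copy of `base` with `override` merged in recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            # Both sides are mappings: merge key by key instead of replacing.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the earlier value.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical layers: a defaults file, an extra --configfile, then --config.
defaults = {
    "strain_id_field": "accession",
    "filter": {"min_length": 9000, "group_by": "country"},
}
build_overrides = {"filter": {"min_length": 5000}}
cli_overrides = {"strain_id_field": "strain"}

effective = defaults
for layer in (build_overrides, cli_overrides):
    effective = merge_config(effective, layer)

print(effective)
# {'strain_id_field': 'strain', 'filter': {'min_length': 5000, 'group_by': 'country'}}
```

Note that `filter.group_by` survives from the defaults even though a later layer touched `filter`, which is exactly the kind of detail a logged config makes visible.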

Possible solutions

  1. This is done in the ncov workflow with a dump_config rule. Users must specify the target with the same configs as their workflow run to see the config output.

  2. We could print out the config with each workflow run using the onstart handler. However, Snakemake docs note that these handlers are not triggered during dry-runs.

     ```python
     onstart:
         import yaml, sys
         yaml.dump(config, sys.stdout, explicit_start = True, explicit_end = True)
     ```

  3. We could print out the config with each workflow run using Snakemake's logger:

     ```python
     import yaml
     from snakemake.logging import logger

     # Use default configuration values. Override with Snakemake's --configfile/--config options.
     configfile: "config/defaults.yaml"

     logger.info(f"Config is:\n{yaml.dump(config, explicit_start = True, explicit_end = True)}")
     ```

  4. If the config output is too noisy, we can make it a debug level log that will only output if users provide the `--verbose` flag.

     ```python
     import yaml
     from snakemake.logging import logger

     # Use default configuration values. Override with Snakemake's --configfile/--config options.
     configfile: "config/defaults.yaml"

     logger.debug(f"Config is:\n{yaml.dump(config, explicit_start = True, explicit_end = True)}")
     ```

tsibley commented 8 months ago

Option 3 is enticing because it means the actual config in use is always in build logs, so when something unexpectedly goes wrong you can inspect the config (without having to reconstruct it in a separate subsequent run).
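
As a rough illustration of the trade-off between logging at info level (option 3) and debug level (option 4), here is a standard-library sketch. It uses Python's `logging` module rather than Snakemake's logger, and `json.dumps` stands in for `yaml.dump` so it runs without extra dependencies. The `log_config` and `capture_run` helpers and the config key are hypothetical.

```python
import io
import json
import logging


def log_config(logger: logging.Logger, config: dict, level: int = logging.INFO) -> None:
    """Log the whole config as a single message at the given level."""
    # json.dumps is a stand-in for yaml.dump to keep the sketch stdlib-only.
    logger.log(level, "Config is:\n%s", json.dumps(config, indent=2, sort_keys=True))


def capture_run(threshold: int, level: int) -> str:
    """Simulate one run: log a config at `level` with the logger set to `threshold`."""
    stream = io.StringIO()
    logger = logging.getLogger(f"sketch-{threshold}-{level}")
    logger.handlers.clear()
    logger.addHandler(logging.StreamHandler(stream))
    logger.setLevel(threshold)
    logger.propagate = False  # keep output out of the root logger
    log_config(logger, {"hypothetical_key": "value"}, level)
    return stream.getvalue()


# A default run (info threshold) versus a verbose run (debug threshold).
default_info = capture_run(logging.INFO, logging.INFO)
default_debug = capture_run(logging.INFO, logging.DEBUG)
verbose_debug = capture_run(logging.DEBUG, logging.DEBUG)

print("info-level message, default run:", bool(default_info))
print("debug-level message, default run:", bool(default_debug))
print("debug-level message, verbose run:", bool(verbose_debug))
```

In this sketch the debug-level message is missing from the default run, so a build log captured without the verbose setting would not contain the config. An info-level message always lands in the log.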

jameshadfield commented 8 months ago

Option 3 👍