I have a couple of error-check rules that ensure a correct config setup. I give those rules a higher priority so that they run first and the workflow fails fast. It would be great if I could create a test using the snakemake GitHub action that ensures that a run with a bad config does in fact exit non-zero.
Here are 2 examples of such rules:
rule check_peak_read_len_overlap_params:
    # This code throws an error if the fraction of the minimum peak (or
    # summit) width (based on MAX_ARTIFACT_WIDTH and SUMMIT_FLANK_SIZE) over
    # the max read length is less than the fracOverlap (FRAC_READ_OVERLAP) that
    # is supplied to featureCounts.
    input:
        "results/QC/max_read_length.txt",
    output:
        "results/QC/parameter_validation.txt",
    params:
        frac_read_overlap=FRAC_READ_OVERLAP,
        max_artifact_width=MAX_ARTIFACT_WIDTH,
        summit_flank_size=SUMMIT_FLANK_SIZE,
    log:
        "results/QC/logs/parameter_validation.log",
    conda:
        "../envs/python3.yml"
    script:
        "../scripts/check_peak_read_len_overlap_params.py"
rule check_biological_replicates:
    # There is no input or output for this rule. It depends on the params.
    # The log.status file is included in the all rule when metadata is present
    # (see common.smk).
    params:
        conditions="\n\t".join(
            [
                f"{nonrep['dataset']}:{nonrep['experimental_condition']}"
                for nonrep in SAMPLES_WITHOUT_BIOLOGICAL_REPLICATES
            ]
        ),
        samples="\n\t".join(
            [
                f"{nonrep['biological_sample']}:{','.join(nonrep['sample_ids'])}"
                for nonrep in SAMPLES_WITHOUT_BIOLOGICAL_REPLICATES
            ]
        ),
    priority: 1
    log:
        err="results/logs/sample_status.err",
        status="results/logs/sample_status.txt",
    conda:
        "../envs/bedtools_coreutils_gawk_gzip.yml"
    shell:
        """
        if [ "{params.conditions}" == "" ]; then \
            echo "STATUS=GOOD. No non-replicates detected with metadata." > {log.status:q}; \
            touch {log.err:q}; \
        else \
            echo "STATUS=BAD. Non-replicates detected with metadata." > {log.status:q}; \
            MSG="NONREPLICATES DETECTED ERROR: Biological replicates are required in each experimental condition. The following 'dataset:experimental_condition's have only a single biological sample:\n\n\t"; \
            MSG="${{MSG}}{params.conditions}\n\n"; \
            MSG="${{MSG}}There are 4 ways to deal with this error:\n\n"; \
            MSG="${{MSG}}1. Add samples to each of the conditions listed above (or add the experimental conditions to existing samples that do not currently have an annotated experimental condition).\n"; \
            MSG="${{MSG}}2. Remove all metadata from the sample sheet (except sample_id and dataset) for the following biological samples:IDs:\n\n\t"; \
            MSG="${{MSG}}{params.samples}\n\n"; \
            MSG="${{MSG}}3. Remove all rows from the sample sheet for the biological samples:IDs shown above under option 2.\n"; \
            MSG="${{MSG}}4. Skip the replicate error check by adding the following option to your snakemake command:\n\n\t--omit-from check_biological_replicates\n\n"; \
            printf "$MSG" > {log.err:q}; \
            exit 1; \
        fi
        """
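The script behind the first rule is not shown above. As a rough illustration of the kind of check its comment describes, here is a minimal Python sketch. The function names are hypothetical, and the way the minimum peak and summit widths are derived from `max_artifact_width` and `summit_flank_size` is an assumption made for the example, not necessarily what `check_peak_read_len_overlap_params.py` does.

```python
def min_width_to_read_fraction(max_artifact_width, summit_flank_size, max_read_length):
    """Return the smallest feature width divided by the longest read length."""
    # ASSUMPTION: peaks no wider than max_artifact_width are discarded,
    # so the narrowest surviving peak is one base wider than that.
    min_peak_width = max_artifact_width + 1
    # ASSUMPTION: a summit region is the summit base plus a flank on each side.
    min_summit_width = 2 * summit_flank_size + 1
    return min(min_peak_width, min_summit_width) / max_read_length


def check_params(frac_read_overlap, max_artifact_width, summit_flank_size, max_read_length):
    """Raise ValueError if the width-to-read fraction is below frac_read_overlap."""
    fraction = min_width_to_read_fraction(
        max_artifact_width, summit_flank_size, max_read_length
    )
    if fraction < frac_read_overlap:
        raise ValueError(
            f"Minimum feature width / max read length ({fraction:.3f}) is less "
            f"than the required read overlap fraction ({frac_read_overlap})."
        )
    return fraction
```

Inside a Snakemake `script:` file, the arguments would come from `snakemake.params` and from the value stored in the rule's input file; an uncaught exception makes the script, and therefore the job, exit non-zero.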
I have worked around this issue by using the base environment's install of snakemake, but it would be great if I didn't have to do that:
---
name: Test Snakemake Fails
"on": push
jobs:
  run-lint:
    runs-on: ubuntu-latest
    defaults:
      run:
        # This enables running of conda(-installed) commands (e.g. `snakemake`)
        # in the rules below. See:
        # https://github.com/conda-incubator/setup-miniconda/issues/128
        shell: bash -l {0}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: Install dependencies
        uses: conda-incubator/setup-miniconda@v2
        with:
          python-version: "3.9"
          auto-update-conda: true
          use-mamba: true
          mamba-version: "*"
          miniforge-variant: Mambaforge
          auto-activate-base: false
          environment-file: environment.yml
          channel-priority: true
          activate-environment: ATACCompendium
      - name: Display all conda & env info
        run: |
          conda info -a
          conda list
          conda config --show-sources
      - name: Ensure error when any experimental condition contains no biological replicates
        run: |
          bash scripts/test_snakemake_fails.sh "$CONDA" --use-conda --cores 2 --directory .tests/test_6_missingreplicates
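
The behaviour I want from such a test boils down to inverting the exit status of the wrapped command. A minimal Python sketch of that idea follows; `exits_nonzero` and `main` are hypothetical names used only for illustration, and the actual `scripts/test_snakemake_fails.sh` is not shown here and may work differently.

```python
import subprocess
import sys


def exits_nonzero(cmd):
    """Run cmd (a list of arguments) and report whether it exited non-zero."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode != 0


def main(argv):
    """Return 0 if the wrapped command failed as expected, 1 otherwise."""
    if exits_nonzero(argv):
        print("OK: command failed as expected.")
        return 0
    print("ERROR: command was expected to fail but exited 0.", file=sys.stderr)
    return 1
```

Saved as a script ending with `sys.exit(main(sys.argv[1:]))`, it could wrap a call such as `snakemake --use-conda --cores 2 --directory .tests/test_6_missingreplicates`, so that the CI step passes only when snakemake exits non-zero.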