Closed burtonjake closed 1 month ago
All the libraries used across the current MR. PARETO Snakefiles:
unsupervised_analysis
import os
import sys
import pandas as pd
import yaml
from snakemake.utils import min_version
---
spilterlize_integrate
import yaml
import pandas as pd
import os
from snakemake.utils import validate, min_version
import json
import csv
import sys
import subprocess
---
dea_limma
import os
import sys
import pandas as pd
import yaml
from snakemake.utils import min_version
---
enrichment_analysis
import yaml
import pandas as pd
import os
from snakemake.utils import validate, min_version
import json
import csv
import sys
import subprocess
---
genome_tracks
# libraries
import pandas as pd
import os
import gzip
import re # for regular expressions
import numpy as np
import json
from snakemake.utils import min_version
import hashlib # generating unique sample names for single-cell samples
---
atacseq_pipeline
import yaml
import pandas as pd
import os
import shutil
from snakemake.utils import validate, min_version
from string import Template
---
scrnaseq_processing_seurat
import os
import sys
import pandas as pd
import yaml
from snakemake.utils import min_version
---
dea_seurat
import os
import sys
import pandas as pd
import yaml
from snakemake.utils import min_version
---
mixscape_seurat
import os
import sys
import pandas as pd
import yaml
from snakemake.utils import min_version
It seems like the only ones that are not from the Python Standard Library are pandas, numpy and pyyaml. We should at least document the numpy and pandas dependencies.
Let's find out which packages actually come with the minimal Snakemake installation (the one that would cause problems because of missing packages). E.g., is yaml/pyyaml definitely in there? Then we could make one global.yaml and just copy-paste it into every module.
It is indeed:
(base) [jburton@d002 ~]$ mamba create --dry-run -c conda-forge -c bioconda -n snakemake8-mini snakemake-minimal
Looking for: ['snakemake-minimal']
conda-forge/noarch 15.9MB @ 26.1MB/s 0.6s
conda-forge/linux-64 36.7MB @ 39.7MB/s 0.9s
bioconda/linux-64 5.6MB @ 4.0MB/s 1.4s
bioconda/noarch 5.3MB @ 3.7MB/s 1.4s
Transaction
Prefix: /nobackup/lab_bock/users/jburton/miniconda3/envs/snakemake8-mini
Updating specs:
- snakemake-minimal
Package Version Build Channel Size
──────────────────────────────────────────────────────────────────────────────────────────────────
Install:
──────────────────────────────────────────────────────────────────────────────────────────────────
+ python_abi 3.12 4_cp312 conda-forge Cached
+ _libgcc_mutex 0.1 conda_forge conda-forge Cached
+ ld_impl_linux-64 2.40 hf3520f5_7 conda-forge Cached
+ ca-certificates 2024.7.4 hbcca054_0 conda-forge Cached
+ libgomp 14.1.0 h77fa898_0 conda-forge Cached
+ _openmp_mutex 4.5 2_gnu conda-forge Cached
+ libgcc-ng 14.1.0 h77fa898_0 conda-forge Cached
+ libgfortran5 14.1.0 hc5f4f2c_0 conda-forge Cached
+ libstdcxx-ng 14.1.0 hc0a3c3a_0 conda-forge Cached
+ openssl 3.3.1 h4bc722e_2 conda-forge Cached
+ libzlib 1.3.1 h4ab18f5_1 conda-forge Cached
+ libxcrypt 4.4.36 hd590300_1 conda-forge Cached
+ libffi 3.4.2 h7f98852_5 conda-forge Cached
+ bzip2 1.0.8 h4bc722e_7 conda-forge Cached
+ yaml 0.2.5 h7f98852_2 conda-forge Cached
+ ncurses 6.5 h59595ed_0 conda-forge Cached
+ libuuid 2.38.1 h0b41bf4_0 conda-forge Cached
+ libnsl 2.0.1 hd590300_0 conda-forge Cached
+ libexpat 2.6.2 h59595ed_0 conda-forge Cached
+ xz 5.2.6 h166bdaf_0 conda-forge Cached
+ libgfortran-ng 14.1.0 h69a702a_0 conda-forge Cached
+ zstd 1.5.6 ha6fb4c9_0 conda-forge Cached
+ tk 8.6.13 noxft_h4845f30_101 conda-forge Cached
+ libsqlite 3.46.0 hde9e2c9_0 conda-forge Cached
+ readline 8.2 h8228510_1 conda-forge Cached
+ libopenblas 0.3.27 pthreads_hac2b453_1 conda-forge Cached
+ libblas 3.9.0 23_linux64_openblas conda-forge Cached
+ libcblas 3.9.0 23_linux64_openblas conda-forge Cached
+ liblapack 3.9.0 23_linux64_openblas conda-forge Cached
+ liblapacke 3.9.0 23_linux64_openblas conda-forge Cached
+ coin-or-utils 2.11.11 h8c65801_1 conda-forge Cached
+ coin-or-osi 0.108.10 haf5fa05_0 conda-forge Cached
+ coin-or-clp 1.17.8 h1ee7a9c_0 conda-forge Cached
+ coin-or-cgl 0.60.7 h516709c_0 conda-forge Cached
+ coin-or-cbc 2.10.11 h56f689f_0 conda-forge Cached
+ tzdata 2024a h0c530f3_0 conda-forge Cached
+ coincbc 2.10.11 0_metapackage conda-forge Cached
+ python 3.12.4 h194c7f8_0_cpython conda-forge Cached
+ wheel 0.44.0 pyhd8ed1ab_0 conda-forge 59kB
+ setuptools 72.1.0 pyhd8ed1ab_0 conda-forge 1MB
+ pip 24.2 pyhd8ed1ab_0 conda-forge Cached
+ pyparsing 3.1.2 pyhd8ed1ab_0 conda-forge Cached
+ pycparser 2.22 pyhd8ed1ab_0 conda-forge Cached
+ platformdirs 4.2.2 pyhd8ed1ab_0 conda-forge Cached
+ hyperframe 6.0.1 pyhd8ed1ab_0 conda-forge Cached
+ smmap 5.0.0 pyhd8ed1ab_0 conda-forge Cached
+ typing_extensions 4.12.2 pyha770c72_0 conda-forge Cached
+ zipp 3.19.2 pyhd8ed1ab_0 conda-forge Cached
+ attrs 24.1.0 pyh71513ae_0 conda-forge 56kB
+ pkgutil-resolve-name 1.3.10 pyhd8ed1ab_1 conda-forge Cached
+ traitlets 5.14.3 pyhd8ed1ab_0 conda-forge Cached
+ python-fastjsonschema 2.20.0 pyhd8ed1ab_0 conda-forge Cached
+ charset-normalizer 3.3.2 pyhd8ed1ab_0 conda-forge Cached
+ hpack 4.0.0 pyh9f0ad1d_0 conda-forge Cached
+ pysocks 1.7.1 pyha2e5f31_6 conda-forge Cached
+ idna 3.7 pyhd8ed1ab_0 conda-forge Cached
+ certifi 2024.7.4 pyhd8ed1ab_0 conda-forge Cached
+ plac 1.4.3 pyhd8ed1ab_0 conda-forge Cached
+ argparse-dataclass 2.0.0 pyhd8ed1ab_0 conda-forge Cached
+ dpath 2.2.0 pyha770c72_0 conda-forge Cached
+ throttler 1.2.2 pyhd8ed1ab_0 conda-forge Cached
+ stopit 1.1.2 py_0 conda-forge Cached
+ reretry 0.11.8 pyhd8ed1ab_0 conda-forge Cached
+ tabulate 0.9.0 pyhd8ed1ab_1 conda-forge Cached
+ packaging 24.1 pyhd8ed1ab_0 conda-forge Cached
+ humanfriendly 10.0 pyhd8ed1ab_6 conda-forge Cached
+ docutils 0.21.2 pyhd8ed1ab_0 conda-forge Cached
+ configargparse 1.7 pyhd8ed1ab_0 conda-forge Cached
+ appdirs 1.4.4 pyh9f0ad1d_0 conda-forge Cached
+ toposort 1.10 pyhd8ed1ab_0 conda-forge Cached
+ connection_pool 0.0.3 pyhd3deb0d_0 conda-forge Cached
+ gitdb 4.0.11 pyhd8ed1ab_0 conda-forge Cached
+ importlib_resources 6.4.0 pyhd8ed1ab_0 conda-forge Cached
+ h2 4.1.0 pyhd8ed1ab_0 conda-forge Cached
+ amply 0.1.6 pyhd8ed1ab_0 conda-forge Cached
+ gitpython 3.1.43 pyhd8ed1ab_0 conda-forge Cached
+ psutil 6.0.0 py312h9a8786e_0 conda-forge Cached
+ markupsafe 2.1.5 py312h98912ed_0 conda-forge Cached
+ rpds-py 0.19.1 py312hf008fa9_0 conda-forge Cached
+ brotli-python 1.1.0 py312h30efb56_1 conda-forge Cached
+ wrapt 1.16.0 py312h98912ed_0 conda-forge Cached
+ pyyaml 6.0.1 py312h98912ed_1 conda-forge Cached
+ datrie 0.8.2 py312h98912ed_7 conda-forge Cached
+ immutables 0.20 py312h98912ed_1 conda-forge Cached
+ cffi 1.16.0 py312hf06ca03_0 conda-forge Cached
+ jupyter_core 5.7.2 py312h7900ff3_0 conda-forge Cached
+ pulp 2.8.0 py312h7900ff3_0 conda-forge Cached
+ zstandard 0.23.0 py312h3483029_0 conda-forge Cached
+ snakemake-interface-common 1.17.2 pyhdfd78af_0 bioconda Cached
+ snakemake-interface-storage-plugins 3.2.3 pyhdfd78af_0 bioconda Cached
+ snakemake-interface-executor-plugins 9.2.0 pyhdfd78af_0 bioconda Cached
+ snakemake-interface-report-plugins 1.0.0 pyhdfd78af_0 bioconda Cached
+ jinja2 3.1.4 pyhd8ed1ab_0 conda-forge Cached
+ referencing 0.35.1 pyhd8ed1ab_0 conda-forge Cached
+ smart_open 7.0.4 pyhd8ed1ab_0 conda-forge Cached
+ conda-inject 1.3.2 pyhd8ed1ab_0 conda-forge Cached
+ yte 1.5.4 pyha770c72_0 conda-forge Cached
+ urllib3 2.2.2 pyhd8ed1ab_1 conda-forge Cached
+ jsonschema-specifications 2023.12.1 pyhd8ed1ab_0 conda-forge Cached
+ requests 2.32.3 pyhd8ed1ab_0 conda-forge Cached
+ jsonschema 4.23.0 pyhd8ed1ab_0 conda-forge Cached
+ nbformat 5.10.4 pyhd8ed1ab_0 conda-forge Cached
+ snakemake-minimal 8.16.0 pyhdfd78af_0 bioconda Cached
Summary:
Install: 103 packages
Total download: 2MB
Additionally these are the numpy and pandas versions that come with the big snakemake8:
numpy 2.0.1 py312h1103770_0 conda-forge
pandas 2.2.2 py312h1d6d2e6_1 conda-forge
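To read the answer off the dry-run listing without scanning it by eye, a short parser is enough. This is a sketch under the assumption that every planned package appears on a line of the form `+ name version build channel size`, as in the output above; `planned_packages` is a hypothetical helper name and the sample contains only three lines copied from that output.

```python
def planned_packages(dry_run_output):
    """Map package name -> version for every '+ name version ...' line."""
    packages = {}
    for line in dry_run_output.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "+":
            packages[parts[1]] = parts[2]
    return packages


# Three lines taken from the dry-run output above.
sample = """\
+ yaml 0.2.5 h7f98852_2 conda-forge Cached
+ pyyaml 6.0.1 py312h98912ed_1 conda-forge Cached
+ snakemake-minimal 8.16.0 pyhdfd78af_0 bioconda Cached
"""

pkgs = planned_packages(sample)
for name in ("pyyaml", "pandas", "numpy"):
    print(name, pkgs.get(name, "not included"))
```

Run against the full listing, this would report pyyaml as included and pandas and numpy as not included, which is why those two need to be documented explicitly.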
I checked the RNA pipeline again and, interestingly, they only use the resource key once, here:
https://github.com/snakemake-workflows/rna-seq-star-deseq2/blob/993dcfcf3c1210f75f6bfb0ef765a4ddb77cadf7/workflow/rules/ref.smk#L51
In all other rules they only use the threads key.
Unsure what to do, i.e., how to decouple resource specifications from workflow configuration meaningfully without adding complexity for the end user.
To bump all modules to Snakemake 8 (#12), we primarily need to document the libraries required to process the initial Snakemake file. These should be added to envs/global.yaml and cross-linked into the Snakemake file with a conda directive. See 'global workflow dependencies' in the Snakemake docs. We also need to check the documentation for each module so that it makes sense for Snakemake 8.
one-off (first in unsupervised_analysis)
envs/global.yaml specifying required software for Snakefile execution
config.yaml for each module
partition parameter from params
workflow/profiles/default/config.yaml
config/README.md
envs/global.yaml to workflow conda directive in Snakefile
for global specifications at the beginning of your workflow, before you import or use any of those additional packages
8.20.1
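A minimal sketch of how the top of a module's Snakefile might look with the global conda directive in place. It is written in the Snakemake DSL (a superset of Python), so it is only meaningful when run through snakemake, not as a plain Python script. The path envs/global.yaml comes from the checklist above; using 8.20.1 as the minimum version is an assumption based on the version number listed there, and the imports shown are the ones from the unsupervised_analysis list.

```python
# Snakefile (Snakemake DSL, not plain Python)
from snakemake.utils import min_version

# Assumption: 8.20.1 is the intended minimum Snakemake version.
min_version("8.20.1")

# Global workflow dependencies: declared at the beginning of the workflow,
# before any of the additional packages are imported or used.
conda: "envs/global.yaml"

# Packages expected to be provided by envs/global.yaml (pandas) or by
# snakemake-minimal itself (pyyaml), followed by standard-library modules.
import pandas as pd
import yaml
import os
import sys
```

The environment file itself would then list at least pandas (and numpy for genome_tracks), since neither is part of snakemake-minimal.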
Modules