uab-cgds-worthey / quac

🦆 Quality Control of WGS and exome samples 🦆
https://quac.readthedocs.io
GNU General Public License v3.0
5 stars 1 forks source link

QuaC stalls at singularity+conda env creation step for multiqc #47

Closed ManavalanG closed 1 year ago

ManavalanG commented 2 years ago

QuaC stalls when creating singularity+conda environments when it tries to create the env for multiqc.

ManavalanG commented 2 years ago

Here is the relevant env definition file:

channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - multiqc ==1.9
ManavalanG commented 2 years ago

I attempted to create conda env (not singularity+conda) in cheaha to see if the issue can be replicated there. Conda just stalled at this step - Invoking SAT with clause count even after 15mins of executing the command. Relevant parts of the stdout/stderr:

Solving environment...

When --verbose --verbose flag added to probe further,

DEBUG conda.resolve:solve(1340): Requested specs:
  - __archspec
  - __glibc
  - __linux
  - __unix
  - multiqc==1.9
DEBUG conda.resolve:solve(1341): Optional specs:
DEBUG conda.resolve:solve(1342): All other specs:
  - _libgcc_mutex
  - _openmp_mutex
........
........
........
DEBUG conda.resolve:solve(1343): missing specs:
  - _libgcc_mutex
  - _openmp_mutex
........
........
........
DEBUG conda.resolve:solve(1346): Solve: minimize removed packages
DEBUG conda.resolve:solve(1353): Solve: maximize versions of requested packages
DEBUG conda.common._logic:minimize(664): Empty objective, trivial solution
DEBUG conda.common._logic:minimize(664): Empty objective, trivial solution
DEBUG conda.resolve:solve(1357): Initial package channel/version metric: 0/0
DEBUG conda.resolve:solve(1360): Solve: minimize track_feature count
DEBUG conda.resolve:generate_feature_count(933): generate_feature_count returning with clause count: 128071
DEBUG conda.common._logic:minimize(658): Clauses added, recomputing solution
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 128071
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 128083
DEBUG conda.common._logic:minimize(745): Final sum objective: 0
DEBUG conda.resolve:solve(1363): Track feature count: 0
DEBUG conda.common._logic:minimize(658): Clauses added, recomputing solution
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 129436
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 129887
DEBUG conda.common._logic:minimize(745): Final sum objective: 0
DEBUG conda.resolve:solve(1374): Package misfeature count: 0
DEBUG conda.resolve:solve(1377): Solve: maximize build numbers of requested packages
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 129888
DEBUG conda.common._logic:minimize(745): Final sum objective: 0
DEBUG conda.resolve:solve(1379): Initial package build metric: 0
DEBUG conda.resolve:solve(1382): Solve: prefer arch over noarch for requested packages
DEBUG conda.common._logic:minimize(664): Empty objective, trivial solution
DEBUG conda.resolve:solve(1384): Noarch metric: 0
DEBUG conda.resolve:solve(1388): Solve: minimize number of optional installations
DEBUG conda.common._logic:minimize(664): Empty objective, trivial solution
DEBUG conda.resolve:solve(1391): Optional package install metric: 0
DEBUG conda.resolve:solve(1394): Solve: minimize number of necessary upgrades
DEBUG conda.common._logic:minimize(664): Empty objective, trivial solution
DEBUG conda.resolve:solve(1397): Dependency update count: 0
DEBUG conda.resolve:solve(1400): Solve: maximize versions and builds of indirect dependencies.  Prefer arch over noarch where equivalent.
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 132814
DEBUG conda.common._logic:minimize(745): Final peak objective: 0
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 142063
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 132989
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 133651
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 133169
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 133389
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 133270
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 133344
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 133296
DEBUG conda.common._logic:minimize(745): Final peak objective: 39
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 13942923
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 7270918
DEBUG conda.common._logic:_run_sat(607): Invoking SAT with clause count: 3923857

As per this stackoverflow answer, Invoking SAT with clause count: is known to be associated with stalling :(

conda can run into endless loops when trying to figure out a way to satisfy all the dependencies of the packages that are installed, along with the changes you want to apply. I've read somewhere that if it hasn't found a solution within two minutes, there's not much point in waiting any longer.

In general, you overcome such situations by being more specific about what you want. For example, specify exact versions instead of just package names. If you don't need conda-forge, avoid that channel, because it brings in a lot of metadata that the SAT resolver will try to consider.

Here's a huge blog article from Anaconda with ways to improve performance: https://www.anaconda.com/understanding-and-improving-condas-performance/

ManavalanG commented 2 years ago

Interestingly, when conda stalls with message Invoking SAT with clause count, it can't be killed using ctrl+C. Don't know why.

ManavalanG commented 2 years ago

I tested with modified conda env config by adding python as a requirement and this resolved the issue. Full config:

channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - python =3.6
  - multiqc ==1.9

I'm not fully certain why but locking the python version makes the conda env creation process successful.

ManavalanG commented 2 years ago

Potential workaround to this problem is to use a local modified snakemake wrapper, an approach we have successfully employed to resolve wrapper related issue in the small-var-caller pipeline.

ManavalanG commented 2 years ago

mentioned in commit 7db6966ee0e3e969e3bc2e5196e4824b003473a6

ManavalanG commented 2 years ago

mentioned in commit e0208acdc535572bdcb19e5c61f8e0ebd0d99c39

ManavalanG commented 2 years ago

Initiated fresh singularity+conda env creation for all envs using the instructions from readme. The job took 3hr18min to complete. Job info:

       JobID    JobName   NTasks     MaxRSS    Elapsed      State        NodeList     ReqMem  MaxVMSize  AveVMSize  Partition                                AllocTRES
------------ ---------- -------- ---------- ---------- ---------- --------------- ---------- ---------- ---------- ---------- ----------------------------------------
13110540     quac-2022+                       03:18:14  COMPLETED           c0200     8192Mn                            short         billing=1,cpu=1,mem=8192M,node=1
13110540.ba+      batch        1   2618.23M   03:18:14  COMPLETED           c0200     8192Mn   3581.57M   3581.57M                              cpu=1,mem=8192M,node=1
13110540.ex+     extern        1      1.57M   03:18:14  COMPLETED           c0200     8192Mn    188.57M    188.57M                    billing=1,cpu=1,mem=8192M,node=1

Attempted to compare with the past logs to see how long this task took in the past, but I couldn't find the relevant log file. In case time-taken keeps increasing in future, we may have to define conda env configs with specific dependencies. Currently it lists only the major tools and not any of the underlying dependencies.

ManavalanG commented 2 years ago

Some background info and some learnings. We were aware of this issue before but we may need to act on this issue soon.

ManavalanG commented 2 years ago

Note that we ran into variation of this issue here - https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/pipelines/small_variant_caller_pipeline/-/issues/41

ManavalanG commented 2 years ago

Testing now. Instead of using local wrapper, just define conda: directive with wrapper: and see if snakemake would use locally defined conda env. Advantage here is to avoid use of local wrapper.

ManavalanG commented 2 years ago

mentioned in commit 9712962ace731f70b37ebebce939be58ff1ffe6e

ManavalanG commented 2 years ago

Test was successful.

Note: The job to create singularity+conda environment for just multiqc env took less than 4mins to complete.

ManavalanG commented 2 years ago

Rule multiqc_by_sample_initial_pass ran into following error. I suspect this has to do with something wrong with conda env created. This reported error might be a variation of what we are seeing here.

Traceback (most recent call last):
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/bin/multiqc", line 6, in <module>
    from multiqc.__main__ import multiqc
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/multiqc/__init__.py", line 16, in <module>
    from .multiqc import run
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/multiqc/multiqc.py", line 38, in <module>
    from .plots import table
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/multiqc/plots/table.py", line 9, in <module>
    from multiqc.utils import config, report, util_functions, mqc_colour
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/multiqc/utils/mqc_colour.py", line 7, in <module>
    import spectra
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/spectra/__init__.py", line 1, in <module>
    from .core import COLOR_SPACES, Color, Scale
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/spectra/core.py", line 1, in <module>
    from colormath import color_objects, color_conversions
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/colormath/color_conversions.py", line 13, in <module>
    import networkx
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/networkx/__init__.py", line 81, in <module>
    from networkx import algorithms
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/networkx/algorithms/__init__.py", line 81, in <module>
    from networkx.algorithms import tree
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/networkx/algorithms/tree/__init__.py", line 1, in <module>
    from .branchings import *
  File "quac/.snakemake/conda/9c3107da6d16b7941e8a8041bae2d495/lib/python3.6/site-packages/networkx/algorithms/tree/branchings.py", line 30, in <module>
    from dataclasses import dataclass, field
ModuleNotFoundError: No module named 'dataclasses'
Traceback (most recent call last):
  File "quac/.snakemake/scripts/tmp6z72phym.wrapper.py", line 24, in <module>
    "multiqc"
  File "/mnt/snakemake/snakemake/shell.py", line 231, in __new__
    raise sp.CalledProcessError(retcode, cmd)
ManavalanG commented 2 years ago

To circumvent this error, I exported the env definition of conda env created for multiqc (from v1.0 quac) using command:

cd quac/1.0/.snakemake/conda

# identify env relevant to multiqc rules
grep multiqc *yaml

# now activate the corresponding conda env
conda activate full_path_to_conda

# export the env definition
conda env export > path_to_multiqc_env.yaml
ManavalanG commented 2 years ago

mentioned in commit e48521b5ed1add1c2c0bb1fec104a8c027cf591d

ManavalanG commented 2 years ago

mentioned in commit df9cde51060991beadd5bb7e966ae19b18116369