rhysnewell / aviary

A hybrid assembly and MAG recovery pipeline (and more!)
GNU General Public License v3.0

Error with CheckM2 and Rosella #193

Closed (ronjasan closed this issue 8 months ago)

ronjasan commented 8 months ago

Hi,

I have run aviary complete with Nanopore long reads, and get an error when executing rules checkm_metabat2 and checkm_semibin. For both rules I get the following error message regarding numpy: AttributeError: module 'numpy' has no attribute 'object'.

Snakemake log

```
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 24
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=256000
Job stats:
job                 count
----------------  -------
checkm2                 1
checkm_das_tool         1
checkm_metabat2         1
checkm_semibin          1
das_tool                1
finalise_stats          1
get_abundances          1
gtdbtk                  1
recover_mags            1
refine_dastool          1
refine_metabat2         1
refine_semibin          1
singlem_appraise        1
total                  13

Select jobs to execute...

[Fri Jan 12 15:20:00 2024]
rule checkm_metabat2:
    input: data/metabat_bins_2/done
    output: data/metabat_bins_2/checkm2_out, data/metabat_bins_2/checkm.out
    log: logs/checkm_metabat2.log
    jobid: 6
    reason: Missing output files: data/metabat_bins_2/checkm.out
    threads: 16
    resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=480, gpus=0

Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_
Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_
[Fri Jan 12 15:20:10 2024]
Error in rule checkm_metabat2:
    jobid: 6
    input: data/metabat_bins_2/done
    output: data/metabat_bins_2/checkm2_out, data/metabat_bins_2/checkm.out
    log: logs/checkm_metabat2.log (check log file(s) for error details)
    conda-env: /mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_

RuleException:
CalledProcessError in file /mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/binning.smk, line 444:
Command 'source /mnt/users/ronjasan/miniforge3/envs/aviary/bin/activate '/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_'; set -euo pipefail; python /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/.snakemake/scripts/tmpkh9btayl.run_checkm.py' returned non-zero exit status 1.
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/binning.smk", line 444, in __rule_checkm_metabat2
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/concurrent/futures/thread.py", line 58, in run
Select jobs to execute...

[Fri Jan 12 15:20:10 2024]
rule checkm_semibin:
    input: data/semibin_bins/done
    output: data/semibin_bins/checkm2_out, data/semibin_bins/checkm.out
    log: logs/checkm_semibin.log
    jobid: 20
    reason: Missing output files: data/semibin_bins/checkm.out
    threads: 16
    resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=480, gpus=0

Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_
Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_
[Fri Jan 12 15:20:14 2024]
Error in rule checkm_semibin:
    jobid: 20
    input: data/semibin_bins/done
    output: data/semibin_bins/checkm2_out, data/semibin_bins/checkm.out
    log: logs/checkm_semibin.log (check log file(s) for error details)
    conda-env: /mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_

RuleException:
CalledProcessError in file /mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/binning.smk, line 469:
Command 'source /mnt/users/ronjasan/miniforge3/envs/aviary/bin/activate '/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_'; set -euo pipefail; python /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/.snakemake/scripts/tmpl4fu2eop.run_checkm.py' returned non-zero exit status 1.
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/binning.smk", line 469, in __rule_checkm_semibin
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/concurrent/futures/thread.py", line 58, in run
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-01-12T151954.328670.snakemake.log
```
checkm_metabat2 log

```
Traceback (most recent call last):
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/bin/checkm2", line 27, in <module>
    from checkm2 import predictQuality
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/checkm2/predictQuality.py", line 1, in <module>
    from checkm2 import modelProcessing
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/checkm2/modelProcessing.py", line 17, in <module>
    from tensorflow import keras
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/__init__.py", line 41, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/__init__.py", line 46, in <module>
    from tensorflow.python import data
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/__init__.py", line 25, in <module>
    from tensorflow.python.data import experimental
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/experimental/__init__.py", line 96, in <module>
    from tensorflow.python.data.experimental import service
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/experimental/service/__init__.py", line 140, in <module>
    from tensorflow.python.data.experimental.ops.data_service_ops import distribute
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/experimental/ops/data_service_ops.py", line 25, in <module>
    from tensorflow.python.data.experimental.ops import compression_ops
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/experimental/ops/compression_ops.py", line 20, in <module>
    from tensorflow.python.data.util import structure
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/util/structure.py", line 26, in <module>
    from tensorflow.python.data.util import nest
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/util/nest.py", line 41, in <module>
    from tensorflow.python.framework import sparse_tensor as _sparse_tensor
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/framework/sparse_tensor.py", line 29, in <module>
    from tensorflow.python.framework import constant_op
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/framework/constant_op.py", line 29, in <module>
    from tensorflow.python.eager import execute
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 27, in <module>
    from tensorflow.python.framework import dtypes
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/framework/dtypes.py", line 513, in <module>
    np.object,
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
Using CheckM2 database /mnt/databases/checkm2_db/CheckM2_database/uniref100.KO.1.dmnd
```
checkm_semibin log

```
Traceback (most recent call last):
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/bin/checkm2", line 27, in <module>
    from checkm2 import predictQuality
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/checkm2/predictQuality.py", line 1, in <module>
    from checkm2 import modelProcessing
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/checkm2/modelProcessing.py", line 17, in <module>
    from tensorflow import keras
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/__init__.py", line 41, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/__init__.py", line 46, in <module>
    from tensorflow.python import data
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/__init__.py", line 25, in <module>
    from tensorflow.python.data import experimental
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/experimental/__init__.py", line 96, in <module>
    from tensorflow.python.data.experimental import service
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/experimental/service/__init__.py", line 140, in <module>
    from tensorflow.python.data.experimental.ops.data_service_ops import distribute
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/experimental/ops/data_service_ops.py", line 25, in <module>
    from tensorflow.python.data.experimental.ops import compression_ops
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/experimental/ops/compression_ops.py", line 20, in <module>
    from tensorflow.python.data.util import structure
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/util/structure.py", line 26, in <module>
    from tensorflow.python.data.util import nest
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/data/util/nest.py", line 41, in <module>
    from tensorflow.python.framework import sparse_tensor as _sparse_tensor
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/framework/sparse_tensor.py", line 29, in <module>
    from tensorflow.python.framework import constant_op
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/framework/constant_op.py", line 29, in <module>
    from tensorflow.python.eager import execute
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 27, in <module>
    from tensorflow.python.framework import dtypes
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/tensorflow/python/framework/dtypes.py", line 513, in <module>
    np.object,
  File "/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
Using CheckM2 database /mnt/databases/checkm2_db/CheckM2_database/uniref100.KO.1.dmnd
```
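A note on the NumPy message in the two logs above: NumPy removed the `np.object` alias (together with similar aliases such as `np.bool` and `np.int`) in release 1.24, while the TensorFlow code in the traceback still references `np.object` at import time. The sketch below is a stdlib-only way to flag a NumPy version string that no longer provides the alias. The function names are made up for illustration and are not part of aviary, CheckM2, or NumPy.

```python
import re

# NumPy release that removed np.object, np.bool, np.int and similar aliases.
ALIASES_REMOVED_IN = (1, 24)


def major_minor(version):
    """Turn a version string such as '1.21.6' into a (major, minor) tuple."""
    numbers = []
    for piece in version.split(".")[:2]:
        match = re.match(r"\d+", piece)  # keep leading digits, drop suffixes like 'rc1'
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers)


def lacks_object_alias(numpy_version):
    """True when this NumPy version raises AttributeError for np.object."""
    return major_minor(numpy_version) >= ALIASES_REMOVED_IN


# Hypothetical version strings, used only to exercise the helper.
for candidate in ("1.21.6", "1.24.0"):
    print(candidate, lacks_object_alias(candidate))
```

To see which NumPy the failing environment actually imports, running `python -c "import numpy; print(numpy.__version__)"` inside that environment is one option; the imported copy may differ from what `conda list` reports if pip installed its own.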

I also get an error with rosella, where it does not produce bins due to a shape error.

rosella log

```
[2024-01-11T13:53:01Z INFO rosella] rosella version 0.5.1
[2024-01-11T13:53:01Z INFO rosella::recover::recover_engine] Calculating contig coverages.
[2024-01-11T13:53:01Z INFO rosella::recover::recover_engine] Calculating TNF table.
[2024-01-11T13:53:05Z ERROR rosella] Recover Failed with error: ShapeError/IncompatibleShape: incompatible shapes
```

rhysnewell commented 8 months ago

Hi there,

Thanks for using aviary, I just need to grab some extra info from you to help.

First, the easy one: the rosella shape error has been fixed in more recent versions of aviary (version >= 0.8.3). Updating aviary will also update rosella and make the pipeline a bit faster.

I suspect the checkm2 errors are due to conda pulling down the wrong version of tensorflow.

Cheers, Rhys

ronjasan commented 8 months ago

I am using aviary v0.8.3, and I see that it has installed rosella v0.5.1 in the rosella environment. I will try updating rosella and see if it fixes that problem.

My .condarc looks like this:

auto_activate_base: false
channels:
  - conda-forge
  - bioconda
  - defaults
channel_priority: strict

The checkm2 environment has these packages installed:

Packages

```
# packages in environment at /mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_:
#
# Name                  Version       Build                   Channel
_libgcc_mutex           0.1           conda_forge             conda-forge
_openmp_mutex           4.5           2_gnu                   conda-forge
abseil-cpp              20200923.3    h9c3ff4c_0              conda-forge
absl-py                 2.0.0         pyhd8ed1ab_0            conda-forge
aiohttp                 3.9.1         py38h01eb140_0          conda-forge
aiosignal               1.3.1         pyhd8ed1ab_0            conda-forge
astor                   0.8.1         pyh9f0ad1d_0            conda-forge
astunparse              1.6.3         pyhd8ed1ab_0            conda-forge
async-timeout           4.0.3         pyhd8ed1ab_0            conda-forge
attrs                   23.2.0        pyh71513ae_0            conda-forge
blinker                 1.7.0         pyhd8ed1ab_0            conda-forge
boost-cpp               1.70.0        h7b93d67_3              conda-forge
brotli-python           1.1.0         py38h17151c0_1          conda-forge
bzip2                   1.0.8         hd590300_5              conda-forge
c-ares                  1.25.0        hd590300_0              conda-forge
ca-certificates         2023.11.17    hbcca054_0              conda-forge
cachetools              4.2.4         pyhd8ed1ab_0            conda-forge
certifi                 2023.11.17    pyhd8ed1ab_0            conda-forge
cffi                    1.16.0        py38h6d47a40_0          conda-forge
charset-normalizer      3.3.2         pyhd8ed1ab_0            conda-forge
checkm2                 1.0.2         pypi_0                  pypi
click                   8.1.7         unix_pyh707e725_0       conda-forge
colorama                0.4.6         pyhd8ed1ab_0            conda-forge
cryptography            39.0.0        py38h1724139_0          conda-forge
diamond                 2.0.4         h56fc30b_0              bioconda
frozenlist              1.4.1         py38h01eb140_0          conda-forge
gast                    0.3.3         py_0                    conda-forge
giflib                  5.2.1         h0b41bf4_3              conda-forge
google-auth             1.35.0        pyh6c4a22f_0            conda-forge
google-auth-oauthlib    0.4.6         pyhd8ed1ab_0            conda-forge
google-pasta            0.2.0         pyh8c360ce_0            conda-forge
grpc-cpp                1.36.4        hf89561c_1              conda-forge
grpcio                  1.36.1        py38hdd6454d_0          conda-forge
h5py                    2.10.0        nompi_py38h9915d05_106  conda-forge
hdf5                    1.10.6        nompi_h6a2412b_1114     conda-forge
icu                     67.1          he1b5a44_0              conda-forge
idna                    3.6           pyhd8ed1ab_0            conda-forge
importlib-metadata      7.0.1         pyha770c72_0            conda-forge
joblib                  1.3.2         pyhd8ed1ab_0            conda-forge
jpeg                    9e            h0b41bf4_3              conda-forge
keras-preprocessing     1.1.2         pyhd8ed1ab_0            conda-forge
keyutils                1.6.1         h166bdaf_0              conda-forge
krb5                    1.20.1        hf9c8cef_0              conda-forge
ld_impl_linux-64        2.40          h41732ed_0              conda-forge
libblas                 3.9.0         20_linux64_openblas     conda-forge
libcblas                3.9.0         20_linux64_openblas     conda-forge
libcurl                 7.87.0        h6312ad2_0              conda-forge
libedit                 3.1.20191231  he28a2e2_2              conda-forge
libev                   4.33          hd590300_2              conda-forge
libffi                  3.4.2         h7f98852_5              conda-forge
libgcc-ng               13.2.0        h807b86a_3              conda-forge
libgfortran-ng          13.2.0        h69a702a_3              conda-forge
libgfortran5            13.2.0        ha4646dd_3              conda-forge
libgomp                 13.2.0        h807b86a_3              conda-forge
liblapack               3.9.0         20_linux64_openblas     conda-forge
libnghttp2              1.51.0        hdcd2b5c_0              conda-forge
libnsl                  2.0.1         hd590300_0              conda-forge
libopenblas             0.3.25        pthreads_h413a1c8_0     conda-forge
libpng                  1.6.39        h753d276_0              conda-forge
libprotobuf             3.15.8        h780b84a_1              conda-forge
libsqlite               3.44.2        h2797004_0              conda-forge
libssh2                 1.10.0        haa6b8db_3              conda-forge
libstdcxx-ng            13.2.0        h7e041cc_3              conda-forge
libuuid                 2.38.1        h0b41bf4_0              conda-forge
libzlib                 1.2.13        hd590300_5              conda-forge
lightgbm                3.2.1         py38h709712a_0          conda-forge
lz4-c                   1.9.3         h9c3ff4c_1              conda-forge
markdown                3.5.2         pyhd8ed1ab_0            conda-forge
markupsafe              2.1.3         py38h01eb140_1          conda-forge
multidict               6.0.4         py38h01eb140_1          conda-forge
ncurses                 6.4           h59595ed_2              conda-forge
numpy                   1.21.6        py38h1d589f8_0          conda-forge
oauthlib                3.2.2         pyhd8ed1ab_0            conda-forge
openssl                 1.1.1w        hd590300_0              conda-forge
opt_einsum              3.3.0         pyhc1e730c_2            conda-forge
packaging               23.2          pyhd8ed1ab_0            conda-forge
pandas                  1.5.3         py38hdc8b05c_1          conda-forge
pip                     23.3.2        pyhd8ed1ab_0            conda-forge
platformdirs            4.1.0         pyhd8ed1ab_0            conda-forge
pooch                   1.8.0         pyhd8ed1ab_0            conda-forge
prodigal                2.6.3         h031d066_7              bioconda
protobuf                3.15.8        py38h709712a_0          conda-forge
pyasn1                  0.5.1         pyhd8ed1ab_0            conda-forge
pyasn1-modules          0.3.0         pyhd8ed1ab_0            conda-forge
pycparser               2.21          pyhd8ed1ab_0            conda-forge
pyjwt                   2.8.0         pyhd8ed1ab_0            conda-forge
pyopenssl               23.2.0        pyhd8ed1ab_1            conda-forge
pysocks                 1.7.1         pyha2e5f31_6            conda-forge
python                  3.8.15        h257c98d_0_cpython      conda-forge
python-dateutil         2.8.2         pyhd8ed1ab_0            conda-forge
python-flatbuffers      1.12          pyhd8ed1ab_1            conda-forge
python_abi              3.8           4_cp38                  conda-forge
pytz                    2023.3.post1  pyhd8ed1ab_0            conda-forge
pyu2f                   0.1.5         pyhd8ed1ab_0            conda-forge
re2                     2021.04.01    h9c3ff4c_0              conda-forge
readline                8.2           h8228510_1              conda-forge
requests                2.31.0        pyhd8ed1ab_0            conda-forge
requests-oauthlib       1.3.1         pyhd8ed1ab_0            conda-forge
rsa                     4.9           pyhd8ed1ab_0            conda-forge
scikit-learn            0.23.2        py38h5d63f67_3          conda-forge
scipy                   1.10.1        py38h59b608b_3          conda-forge
setuptools              69.0.3        pyhd8ed1ab_0            conda-forge
six                     1.16.0        pyh6c4a22f_0            conda-forge
snappy                  1.1.10        h9fff704_0              conda-forge
tensorboard             2.4.1         pyhd8ed1ab_1            conda-forge
tensorboard-plugin-wit  1.8.1         pyhd8ed1ab_0            conda-forge
tensorflow              2.4.0         py38h578d9bd_0          conda-forge
tensorflow-base         2.4.0         py38h01d9eeb_0          conda-forge
tensorflow-estimator    2.4.0         pyh9656e83_0            conda-forge
termcolor               2.4.0         pyhd8ed1ab_0            conda-forge
threadpoolctl           3.2.0         pyha21a80b_0            conda-forge
tk                      8.6.13        noxft_h4845f30_101      conda-forge
tqdm                    4.66.1        pyhd8ed1ab_0            conda-forge
typing-extensions       4.9.0         hd8ed1ab_0              conda-forge
typing_extensions       4.9.0         pyha770c72_0            conda-forge
urllib3                 2.1.0         pyhd8ed1ab_0            conda-forge
werkzeug                3.0.1         pyhd8ed1ab_0            conda-forge
wheel                   0.42.0        pyhd8ed1ab_0            conda-forge
wrapt                   1.16.0        py38h01eb140_0          conda-forge
xz                      5.2.6         h166bdaf_0              conda-forge
yarl                    1.9.3         py38h01eb140_0          conda-forge
zipp                    3.17.0        pyhd8ed1ab_0            conda-forge
zlib                    1.2.13        hd590300_5              conda-forge
zstd                    1.4.9         ha95c52a_0              conda-forge
```

Thanks for the quick reply!

rhysnewell commented 8 months ago

Ah, okay I was mistaken then. It will be completely updated in v0.9.0 but new installs of aviary should pull the correct version of rosella.

I think the problem lies in the use of channel_priority: strict. I know the Snakemake documentation says that channel_priority: strict should be used, but I've found with my recipes that it often breaks the environment. If you remove channel_priority: strict from your .condarc, then delete the checkm2 environment and let aviary rebuild it, I have a feeling that everything should work as expected.

The old documentation for aviary used to specify that channel_priority: strict should be set, but we updated it a few months ago to remove that recommendation. We might need to add some additional comments to the docs if this is indeed the cause of your issue.
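The .condarc change suggested above can be illustrated with a small stdlib-only sketch. It assumes a .condarc with the flat top-level layout posted earlier in this thread and only understands that simple `key: value` form; it is not a YAML parser, and both helper names are hypothetical.

```python
def get_channel_priority(condarc_text):
    """Return the channel_priority value, or None when the key is absent."""
    for line in condarc_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("channel_priority:"):
            return stripped.split(":", 1)[1].strip() or None
    return None


def drop_channel_priority(condarc_text):
    """Return the same text without any channel_priority line."""
    kept = [
        line
        for line in condarc_text.splitlines()
        if not line.strip().startswith("channel_priority:")
    ]
    return "\n".join(kept) + "\n"


# The .condarc contents as posted in this thread.
CONDARC = """\
auto_activate_base: false
channels:
  - conda-forge
  - bioconda
  - defaults
channel_priority: strict
"""

print(get_channel_priority(CONDARC))
print(get_channel_priority(drop_channel_priority(CONDARC)))
```

With the key removed, conda falls back to its own default priority setting. The existing checkm2 environment still has to be deleted afterwards so that aviary rebuilds it under the new setting.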

ronjasan commented 8 months ago

I have let aviary rebuild the checkm2 environment after removing channel_priority: strict. Now I encounter a different error:

[01/15/2024 04:03:23 AM] INFO: Running CheckM2 version 1.0.2
[01/15/2024 04:03:23 AM] INFO: Running quality prediction workflow with 16 threads.
[01/15/2024 04:03:23 AM] ERROR: Saved models could not be loaded: 'str' object has no attribute 'decode'
Using CheckM2 database /mnt/databases/checkm2_db/CheckM2_database/uniref100.KO.1.dmnd
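For readers unfamiliar with the message in the excerpt above: in Python 3, `decode` exists on `bytes` but not on `str`, so code written for a library that used to return bytes fails this way once the library starts returning text. A frequently reported trigger with older TensorFlow/Keras releases is a newer h5py handing back `str` for stored attributes, but whether that is the cause here is not confirmed in this thread. The sketch below only reproduces the generic error and shows a tolerant helper; `to_text` is a hypothetical name, not a CheckM2 or Keras function.

```python
def to_text(value, encoding="utf-8"):
    """Return str whether the input is bytes or already str."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Reproduce the error from the log: str objects have no decode() in Python 3.
try:
    "keras_version".decode("utf-8")
except AttributeError as err:
    message = str(err)

print(message)
print(to_text(b"2.4.0"), to_text("2.4.0"))
```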
Full snakemake log ``` Building DAG of jobs... Creating conda environment /mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/../../envs/checkm2.yaml... Downloading and installing remote packages. Environment for /mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/../../envs/checkm2.yaml created (location: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_) Using shell: /usr/bin/bash Provided cores: 24 Rules claiming more threads will be scaled down. Provided resources: mem_mb=256000 Job stats: job count --------------------- ------- checkm2 1 checkm_das_tool 1 checkm_metabat2 1 checkm_rosella 1 checkm_semibin 1 concoct 1 das_tool 1 finalise_stats 1 get_abundances 1 get_bam_indices 1 gtdbtk 1 maxbin2 1 metabat2 1 metabat_sens 1 metabat_spec 1 metabat_ssens 1 metabat_sspec 1 prepare_binning_files 1 recover_mags 1 refine_dastool 1 refine_metabat2 1 refine_rosella 1 refine_semibin 1 rosella 1 semibin 1 singlem_appraise 1 singlem_pipe_reads 1 vamb 1 vamb_jgi_filter 1 total 29 Select jobs to execute... [Mon Jan 15 03:17:33 2024] rule prepare_binning_files: input: /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/maxbin.cov.list, data/coverm.cov log: logs/coverm_prepare.log jobid: 8 reason: Missing output files: data/maxbin.cov.list, data/coverm.cov threads: 24 resources: tmpdir=/home/work/ronjasan, mem_mb=256000, mem_mib=244141, runtime=2880 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/f412d140a3b7a0cf07fb8675dbd26f6d_ Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/f412d140a3b7a0cf07fb8675dbd26f6d_ [Mon Jan 15 03:24:36 2024] Finished job 8. 1 of 29 steps (3%) done Select jobs to execute... 
[Mon Jan 15 03:24:36 2024] localrule vamb_jgi_filter: input: /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta, data/coverm.cov output: data/coverm.filt.cov jobid: 23 reason: Missing output files: data/coverm.filt.cov; Input files updated by another job: data/coverm.cov threads: 24 resources: tmpdir=/home/work/ronjasan [Mon Jan 15 03:24:48 2024] Finished job 23. 2 of 29 steps (7%) done Select jobs to execute... [Mon Jan 15 03:24:48 2024] localrule get_bam_indices: input: data/coverm.cov output: data/binning_bams/done jobid: 10 reason: Missing output files: data/binning_bams/done; Input files updated by another job: data/coverm.cov threads: 24 resources: tmpdir=/home/work/ronjasan Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/f412d140a3b7a0cf07fb8675dbd26f6d_ [Mon Jan 15 03:24:57 2024] Finished job 10. 3 of 29 steps (10%) done Select jobs to execute... [Mon Jan 15 03:24:57 2024] rule vamb: input: data/coverm.filt.cov, /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/vamb_bins/done log: logs/vamb.log jobid: 22 benchmark: benchmarks/vamb.benchmark.txt reason: Missing output files: data/vamb_bins/done; Input files updated by another job: data/coverm.filt.cov threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=1440, gpus=0 [Mon Jan 15 03:24:57 2024] rule singlem_pipe_reads: output: data/singlem_out/metagenome.combined_otu_table.csv log: data/singlem_out/singlem_reads_log.txt jobid: 28 reason: Missing output files: data/singlem_out/metagenome.combined_otu_table.csv resources: tmpdir=/home/work/ronjasan, mem_mb=8192, mem_mib=7813, runtime=720 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/7d21ef539b1c9bdc6a5ad5e4218cdfd9_ Activating conda environment: 
../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c629fb56eae7a821805f240c8602c01f_ Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c629fb56eae7a821805f240c8602c01f_ [Mon Jan 15 03:25:18 2024] Finished job 22. 4 of 29 steps (14%) done Select jobs to execute... [Mon Jan 15 03:25:18 2024] rule metabat_spec: input: data/coverm.cov, /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/metabat_bins_spec/done log: logs/metabat_spec.log jobid: 13 benchmark: benchmarks/metabat_spec.benchmark.txt reason: Missing output files: data/metabat_bins_spec/done; Input files updated by another job: data/coverm.cov threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=1440 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/51d966aaed15db48f02fbc0ec2517ac8_ [Mon Jan 15 03:25:33 2024] Finished job 13. 5 of 29 steps (17%) done Select jobs to execute... [Mon Jan 15 03:25:33 2024] rule maxbin2: input: /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta, data/maxbin.cov.list output: data/maxbin2_bins/done log: logs/maxbin2.log jobid: 11 benchmark: benchmarks/maxbin2.benchmark.txt reason: Missing output files: data/maxbin2_bins/done; Input files updated by another job: data/maxbin.cov.list threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=5760 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/4418820646588dc71380ff6c13f8a873_ [Mon Jan 15 03:25:43 2024] Finished job 28. 6 of 29 steps (21%) done Select jobs to execute... [Mon Jan 15 03:30:10 2024] Finished job 11. 
7 of 29 steps (24%) done [Mon Jan 15 03:30:10 2024] rule semibin: input: /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta, data/binning_bams/done output: data/semibin_bins/done log: logs/semibin.log jobid: 21 benchmark: benchmarks/semibin.benchmark.txt reason: Missing output files: data/semibin_bins/done; Input files updated by another job: data/binning_bams/done threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=1440, gpus=0 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/648106278d4a9fab060a4fc996720010_ [Mon Jan 15 03:45:34 2024] Finished job 21. 8 of 29 steps (28%) done Select jobs to execute... [Mon Jan 15 03:45:34 2024] rule metabat_sens: input: data/coverm.cov, /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/metabat_bins_sens/done log: logs/metabat_sens.log jobid: 15 benchmark: benchmarks/metabat_sens.benchmark.txt reason: Missing output files: data/metabat_bins_sens/done; Input files updated by another job: data/coverm.cov threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=1440 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/51d966aaed15db48f02fbc0ec2517ac8_ [Mon Jan 15 03:45:46 2024] Finished job 15. 9 of 29 steps (31%) done Select jobs to execute... 
[Mon Jan 15 03:45:46 2024] rule checkm_semibin: input: data/semibin_bins/done output: data/semibin_bins/checkm2_out, data/semibin_bins/checkm.out log: logs/checkm_semibin.log jobid: 20 reason: Missing output files: data/semibin_bins/checkm.out; Input files updated by another job: data/semibin_bins/done threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=480, gpus=0 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_ Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_ [Mon Jan 15 03:47:50 2024] Error in rule checkm_semibin: jobid: 20 input: data/semibin_bins/done output: data/semibin_bins/checkm2_out, data/semibin_bins/checkm.out log: logs/checkm_semibin.log (check log file(s) for error details) conda-env: /mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_ RuleException: CalledProcessError in file /mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/binning.smk, line 469: Command 'source /mnt/users/ronjasan/miniforge3/envs/aviary/bin/activate '/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_'; set -euo pipefail; python /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1_recover/.snakemake/scripts/tmp_9gerejo.run_checkm.py' returned non-zero exit status 1. File "/mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/binning.smk", line 469, in __rule_checkm_semibin File "/mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/concurrent/futures/thread.py", line 58, in run Removing output files of failed job checkm_semibin since they might be corrupted: data/semibin_bins/checkm2_out Select jobs to execute... 
[Mon Jan 15 03:47:50 2024] rule metabat_ssens: input: data/coverm.cov, /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/metabat_bins_ssens/done log: logs/metabat_ssens.log jobid: 14 benchmark: benchmarks/metabat_ssens.benchmark.txt reason: Missing output files: data/metabat_bins_ssens/done; Input files updated by another job: data/coverm.cov threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=1440 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/51d966aaed15db48f02fbc0ec2517ac8_ [Mon Jan 15 03:48:15 2024] Finished job 14. 10 of 29 steps (34%) done Select jobs to execute... [Mon Jan 15 03:48:16 2024] rule concoct: input: /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta, data/binning_bams/done output: data/concoct_bins/done log: logs/concoct.log jobid: 9 benchmark: benchmarks/concoct.benchmark.txt reason: Missing output files: data/concoct_bins/done; Input files updated by another job: data/binning_bams/done threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=5760 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/9868218283cc2a5a4d0b32bdd9430e44_ [Mon Jan 15 04:01:45 2024] Finished job 9. 11 of 29 steps (38%) done Select jobs to execute... 
[Mon Jan 15 04:01:45 2024] rule metabat2: input: data/coverm.cov, /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/metabat_bins_2/done log: logs/metabat2.log jobid: 7 benchmark: benchmarks/metabat_2.benchmark.txt reason: Missing output files: data/metabat_bins_2/done; Input files updated by another job: data/coverm.cov threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=720 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/51d966aaed15db48f02fbc0ec2517ac8_ [Mon Jan 15 04:01:58 2024] Finished job 7. 12 of 29 steps (41%) done Select jobs to execute... [Mon Jan 15 04:01:58 2024] rule rosella: input: data/coverm.cov, /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/rosella_bins/done log: logs/rosella.log jobid: 18 benchmark: benchmarks/rosella.benchmark.txt reason: Missing output files: data/rosella_bins/done; Input files updated by another job: data/coverm.cov threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=1440 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/f41c22c09b62be695ae31e6ba0d979e2_ [Mon Jan 15 04:02:14 2024] Finished job 18. 13 of 29 steps (45%) done Select jobs to execute... 
[Mon Jan 15 04:02:14 2024] rule checkm_rosella: input: data/rosella_bins/done output: data/rosella_bins/checkm2_out, data/rosella_bins/checkm.out log: logs/checkm_rosella.log jobid: 17 reason: Missing output files: data/rosella_bins/checkm.out; Input files updated by another job: data/rosella_bins/done threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=480, gpus=0 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_ Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_ [Mon Jan 15 04:02:36 2024] Finished job 17. 14 of 29 steps (48%) done Select jobs to execute... [Mon Jan 15 04:02:36 2024] rule refine_rosella: input: data/rosella_bins/checkm.out, data/rosella_bins/done, data/coverm.cov, /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/rosella_refined/done log: logs/refine_rosella.log jobid: 16 benchmark: benchmarks/refine_rosella.benchmark.txt reason: Missing output files: data/rosella_refined/done; Input files updated by another job: data/rosella_bins/checkm.out, data/rosella_bins/done, data/coverm.cov threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=4320 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/f41c22c09b62be695ae31e6ba0d979e2_ Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/f41c22c09b62be695ae31e6ba0d979e2_ [Mon Jan 15 04:02:57 2024] Finished job 16. 15 of 29 steps (52%) done Select jobs to execute... 
[Mon Jan 15 04:02:57 2024] rule checkm_metabat2: input: data/metabat_bins_2/done output: data/metabat_bins_2/checkm2_out, data/metabat_bins_2/checkm.out log: logs/checkm_metabat2.log jobid: 6 reason: Missing output files: data/metabat_bins_2/checkm.out; Input files updated by another job: data/metabat_bins_2/done threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=480, gpus=0 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_ Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_ [Mon Jan 15 04:03:24 2024] Error in rule checkm_metabat2: jobid: 6 input: data/metabat_bins_2/done output: data/metabat_bins_2/checkm2_out, data/metabat_bins_2/checkm.out log: logs/checkm_metabat2.log (check log file(s) for error details) conda-env: /mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_ RuleException: CalledProcessError in file /mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/binning.smk, line 444: Command 'source /mnt/users/ronjasan/miniforge3/envs/aviary/bin/activate '/mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_'; set -euo pipefail; python /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1_recover/.snakemake/scripts/tmpv_0ki7do.run_checkm.py' returned non-zero exit status 1. File "/mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/site-packages/aviary/modules/binning/binning.smk", line 444, in __rule_checkm_metabat2 File "/mnt/users/ronjasan/miniforge3/envs/aviary/lib/python3.11/concurrent/futures/thread.py", line 58, in run Removing output files of failed job checkm_metabat2 since they might be corrupted: data/metabat_bins_2/checkm2_out Select jobs to execute... 
[Mon Jan 15 04:03:24 2024] rule metabat_sspec: input: data/coverm.cov, /net/fs-2/scale/OrionStore/Scratch/ronjasan/Flisa/DNAseq/2_aviary_single_S1/data/final_contigs.fasta output: data/metabat_bins_sspec/done log: logs/metabat_sspec.log jobid: 12 benchmark: benchmarks/metabat_sspec.benchmark.txt reason: Missing output files: data/metabat_bins_sspec/done; Input files updated by another job: data/coverm.cov threads: 16 resources: tmpdir=/home/work/ronjasan, mem_mb=131072, mem_mib=125000, runtime=1440 Activating conda environment: ../../../../../../../../../mnt/users/ronjasan/miniforge3/envs/aviary/51d966aaed15db48f02fbc0ec2517ac8_ [Mon Jan 15 04:03:35 2024] Finished job 12. 16 of 29 steps (55%) done Exiting because a job execution failed. Look above for error message Complete log: .snakemake/log/2024-01-15T030014.970783.snakemake.log ```
rhysnewell commented 8 months ago

This looks related to https://github.com/chklovski/CheckM2/issues/65, which has no resolution yet. I'll keep looking into it, but it would seem checkm2 is unable to open its CNN models for whatever reason.

Would you please post the current list of software installed in the checkm2 conda environment? I might be able to spot something obvious. The main offending package is likely to be scikit-learn, so please check that its version is scikit-learn==0.23.2, as I think it handles the unpickling of the models within checkm2.

Apologies for the inconvenience

ronjasan commented 8 months ago

Of course, here are all the packages currently installed in the checkm2 environment:

Packages
```
# packages in environment at /mnt/users/ronjasan/miniforge3/envs/aviary/c0310444dbde1742ee364906339cb3c7_:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
abseil-cpp 20200923.3 h9c3ff4c_0 conda-forge
absl-py 2.0.0 pyhd8ed1ab_0 conda-forge
aiohttp 3.9.1 py38h01eb140_0 conda-forge
aiosignal 1.3.1 pyhd8ed1ab_0 conda-forge
astor 0.8.1 pyh9f0ad1d_0 conda-forge
astunparse 1.6.3 pyhd8ed1ab_0 conda-forge
async-timeout 4.0.3 pyhd8ed1ab_0 conda-forge
attrs 23.2.0 pyh71513ae_0 conda-forge
blinker 1.7.0 pyhd8ed1ab_0 conda-forge
boost-cpp 1.70.0 h7b93d67_3 conda-forge
brotli-python 1.1.0 py38h17151c0_1 conda-forge
bzip2 1.0.8 hd590300_5 conda-forge
c-ares 1.25.0 hd590300_0 conda-forge
ca-certificates 2023.11.17 hbcca054_0 conda-forge
cachetools 4.2.4 pyhd8ed1ab_0 conda-forge
certifi 2023.11.17 pyhd8ed1ab_0 conda-forge
cffi 1.16.0 py38h6d47a40_0 conda-forge
charset-normalizer 3.3.2 pyhd8ed1ab_0 conda-forge
checkm2 1.0.2 pypi_0 pypi
click 8.1.7 unix_pyh707e725_0 conda-forge
colorama 0.4.6 pyhd8ed1ab_0 conda-forge
cryptography 39.0.0 py38h1724139_0 conda-forge
diamond 2.0.4 h56fc30b_0 bioconda
frozenlist 1.4.1 py38h01eb140_0 conda-forge
gast 0.3.3 py_0 conda-forge
giflib 5.2.1 h0b41bf4_3 conda-forge
google-auth 1.35.0 pyh6c4a22f_0 conda-forge
google-auth-oauthlib 0.4.6 pyhd8ed1ab_0 conda-forge
google-pasta 0.2.0 pyh8c360ce_0 conda-forge
grpc-cpp 1.36.4 hf89561c_1 conda-forge
grpcio 1.36.1 py38hdd6454d_0 conda-forge
h5py 2.10.0 nompi_py38h9915d05_106 conda-forge
hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge
icu 67.1 he1b5a44_0 conda-forge
idna 3.6 pyhd8ed1ab_0 conda-forge
importlib-metadata 7.0.1 pyha770c72_0 conda-forge
joblib 1.3.2 pyhd8ed1ab_0 conda-forge
jpeg 9e h0b41bf4_3 conda-forge
keras-preprocessing 1.1.2 pyhd8ed1ab_0 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.20.1 hf9c8cef_0 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
libblas 3.9.0 20_linux64_openblas conda-forge
libcblas 3.9.0 20_linux64_openblas conda-forge
libcurl 7.87.0 h6312ad2_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 hd590300_2 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 13.2.0 h807b86a_3 conda-forge
libgfortran-ng 13.2.0 h69a702a_3 conda-forge
libgfortran5 13.2.0 ha4646dd_3 conda-forge
libgomp 13.2.0 h807b86a_3 conda-forge
liblapack 3.9.0 20_linux64_openblas conda-forge
libnghttp2 1.51.0 hdcd2b5c_0 conda-forge
libnsl 2.0.1 hd590300_0 conda-forge
libopenblas 0.3.25 pthreads_h413a1c8_0 conda-forge
libpng 1.6.39 h753d276_0 conda-forge
libprotobuf 3.15.8 h780b84a_1 conda-forge
libsqlite 3.44.2 h2797004_0 conda-forge
libssh2 1.10.0 haa6b8db_3 conda-forge
libstdcxx-ng 13.2.0 h7e041cc_3 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
lightgbm 3.2.1 py38h709712a_0 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
markdown 3.5.2 pyhd8ed1ab_0 conda-forge
markupsafe 2.1.3 py38h01eb140_1 conda-forge
multidict 6.0.4 py38h01eb140_1 conda-forge
ncurses 6.4 h59595ed_2 conda-forge
numpy 1.21.6 py38h1d589f8_0 conda-forge
oauthlib 3.2.2 pyhd8ed1ab_0 conda-forge
openssl 1.1.1w hd590300_0 conda-forge
opt_einsum 3.3.0 pyhc1e730c_2 conda-forge
packaging 23.2 pyhd8ed1ab_0 conda-forge
pandas 1.5.3 py38hdc8b05c_1 conda-forge
pip 23.3.2 pyhd8ed1ab_0 conda-forge
platformdirs 4.1.0 pyhd8ed1ab_0 conda-forge
pooch 1.8.0 pyhd8ed1ab_0 conda-forge
prodigal 2.6.3 h031d066_7 bioconda
protobuf 3.15.8 py38h709712a_0 conda-forge
pyasn1 0.5.1 pyhd8ed1ab_0 conda-forge
pyasn1-modules 0.3.0 pyhd8ed1ab_0 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pyjwt 2.8.0 pyhd8ed1ab_0 conda-forge
pyopenssl 23.2.0 pyhd8ed1ab_1 conda-forge
pysocks 1.7.1 pyha2e5f31_6 conda-forge
python 3.8.15 h257c98d_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-flatbuffers 1.12 pyhd8ed1ab_1 conda-forge
python_abi 3.8 4_cp38 conda-forge
pytz 2023.3.post1 pyhd8ed1ab_0 conda-forge
pyu2f 0.1.5 pyhd8ed1ab_0 conda-forge
re2 2021.04.01 h9c3ff4c_0 conda-forge
readline 8.2 h8228510_1 conda-forge
requests 2.31.0 pyhd8ed1ab_0 conda-forge
requests-oauthlib 1.3.1 pyhd8ed1ab_0 conda-forge
rsa 4.9 pyhd8ed1ab_0 conda-forge
scikit-learn 0.23.2 py38h5d63f67_3 conda-forge
scipy 1.10.1 py38h59b608b_3 conda-forge
setuptools 69.0.3 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
snappy 1.1.10 h9fff704_0 conda-forge
tensorboard 2.4.1 pyhd8ed1ab_1 conda-forge
tensorboard-plugin-wit 1.8.1 pyhd8ed1ab_0 conda-forge
tensorflow 2.4.0 py38h578d9bd_0 conda-forge
tensorflow-base 2.4.0 py38h01d9eeb_0 conda-forge
tensorflow-estimator 2.4.0 pyh9656e83_0 conda-forge
termcolor 2.4.0 pyhd8ed1ab_0 conda-forge
threadpoolctl 3.2.0 pyha21a80b_0 conda-forge
tk 8.6.13 noxft_h4845f30_101 conda-forge
tqdm 4.66.1 pyhd8ed1ab_0 conda-forge
typing-extensions 4.9.0 hd8ed1ab_0 conda-forge
typing_extensions 4.9.0 pyha770c72_0 conda-forge
urllib3 2.1.0 pyhd8ed1ab_0 conda-forge
werkzeug 3.0.1 pyhd8ed1ab_0 conda-forge
wheel 0.42.0 pyhd8ed1ab_0 conda-forge
wrapt 1.16.0 py38h01eb140_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yarl 1.9.3 py38h01eb140_0 conda-forge
zipp 3.17.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 hd590300_5 conda-forge
zstd 1.4.9 ha95c52a_0 conda-forge
```

The installed version is scikit-learn==0.23.2, so that should be okay.
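
For anyone wanting to confirm pins like this without reading through the full package list, here is a minimal sketch using the standard library's `importlib.metadata`, run with the Python interpreter of the checkm2 environment. The `EXPECTED` dictionary and the `check_pins` helper are hypothetical names for illustration; only the two version pins are taken from this thread.

```python
from importlib import metadata

# Pins mentioned in this thread; extend as needed (the dict itself is illustrative).
EXPECTED = {"scikit-learn": "0.23.2", "h5py": "2.10.0"}


def check_pins(expected, lookup=metadata.version):
    """Return {package: (wanted, found)} for each package whose version differs.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(EXPECTED)
    if not problems:
        print("All pinned packages match.")
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result only says the pins match; it does not rule out a conflict in some other package.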

rhysnewell commented 8 months ago

Very weird. I'll need to cross-check the packages in a working environment during my working hours. It looks like you have a valid version of h5py installed as well, but I wonder if a force install might help.

Activate that checkm2 env and run:

which python

If that points to the correct Python path (it should be the one in the checkm2 conda env), then run this:

pip install 'h5py==2.10.0' --force-reinstall

and see if that fixes anything. Apparently the error is coming from h5py, but it shouldn't be occurring with the version you have installed so it's a bit odd.
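
The `which python` check above can also be done from Python itself. This is only a sketch: it assumes the environment was activated with conda, which sets the `CONDA_PREFIX` environment variable, and `interpreter_in_env` is a hypothetical helper name.

```python
import os
import sys
from pathlib import Path


def interpreter_in_env(executable, env_prefix):
    """Return True if the interpreter path lies inside the environment directory.

    Paths are made absolute but symlinks are not followed, which mirrors
    what `which python` reports.
    """
    exe = Path(os.path.abspath(executable))
    prefix = Path(os.path.abspath(env_prefix))
    return prefix in exe.parents


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")  # set by `conda activate`
    if prefix is None:
        print("No conda environment appears to be active.")
    elif interpreter_in_env(sys.executable, prefix):
        print(f"OK: {sys.executable} belongs to {prefix}")
    else:
        print(f"Mismatch: {sys.executable} is outside {prefix}")
```

If the interpreter is outside the active environment, any `pip install` would modify a different environment than the one checkm2 runs in.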

ronjasan commented 8 months ago

I tried --force-reinstall for h5py, but that only resulted in multiple errors.

force reinstall
```
[ronjasan@login 4a1f6cc0815cae56a10728024f448aae_]$ pip install 'h5py==2.10.0' --force-reinstall
Collecting h5py==2.10.0
  Using cached h5py-2.10.0-cp38-cp38-manylinux1_x86_64.whl (2.9 MB)
Collecting numpy>=1.7 (from h5py==2.10.0)
  Using cached numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.6 kB)
Collecting six (from h5py==2.10.0)
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Using cached numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Installing collected packages: six, numpy, h5py
  Attempting uninstall: six
    Found existing installation: six 1.16.0
    Uninstalling six-1.16.0:
      Successfully uninstalled six-1.16.0
  Attempting uninstall: numpy
    Found existing installation: numpy 1.21.6
    Uninstalling numpy-1.21.6:
      Successfully uninstalled numpy-1.21.6
  Attempting uninstall: h5py
    Found existing installation: h5py 2.10.0
    Uninstalling h5py-2.10.0:
      Successfully uninstalled h5py-2.10.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.4.0 requires absl-py~=0.10, but you have absl-py 2.0.0 which is incompatible.
tensorflow 2.4.0 requires six~=1.15.0, but you have six 1.16.0 which is incompatible.
tensorflow 2.4.0 requires termcolor~=1.1.0, but you have termcolor 2.4.0 which is incompatible.
tensorflow 2.4.0 requires typing-extensions~=3.7.4, but you have typing-extensions 4.9.0 which is incompatible.
tensorflow 2.4.0 requires wrapt~=1.12.1, but you have wrapt 1.16.0 which is incompatible.
Successfully installed h5py-2.10.0 numpy-1.24.4 six-1.16.0
```

It also ruined the environment, resulting in AttributeError: module 'numpy' has no attribute 'object' when I ran checkm2 -h inside the active environment.
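
A note on why the reinstall broke things: the pip log above shows numpy going from 1.21.6 to 1.24.4. NumPy deprecated the `np.object` alias in 1.20 and removed it in 1.24, so any code that still refers to `np.object` fails with exactly this `AttributeError` on the newer version. The sketch below checks for that condition; `has_legacy_aliases` is a hypothetical helper name, and whether this is also the cause of the original failure in the pipeline is not confirmed in this thread.

```python
import re


def has_legacy_aliases(numpy_version):
    """Return True if this NumPy version still provides aliases such as np.object.

    The aliases were removed in NumPy 1.24, so anything older still has them.
    """
    match = re.match(r"(\d+)\.(\d+)", numpy_version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {numpy_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) < (1, 24)


if __name__ == "__main__":
    try:
        import numpy as np
    except ImportError:
        print("numpy is not installed in this environment.")
    else:
        print(f"numpy {np.__version__}")
        print("version predates alias removal:", has_legacy_aliases(np.__version__))
        # Direct check of what the failing code needs.
        print("np.object available:", hasattr(np, "object"))
```

In an environment like the one listed earlier (numpy 1.21.6), the version check returns `True`; after the forced reinstall (numpy 1.24.4), it returns `False`.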

However, I found a fix that seems to be working. I swapped out the checkm2.yml file from aviary with the .yml file from checkm2. Then I built the new checkm2 environment with aviary --build, activated the environment and installed checkm2 with pip install CheckM2. I am running the pipeline now, and it has run both rule checkm_semibin and rule checkm_metabat2 successfully.

Thanks for the help, and I'm looking forward to continuing to use aviary!

rhysnewell commented 8 months ago

Good work! Glad the pipeline is working for you now.

Okay cool, yeah, that's what we originally did to get the checkm2 env working, but it must have been updated without me realising. Thank you for documenting your fix; I'll see about implementing it.

Closing this issue for now