a-h-b / binny

GNU General Public License v3.0

Issue with TESTRUN #38

Closed jp589 closed 2 years ago

jp589 commented 2 years ago

Hello,

I'm very interested in using binny. I've gone through the Quickstart guide and downloaded and installed everything successfully, but the test run in step 4 does not complete. When I enter ./binny -l -n "TESTRUN" -r config/config.test.yaml, everything completes except for Python Binny. I have tried a fresh installation with no modifications, as well as a fresh installation of the manuscript version. Neither has succeeded. I am installing binny on my university's high-performance computing grid, which runs jobs through Slurm.

The terminal output is as follows.

(base) [ab1234@abc1 binny]$ ./binny -l -n "TESTRUN" -r config/config.test.yaml
Will use conda source path: ./binny/conda
Running workflow in current session - don't use this setting except with small datasets and databases.
./binny/conda
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job      count    min threads    max threads
ALL          1              1              1
binny        1              1              1
total        2              1              1

Select jobs to execute...

[Fri Sep 30 13:31:56 2022] Job 1: binny: Running Python Binny.

Activating conda environment: ./binny/conda/48ad8fcdeff72f1a87686f510a50eb8e
Activating conda environment: ./binny/conda/48ad8fcdeff72f1a87686f510a50eb8e
Traceback (most recent call last):
  File "./binny/test_output/.snakemake/scripts/tmpci0qaonu.binny_main.py", line 58, in <module>
    from binny_functions import *
  File "./binny/workflow/scripts/binny_functions.py", line 12, in <module>
    import hdbscan
  File "./binny/conda/48ad8fcdeff72f1a87686f510a50eb8e/lib/python3.8/site-packages/hdbscan/__init__.py", line 1, in <module>
    from .hdbscan_ import HDBSCAN, hdbscan
  File "./binny/conda/48ad8fcdeff72f1a87686f510a50eb8e/lib/python3.8/site-packages/hdbscan/hdbscan_.py", line 509, in <module>
    memory=Memory(cachedir=None, verbose=0),
TypeError: __init__() got an unexpected keyword argument 'cachedir'
[Fri Sep 30 13:33:01 2022]
Error in rule binny:
    jobid: 1
    output: ./binny/test_output/binny.done
    log: ./binny/test_output/logs/binning_binny.log (check log file(s) for error message)
    conda-env: ./binny/conda/48ad8fcdeff72f1a87686f510a50eb8e

Traceback (most recent call last):
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 593, in _callback
    raise ex
  File "./binny/conda/snakemake_env/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 579, in cached_or_run
    run_func(*args)
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 2460, in run_wrapper
    raise ex
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 2357, in run_wrapper
    run(
  File "./binny/Snakefile", line 668, in __rule_binny
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/script.py", line 1369, in script
    executor.evaluate()
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/script.py", line 381, in evaluate
    self.execute_script(fd.name, edit=edit)
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/script.py", line 582, in execute_script
    self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/script.py", line 414, in _execute_cmd
    return shell(
  File "./binny/conda/snakemake_env/lib/python3.8/site-packages/snakemake/shell.py", line 265, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'source ./binny/conda/snakemake_env/bin/activate './binny/conda/48ad8fcdeff72f1a87686f510a50eb8e'; set -euo pipefail; python ./binny/test_output/.snakemake/scripts/tmpci0qaonu.binny_main.py' returned non-zero exit status 1.
Job failed, going on with independent jobs.
Exiting because a job execution failed. Look above for error message
Complete log: ./binny/.snakemake/log/2022-09-30T133112.442625.snakemake.log
Building DAG of jobs...
Creating report...
Missing metadata for file ./binny/test_output/binny.done. Maybe metadata was deleted or it was created using an older version of Snakemake. This is a non critical warning.
Downloading resources and rendering HTML.
Report created: report.html.

Although the output mentions a snakemake.log, that file is essentially an abbreviated version of the output above. The binning_binny.log it points to exists but is empty.

Any help identifying the cause of this error would be much appreciated! Thanks!

ohickl commented 2 years ago

Hi, thanks for your interest! I could reproduce it; it is a problem with HDBSCAN (see the HDBSCAN issue). It's unfortunately out of our hands at the moment. We will release an update as soon as they fix it. In the meantime, you can work around it by running:

conda_envs_path="path/to/binny/envs/conda" # e.g. default "path/to/binny/dir/conda"
env_manager="mamba" # or "conda"
# Find the environment whose YAML file declares the name 'binny_linux' and activate it.
for i in "${conda_envs_path}"/*.yaml; do
  env_name=$(head -n 1 "${i}" | cut -d' ' -f2)
  if [[ ${env_name} == 'binny_linux' ]]; then
    binny_env=${i%.yaml}
    echo "Loading binny env from: ${binny_env}"
    ${env_manager} activate "${binny_env}"
  fi
done

# Downgrade joblib inside the activated environment, then leave it again.
${env_manager} install joblib==1.1.0 --yes
${env_manager} deactivate

Be warned, though, that joblib versions 1.1.0 and below seem to have security problems, as mentioned in the HDBSCAN issue.

Best, Oskar
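
The `TypeError` in the traceback says that HDBSCAN passes a `cachedir` keyword that the installed joblib `Memory` constructor does not accept, which is why pinning an older joblib helps. Below is a minimal sketch of how one could check for this kind of mismatch with the standard library's `inspect` module. `accepts_keyword`, `OldStyleMemory`, and `NewStyleMemory` are hypothetical names introduced for illustration; the two classes are simplified stand-ins, not joblib's actual code.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for a constructor that still takes `cachedir`.
class OldStyleMemory:
    def __init__(self, cachedir=None, verbose=0):
        self.cachedir = cachedir
        self.verbose = verbose


# Hypothetical stand-in for a constructor that no longer takes `cachedir`.
class NewStyleMemory:
    def __init__(self, location=None, verbose=0):
        self.location = location
        self.verbose = verbose


if __name__ == "__main__":
    for cls in (OldStyleMemory, NewStyleMemory):
        print(cls.__name__, "accepts cachedir:", accepts_keyword(cls, "cachedir"))
```

Inside the activated binny environment, the same helper could be pointed at the real class, for example `accepts_keyword(joblib.Memory, "cachedir")`, assuming joblib is importable there.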

jp589 commented 2 years ago

Thank you very much for your quick reply and suggested fix.

I was able to get the test run to finish with the downgrade of joblib to 1.1.0.

I will keep an eye out for your future update.
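
To confirm that a downgrade like the one above took effect, the installed version can be read from the package metadata of the active environment. The sketch below uses only `importlib.metadata` from the standard library (Python 3.8 or newer). `parse_release`, `installed_version`, and `is_pinned` are hypothetical helper names, and the script is assumed to be run with the Python interpreter of the binny environment.

```python
from importlib import metadata


def parse_release(text):
    """Turn a dotted version string such as '1.1.0' into a tuple of ints.

    Parsing stops at the first component that does not start with a digit,
    and a non-digit suffix on a component (e.g. 'rc1') is ignored.
    """
    release = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        release.append(int(digits))
    return tuple(release)


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_pinned(package, wanted):
    """Return True if `package` is installed at exactly the release `wanted`."""
    found = installed_version(package)
    return found is not None and parse_release(found) == parse_release(wanted)


if __name__ == "__main__":
    # '1.1.0' is the version used for the workaround in this thread.
    print("joblib installed:", installed_version("joblib"))
    print("joblib pinned to 1.1.0:", is_pinned("joblib", "1.1.0"))
```

If the second line prints `False` after the install command, the most likely explanation is that the script ran in a different environment than the one Snakemake activates for the binny rule.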