MetaCompass: Reference-guided Assembly of Metagenomes
https://github.com/marbl/MetaCompass/wiki

Error in rule polish_contigs (ntHits?) #27

Open lxsteiner opened 10 months ago

lxsteiner commented 10 months ago

Hi,

I was running the tutorial example:

$ python3 go_metacompass.py -r tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta -1 tutorial/thao2000.1.fq -2 tutorial/thao2000.2.fq -l 150 -o tutorial/example1_output -t 30 --notimestamps

but it failed:

REFERENCE genome file provided. Reference Selection step will be skipped.
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 30
Rules claiming more threads will be scaled down.
Job stats:
job                     count
--------------------  -------
all                         1
assemble_unmapped           1
assembled_references        1
bowtie2_map                 1
build_contigs               1
create_tsv                  1
join_contigs                1
mapping_stats               1
polish_contigs              1
polish_map                  1
stats_all                   1
stats_genome                1
total                      12

Select jobs to execute...

[Mon Dec 18 15:14:24 2023]
Job 5: ---Build index .
Reason: Missing output files: tutorial/example1_output/logs/bowtie2map.log, tutorial/example1_output/assembly/mc.sam

bowtie2-build -o 3 --threads 30 -q tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta tutorial/example1_output/assembly/mc.index 1>> tutorial/example1_output/assembly/mc.index 2>&1;bowtie2 -a --end-to-end --sensitive --no-unal -p 30 -x tutorial/example1_output/assembly/mc.index -q -U tutorial/thao2000.1.fq,tutorial/thao2000.2.fq -S tutorial/example1_output/assembly/mc.sam.all > tutorial/example1_output/logs/bowtie2map.log 2>&1; python3 /media/5c679734-9376-4617-815c-d4bd4177b8b2/leon/projects/01/soft/MetaCompass/bin/best_strata.py tutorial/example1_output/assembly/mc.sam.all tutorial/example1_output/assembly/mc.sam; rm tutorial/example1_output/assembly/mc.sam.all && touch tutorial/example1_output/assembly/.run1.ok
[Mon Dec 18 15:14:30 2023]
Finished job 5.
1 of 12 steps (8%) done
Select jobs to execute...

[Mon Dec 18 15:14:30 2023]
Job 4: ---Build contigs .
Reason: Missing output files: tutorial/example1_output/assembly/selected_maps.sam, tutorial/example1_output/assembly/contigs.fasta; Input files updated by another job: tutorial/example1_output/assembly/mc.sam

/media/5c679734-9376-4617-815c-d4bd4177b8b2/leon/projects/01/soft/MetaCompass/bin/buildcontig -r tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta -s tutorial/example1_output/assembly/mc.sam -o tutorial/example1_output/assembly -c 1 -l 1 -n F -b F -u F -k breadth  1>> tutorial/example1_output/logs/buildcontigs.log 2>&1 && touch tutorial/example1_output/assembly/.run2.ok
[Mon Dec 18 15:14:31 2023]
Finished job 4.
2 of 12 steps (17%) done
Select jobs to execute...

[Mon Dec 18 15:14:32 2023]
Job 3: ---ntEDit polish contigs .
Reason: Missing output files: tutorial/example1_output/error_correction/contigs_edited.fa; Input files updated by another job: tutorial/example1_output/assembly/contigs.fasta

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 30
Rules claiming more threads will be scaled down.
Select jobs to execute...
touch tutorial/example1_output/error_correction/contigs_edited.fa
/usr/bin/time -v -o tutorial/example1_output/error_correction/solidBF_c1.time nthits -c1 -b 36 -k 25 -t16 --outbloom tutorial/thao2000.1.fq tutorial/thao2000.2.fq -p tutorial/example1_output/error_correction/solidBF_c1> tutorial/example1_output/logs/polish.log 2>&1
[Mon Dec 18 15:14:33 2023]
Error in rule polish_contigs:
    jobid: 0
    input: tutorial/example1_output/assembly/contigs.fasta, tutorial/thao2000.1.fq, tutorial/thao2000.2.fq
    output: tutorial/example1_output/error_correction/contigs_edited.fa
    log: tutorial/example1_output/logs/polish.log (check log file(s) for error details)

RuleException:
CalledProcessError in file /media/5c679734-9376-4617-815c-d4bd4177b8b2/leon/projects/01/soft/MetaCompass/snakemake/metacompass.ref.paired.py, line 134:
Command 'set -euo pipefail;  /usr/bin/time -v -o tutorial/example1_output/error_correction/solidBF_c1.time nthits -c1 -b 36 -k 25 -t16 --outbloom tutorial/thao2000.1.fq tutorial/thao2000.2.fq -p tutorial/example1_output/error_correction/solidBF_c1> tutorial/example1_output/logs/polish.log 2>&1' returned non-zero exit status 1.
  File "/media/5c679734-9376-4617-815c-d4bd4177b8b2/leon/projects/01/soft/MetaCompass/snakemake/metacompass.ref.paired.py", line 134, in __rule_polish_contigs
  File "/home/leon/miniconda3/envs/metacompass_env/lib/python3.10/concurrent/futures/thread.py", line 58, in run
Removing output files of failed job polish_contigs since they might be corrupted:
tutorial/example1_output/error_correction/contigs_edited.fa
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-12-18T151423.726751.snakemake.log
ERROR: snakemake command failed; exiting..
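
For anyone triaging a similar failure, here is a minimal sketch that pulls the failed rule name and its log path out of Snakemake output like the above. The function name `failed_rule_logs` is hypothetical, and the sketch assumes the "Error in rule ...:" and "log: ..." lines look exactly as they do in this run.

```shell
#!/bin/sh
# failed_rule_logs (hypothetical helper): reads Snakemake console output on
# stdin and prints "<rule><TAB><log path>" for every failed rule it finds.
failed_rule_logs() {
    awk '
        # Remember the rule name from "Error in rule <name>:".
        /^Error in rule / { rule = $4; sub(/:$/, "", rule) }
        # The first "log:" line after it carries the path to the rule log.
        rule != "" && $1 == "log:" { print rule "\t" $2; rule = "" }
    '
}

# Demo on an excerpt copied from the output above.
failed_rule_logs <<'EOF'
Error in rule polish_contigs:
    jobid: 0
    input: tutorial/example1_output/assembly/contigs.fasta, tutorial/thao2000.1.fq, tutorial/thao2000.2.fq
    output: tutorial/example1_output/error_correction/contigs_edited.fa
    log: tutorial/example1_output/logs/polish.log (check log file(s) for error details)
EOF
```

On a real run, the same function could presumably be fed the complete log named in the output, e.g. `failed_rule_logs < .snakemake/log/2023-12-18T151423.726751.snakemake.log`.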

Contents of tutorial/example1_output/logs/polish.log:

Unknown argument: -c1
Usage: ntHits --frequencies VAR --out-file VAR [--min-count VAR] [--max-count VAR] [--kmer-length VAR] [--seeds VAR] [-h] [--error-rate VAR] [--threads VAR] [--solid] [--long-mode] out_type files

Filters k-mers based on counts (cmin <= count <= cmax) in input files

Positional arguments:
  out_type              Output format: Bloom filter 'bf', counting Bloom filter ('cbf'), or table ('table') [required]
  files                 Input files [nargs: 0 or more] [required]

Optional arguments:
  -f, --frequencies     Frequency histogram file (e.g. from ntCard) [required]
  -o, --out-file        Output file's name [required]
  -cmin, --min-count    Minimum k-mer count (>=1), ignored if using --solid [default: 1]
  -cmax, --max-count    Maximum k-mer count (<=254) [default: 254]
  -k, --kmer-length     k-mer length, ignored if using spaced seeds (-s) [default: 64]
  -s, --seeds           If specified, use spaced seeds (separate with commas, e.g. 10101,11011)
  -h, --num-hashes      Number of hashes to generate per k-mer/spaced seed [default: 3]
  -p, --error-rate      Target Bloom filter error rate [default: 0.0001]
  -t, --threads         Number of parallel threads [default: 4]
  --solid               Automatically tune 'cmin' to filter out erroneous k-mers
  --long-mode           Optimize data reader for long sequences (>5kbp)
  -v                    Level of details printed to stdout (-v: normal, -vv detailed)

Copyright 2023 Canada's Michael Smith Genome Science Centre
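
To make the mismatch explicit, here is a rough sketch that checks each option from the failing nthits command line against the option list printed in polish.log. The helper names (`usage_text`, `has_option`, `check_options`) are hypothetical, and the check is only a whole-word text search over the usage output, not a real argument parser.

```shell
#!/bin/sh
# Option lines copied verbatim from the usage text in polish.log.
usage_text() {
cat <<'EOF'
  -f, --frequencies     Frequency histogram file (e.g. from ntCard) [required]
  -o, --out-file        Output file's name [required]
  -cmin, --min-count    Minimum k-mer count (>=1), ignored if using --solid [default: 1]
  -cmax, --max-count    Maximum k-mer count (<=254) [default: 254]
  -k, --kmer-length     k-mer length, ignored if using spaced seeds (-s) [default: 64]
  -s, --seeds           If specified, use spaced seeds (separate with commas, e.g. 10101,11011)
  -h, --num-hashes      Number of hashes to generate per k-mer/spaced seed [default: 3]
  -p, --error-rate      Target Bloom filter error rate [default: 0.0001]
  -t, --threads         Number of parallel threads [default: 4]
  --solid               Automatically tune 'cmin' to filter out erroneous k-mers
  --long-mode           Optimize data reader for long sequences (>5kbp)
  -v                    Level of details printed to stdout (-v: normal, -vv detailed)
EOF
}

# has_option (hypothetical helper): succeeds if option $1 appears as a whole
# word in the text on stdin, so "-c" does not match "-cmin".
has_option() {
    grep -q -w -F -e "$1"
}

# Options taken from the nthits call issued by the polish_contigs rule.
check_options() {
    for opt in -c -b -k -t -p --outbloom; do
        if usage_text | has_option "$opt"; then
            printf '%s: listed\n' "$opt"
        else
            printf '%s: NOT listed\n' "$opt"
        fi
    done
}

check_options
```

In this sketch `-c`, `-b` and `--outbloom` come back as not listed. `-p` is listed, but the usage text describes it as an error rate, whereas the pipeline passes it an output path, so the command line seems to be written for a different nthits interface than the one installed.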

I installed the dependencies in a conda environment:

 mamba create -n metacompass_env -c conda-forge -c bioconda "python=>3.1" biopython "snakemake>=3.7.1" "blast>=2.4.0" "bowtie2>=2.2.9" "mash>=2.1" "samtools>=1.2.13" "megahit>=1.0.6" nthits ntedit meryl

Here are the versions:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
aioeasywebdav             2.4.0              pyha770c72_0    conda-forge
aiohttp                   3.9.1           py310h2372a71_0    conda-forge
aiosignal                 1.3.1              pyhd8ed1ab_0    conda-forge
amply                     0.1.6              pyhd8ed1ab_0    conda-forge
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
async-timeout             4.0.3              pyhd8ed1ab_0    conda-forge
attmap                    0.13.2             pyhd8ed1ab_0    conda-forge
attrs                     23.1.0             pyh71513ae_1    conda-forge
bcrypt                    4.1.2           py310hcb5633a_0    conda-forge
**biopython                 1.81            py310h2372a71_1    conda-forge**
**blast                     2.15.0          pl5321h6f7f691_1    bioconda**
boto3                     1.34.2             pyhd8ed1ab_0    conda-forge
botocore                  1.34.2             pyhd8ed1ab_0    conda-forge
**bowtie2                   2.5.2           py310ha0a81b8_0    bioconda**
brotli-python             1.1.0           py310hc6cd4ac_1    conda-forge
btllib                    1.7.0           py310h0dbaff4_0    bioconda
bzip2                     1.0.8                hd590300_5    conda-forge
c-ares                    1.24.0               hd590300_0    conda-forge
ca-certificates           2023.11.17           hbcca054_0    conda-forge
cachetools                5.3.2              pyhd8ed1ab_0    conda-forge
capnproto                 0.9.1                ha19adfc_4    conda-forge
certifi                   2023.11.17         pyhd8ed1ab_0    conda-forge
cffi                      1.16.0          py310h2fee648_0    conda-forge
charset-normalizer        3.3.2              pyhd8ed1ab_0    conda-forge
coin-or-cbc               2.10.10              h9002f0b_0    conda-forge
coin-or-cgl               0.60.7               h516709c_0    conda-forge
coin-or-clp               1.17.8               h1ee7a9c_0    conda-forge
coin-or-osi               0.108.8              ha2443b9_0    conda-forge
coin-or-utils             2.11.9               hee58242_0    conda-forge
coincbc                   2.10.10           0_metapackage    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
configargparse            1.7                pyhd8ed1ab_0    conda-forge
connection_pool           0.0.3              pyhd3deb0d_0    conda-forge
cryptography              41.0.7          py310hb8475ec_1    conda-forge
curl                      8.5.0                hca28451_0    conda-forge
datrie                    0.8.2           py310h2372a71_7    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
docutils                  0.20.1          py310hff52083_3    conda-forge
dpath                     2.1.6              pyha770c72_0    conda-forge
dropbox                   11.36.2            pyhd8ed1ab_0    conda-forge
eido                      0.2.2              pyhd8ed1ab_0    conda-forge
entrez-direct             16.2                 he881be0_1    bioconda
exceptiongroup            1.2.0              pyhd8ed1ab_0    conda-forge
filechunkio               1.8                        py_2    conda-forge
frozenlist                1.4.1           py310h2372a71_0    conda-forge
ftputil                   5.0.4              pyhd8ed1ab_0    conda-forge
gettext                   0.21.1               h27087fc_0    conda-forge
gitdb                     4.0.11             pyhd8ed1ab_0    conda-forge
gitpython                 3.1.40             pyhd8ed1ab_0    conda-forge
google-api-core           2.15.0             pyhd8ed1ab_0    conda-forge
google-api-python-client  2.111.0            pyhd8ed1ab_0    conda-forge
google-auth               2.25.2             pyhca7485f_0    conda-forge
google-auth-httplib2      0.2.0              pyhd8ed1ab_0    conda-forge
google-cloud-core         2.4.1              pyhd8ed1ab_0    conda-forge
google-cloud-storage      2.14.0             pyhca7485f_0    conda-forge
google-crc32c             1.1.2           py310hc5c09a0_5    conda-forge
google-resumable-media    2.7.0              pyhd8ed1ab_0    conda-forge
googleapis-common-protos  1.62.0             pyhd8ed1ab_0    conda-forge
grpcio                    1.60.0          py310h1b8f574_0    conda-forge
gsl                       2.7                  he838d99_0    conda-forge
gzip                      1.13                 hd590300_0    conda-forge
htslib                    1.19                 h81da01d_0    bioconda
httplib2                  0.22.0             pyhd8ed1ab_0    conda-forge
humanfriendly             10.0               pyhd8ed1ab_6    conda-forge
icu                       73.2                 h59595ed_0    conda-forge
idna                      3.6                pyhd8ed1ab_0    conda-forge
importlib_resources       6.1.1              pyhd8ed1ab_0    conda-forge
iniconfig                 2.0.0              pyhd8ed1ab_0    conda-forge
jinja2                    3.1.2              pyhd8ed1ab_1    conda-forge
jmespath                  1.0.1              pyhd8ed1ab_0    conda-forge
jsonschema                4.20.0             pyhd8ed1ab_0    conda-forge
jsonschema-specifications 2023.11.2          pyhd8ed1ab_0    conda-forge
jupyter_core              5.5.0           py310hff52083_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
krb5                      1.21.2               h659d440_0    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
libabseil                 20230802.1      cxx17_h59595ed_0    conda-forge
libblas                   3.9.0           20_linux64_openblas    conda-forge
libcblas                  3.9.0           20_linux64_openblas    conda-forge
libcrc32c                 1.1.2                h9c3ff4c_0    conda-forge
libcurl                   8.5.0                hca28451_0    conda-forge
libdeflate                1.19                 hd590300_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc                    7.2.0                h69d50b8_2    conda-forge
libgcc-ng                 13.2.0               h807b86a_3    conda-forge
libgfortran-ng            13.2.0               h69a702a_3    conda-forge
libgfortran5              13.2.0               ha4646dd_3    conda-forge
libgomp                   13.2.0               h807b86a_3    conda-forge
libgrpc                   1.60.0               hd6c4280_0    conda-forge
libhwloc                  2.9.3           default_h554bfaf_1009    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libidn2                   2.3.4                h166bdaf_0    conda-forge
liblapack                 3.9.0           20_linux64_openblas    conda-forge
liblapacke                3.9.0           20_linux64_openblas    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libopenblas               0.3.25          pthreads_h413a1c8_0    conda-forge
libprotobuf               4.24.4               hf27288f_0    conda-forge
libre2-11                 2023.06.02           h7a70373_0    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libsqlite                 3.44.2               h2797004_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              13.2.0               h7e041cc_3    conda-forge
libunistring              0.9.10               h7f98852_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libxml2                   2.11.6               h232c23b_0    conda-forge
libzlib                   1.2.13               hd590300_5    conda-forge
logmuse                   0.2.6              pyh8c360ce_0    conda-forge
lrzip                     0.621                hedc9cd1_7    bioconda
lzo                       2.10              h516909a_1000    conda-forge
markdown-it-py            3.0.0              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.3           py310h2372a71_1    conda-forge
**mash                      2.3                  ha9a2dd8_3    bioconda**
mdurl                     0.1.0              pyhd8ed1ab_0    conda-forge
**megahit                   1.2.9                h43eeafb_4    bioconda**
meryl                     2013                          0    bioconda
multidict                 6.0.4           py310h2372a71_1    conda-forge
nbformat                  5.9.2              pyhd8ed1ab_0    conda-forge
ncbi-vdb                  3.0.9                hdbdd923_0    bioconda
ncurses                   6.4                  h59595ed_2    conda-forge
**ntedit                    1.3.5                hd03093a_1    bioconda**
**nthits                    1.0.2                h4ac6f70_0    bioconda**
numpy                     1.26.2          py310hb13e2d6_0    conda-forge
oauth2client              4.1.3                      py_0    conda-forge
openssl                   3.2.0                hd590300_1    conda-forge
ossuuid                   1.6.2             hf484d3e_1000    conda-forge
packaging                 23.2               pyhd8ed1ab_0    conda-forge
pandas                    2.1.4           py310hcc13569_0    conda-forge
paramiko                  3.3.1              pyhd8ed1ab_0    conda-forge
pcre                      8.45                 h9c3ff4c_0    conda-forge
peppy                     0.35.7             pyhd8ed1ab_0    conda-forge
perl                      5.22.2.1                      0    conda-forge
perl-archive-tar          2.18                          1    bioconda
perl-common-sense         3.74                          0    bioconda
perl-exporter-tiny        0.042                         1    bioconda
perl-json                 2.90                          1    bioconda
perl-json-xs              2.34                          0    bioconda
perl-list-moreutils       0.413                         1    bioconda
perl-threaded             5.32.1               hdfd78af_1    bioconda
perl-uri                  1.71                          0    bioconda
perl-xml-libxml           2.0124                        0    bioconda
perl-xml-namespacesupport 1.11                          0    bioconda
perl-xml-sax              0.99                          0    bioconda
perl-xml-sax-base         1.08                          0    bioconda
pigz                      2.8                  h2797004_0    conda-forge
pip                       23.3.2             pyhd8ed1ab_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
plac                      1.4.2              pyhd8ed1ab_0    conda-forge
platformdirs              4.1.0              pyhd8ed1ab_0    conda-forge
pluggy                    1.3.0              pyhd8ed1ab_0    conda-forge
ply                       3.11                       py_1    conda-forge
prettytable               3.9.0              pyhd8ed1ab_0    conda-forge
protobuf                  4.24.4          py310h620c231_0    conda-forge
psutil                    5.9.7           py310h2372a71_0    conda-forge
pulp                      2.7.0           py310hff52083_1    conda-forge
pyasn1                    0.5.1              pyhd8ed1ab_0    conda-forge
pyasn1-modules            0.3.0              pyhd8ed1ab_0    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pygments                  2.17.2             pyhd8ed1ab_0    conda-forge
pynacl                    1.5.0           py310h2372a71_3    conda-forge
pyopenssl                 23.3.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.1.1              pyhd8ed1ab_0    conda-forge
pysftp                    0.2.9                      py_1    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
pytest                    7.4.3              pyhd8ed1ab_0    conda-forge
**python                    3.10.13         hd12c33a_0_cpython    conda-forge**
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-fastjsonschema     2.19.0             pyhd8ed1ab_0    conda-forge
python-irodsclient        1.1.9              pyhd8ed1ab_0    conda-forge
python-tzdata             2023.3             pyhd8ed1ab_0    conda-forge
python_abi                3.10                    4_cp310    conda-forge
pytz                      2023.3.post1       pyhd8ed1ab_0    conda-forge
pyu2f                     0.1.5              pyhd8ed1ab_0    conda-forge
pyyaml                    6.0.1           py310h2372a71_1    conda-forge
re2                       2023.06.02           h2873b5e_0    conda-forge
readline                  8.2                  h8228510_1    conda-forge
referencing               0.32.0             pyhd8ed1ab_0    conda-forge
requests                  2.31.0             pyhd8ed1ab_0    conda-forge
reretry                   0.11.8             pyhd8ed1ab_0    conda-forge
rich                      13.7.0             pyhd8ed1ab_0    conda-forge
rpds-py                   0.13.2          py310hcb5633a_0    conda-forge
rsa                       4.9                pyhd8ed1ab_0    conda-forge
s3transfer                0.9.0              pyhd8ed1ab_0    conda-forge
**samtools                  1.19                 h50ea8bc_0    bioconda**
setuptools                68.2.2             pyhd8ed1ab_0    conda-forge
setuptools-scm            8.0.4              pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
slacker                   0.14.0                     py_0    conda-forge
smart_open                6.4.0              pyhd8ed1ab_0    conda-forge
smmap                     5.0.0              pyhd8ed1ab_0    conda-forge
**snakemake                 7.32.4               hdfd78af_1    bioconda**
snakemake-minimal         7.32.4             pyhdfd78af_1    bioconda
stone                     3.3.1              pyhd8ed1ab_0    conda-forge
stopit                    1.1.2                      py_0    conda-forge
tabulate                  0.9.0              pyhd8ed1ab_1    conda-forge
tar                       1.34                 hb2e2bae_1    conda-forge
tbb                       2021.11.0            h00ab1b0_0    conda-forge
throttler                 1.2.2              pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
toposort                  1.10               pyhd8ed1ab_0    conda-forge
traitlets                 5.14.0             pyhd8ed1ab_0    conda-forge
typing-extensions         4.9.0                hd8ed1ab_0    conda-forge
typing_extensions         4.9.0              pyha770c72_0    conda-forge
tzdata                    2023c                h71feb2d_0    conda-forge
ubiquerg                  0.6.3              pyhd8ed1ab_0    conda-forge
uritemplate               4.1.1              pyhd8ed1ab_0    conda-forge
urllib3                   1.26.18            pyhd8ed1ab_0    conda-forge
veracitools               0.1.3                      py_0    conda-forge
wcwidth                   0.2.12             pyhd8ed1ab_0    conda-forge
wget                      1.20.3               ha35d2d1_1    conda-forge
wheel                     0.42.0             pyhd8ed1ab_0    conda-forge
wrapt                     1.16.0          py310h2372a71_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
yarl                      1.9.3           py310h2372a71_0    conda-forge
yte                       1.5.4              pyha770c72_0    conda-forge
zip                       3.0                  hd590300_3    conda-forge
zipp                      3.17.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               hd590300_5    conda-forge
zstd                      1.5.5                hfc55251_0    conda-forge
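
Since the full listing is long, here is a small sketch that reduces `conda list`-style output to just the packages of interest. The function name `pick_versions` is hypothetical. It assumes the package name is in the first column and the version in the second, and it strips the `**` emphasis markers used in the listing above.

```shell
#!/bin/sh
# pick_versions (hypothetical helper)
#   $1    : space-separated package names to keep
#   stdin : a `conda list`-style table
# Prints "name=version" for each matching row, in input order.
pick_versions() {
    awk -v wanted="$1" '
        BEGIN {
            n = split(wanted, names, " ")
            for (i = 1; i <= n; i++) keep[names[i]] = 1
        }
        /^#/ { next }                      # skip the header row
        { gsub(/\*/, "", $1) }             # drop emphasis markers from the name
        ($1 in keep) { print $1 "=" $2 }
    '
}

# Demo on rows copied from the listing above.
pick_versions "nthits ntedit meryl" <<'EOF'
# Name                    Version                   Build  Channel
meryl                     2013                          0    bioconda
**ntedit                    1.3.5                hd03093a_1    bioconda**
**nthits                    1.0.2                h4ac6f70_0    bioconda**
numpy                     1.26.2          py310hb13e2d6_0    conda-forge
EOF
```

Against a live environment, the input would presumably come from `conda list -n metacompass_env` piped into the function.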

Is there a version requirement for nthits? None is specified in the dependency list. If not, where does the error come from?

Also, what is go_metacompass2.py, and how does it differ from go_metacompass.py?

Thanks!