marbl / MetaCompass

MetaCompass: Reference-guided Assembly of Metagenomes
https://github.com/marbl/MetaCompass/wiki

Tutorial example 1 only partial output: config.json and/or metacompass.iter0.ref.py issue? #5

Closed alexdthomas closed 5 years ago

alexdthomas commented 6 years ago

Hello,

I've been trying to check my installation of MetaCompass before running it on my data and have hit at least one issue (possibly two, possibly related).

First, I installed the dependencies and made sure they are in my path. Then I ran tutorial example 1:

python go_metacompass.py -r tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta -P tutorial/thao2000.1.fq,tutorial/thao2000.2.fq -o example1_output -m 1 -t 4

and got the following output:

confirming file containing reference genomes exists..
[OK]
checking for dependencies (Bowtie2, Blast, kmermask, Snakemake, etc)
Bowtie2--->[OK]
/home/talex/.pyenv/versions/miniconda3-latest/bin/blastn
Blast+--->[OK]
/home/talex/apps/MetaCompass/bin/kmer-mask
kmer-mask--->[OK]
/home/talex/.pyenv/versions/miniconda3-latest/bin/snakemake
Snakemake--->[OK]
Traceback (most recent call last):
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/io.py", line 697, in _load_configfile
    return yaml.load(f)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/yaml/__init__.py", line 72, in load
    return loader.get_single_data()
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/yaml/constructor.py", line 35, in get_single_data
    node = self.get_single_node()
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/yaml/composer.py", line 127, in compose_mapping_node
    while not self.check_event(MappingEndEvent):
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/yaml/parser.py", line 550, in parse_flow_mapping_key
    "expected ',' or '}', but got %r" % token.id, token.start_mark)
yaml.parser.ParserError: while parsing a flow mapping
in "/home/talex/apps/MetaCompass/snakemake/config.json", line 1, column 1
expected ',' or '}', but got '<scalar>'
  in "/home/talex/apps/MetaCompass/snakemake/config.json", line 17, column 14
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/talex/.pyenv/versions/miniconda3-latest/bin/snakemake", line 6, in <module>
    sys.exit(snakemake.main())
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/__init__.py", line 1017, in main
    max_jobs_per_second=args.max_jobs_per_second)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/__init__.py", line 270, in snakemake
    overwrite_config.update(load_configfile(configfile))
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/io.py", line 708, in load_configfile
    config = _load_configfile(configpath)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/io.py", line 699, in _load_configfile
    raise WorkflowError("Config file is not valid JSON or YAML. "
snakemake.exceptions.WorkflowError: Config file is not valid JSON or YAML. In case of YAML, make sure to not mix whitespace and tab indentation.
ERROR: snakemake command failed; exiting..
touch: cannot touch 'example1_output/thao2000.0.assembly.out/run.fail': No such file or directory

I looked at the file /home/talex/apps/MetaCompass/snakemake/config.json and noticed what I think is incorrect JSON formatting (I'm no expert, so this is just a guess):

{
    "sample" : "",
    "r1" : [""],
    "r2" : [""],
    "ru" : [""],
    "reads" : [""],
    "reference":[""],
    "pickref":"breadth",
    "length":100,
    "prefix":".",
    "memory": 50,
    "nthreads": 64,
    "iter": 1,
    "mincov" : 3,
    "minlen" : 100,
    "mfilter" : 0.00005,
    "cogcov" = 10,
    "mcdir": "."

}
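A strict JSON parser points at the same spot that the YAML fallback in the traceback complains about. Here is a minimal sketch, assuming only Python's standard json module; the helper name find_json_error and the shortened three-key config are made up for illustration and are not part of MetaCompass:

```python
import json


def find_json_error(text):
    """Return None if text parses as JSON, else (line, column, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None


# Shortened stand-in for config.json, keeping the suspicious "cogcov" line.
broken = '{\n    "mfilter" : 0.00005,\n    "cogcov" = 10,\n    "mcdir": "."\n}'
# Same text with '=' replaced by ':' on the "cogcov" line.
fixed = broken.replace('"cogcov" = 10', '"cogcov" : 10')

print("broken:", find_json_error(broken))
print("fixed: ", find_json_error(fixed))
```

To check the real file, the same helper could be fed the contents of snakemake/config.json from the installation directory.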

I think "cogcov" = 10, should be "cogcov" : 10, so I updated it and re-ran the example. I did indeed get a different error:

confirming file containing reference genomes exists..
[OK]
checking for dependencies (Bowtie2, Blast, kmermask, Snakemake, etc)
Bowtie2--->[OK]
/home/talex/.pyenv/versions/miniconda3-latest/bin/blastn
Blast+--->[OK]
/home/talex/apps/MetaCompass/bin/kmer-mask
kmer-mask--->[OK]
/home/talex/.pyenv/versions/miniconda3-latest/bin/snakemake
Snakemake--->[OK]
Full Traceback (most recent call last):
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/__init__.py", line 399, in snakemake
    no_hooks=no_hooks)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/workflow.py", line 308, in execute
    dag.init()
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 115, in init
    job = self.update([job])
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 436, in update
    raise exceptions[0]
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 408, in update
    skip_until_dynamic=skip_until_dynamic)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 468, in update_
    raise ex
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 462, in update_
    job.dynamic_input)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 436, in update
    raise exceptions[0]
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 408, in update
    skip_until_dynamic=skip_until_dynamic)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 468, in update_
    raise ex
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 462, in update_
    job.dynamic_input)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 436, in update
    raise exceptions[0]
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 408, in update
    skip_until_dynamic=skip_until_dynamic)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/dag.py", line 477, in update_
    raise MissingInputException(job.rule, missing_input)
snakemake.exceptions.MissingInputException: Missing input files for rule pilon_map:

MissingInputException in line 72 of /home/talex/apps/MetaCompass/snakemake/metacompass.iter0.ref.py:
Missing input files for rule pilon_map:

unlocking
removed all locks
ERROR: snakemake command failed; exiting..
touch: cannot touch 'example1_output/thao2000.0.assembly.out/run.fail': No such file or directory

Looking at /home/talex/apps/MetaCompass/snakemake/metacompass.iter0.ref.py, I figured it could have something to do with bowtie2...

I found this issue: https://github.com/marbl/MetaCompass/issues/3. So I decided to uninstall MetaCompass, install bowtie2 2.2.9, and create an alias in my .bash_profile to override the system version, then reinstalled MetaCompass. I got the exact same series of errors (before and after updating the config.json file).
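One caveat with the alias approach: shell aliases are generally only expanded in interactive shells and are not inherited by child processes, so the commands Snakemake launches would likely still pick up whichever bowtie2 comes first on PATH. A small sketch to check that, using only shutil.which from the standard library; resolve_tools is a hypothetical helper name, not something MetaCompass provides:

```python
import shutil


def resolve_tools(names):
    """Map each tool name to the executable found on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


# The two bowtie2 executables that appear in the MetaCompass commands.
for tool, path in resolve_tools(["bowtie2", "bowtie2-build"]).items():
    print(tool, "->", path or "not found on PATH")
```

If the printed paths are not the 2.2.9 installation, putting its directory at the front of PATH is probably more reliable than an alias.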

Here's some info on my dependency installations:

pyenv versions
  system
  2.7.12
  2.7.9
  3.1
  3.5.1
  3.6.1
  miniconda2-latest
* miniconda3-latest (set by /home/talex/.pyenv/version)

blastn -version
blastn: 2.6.0+

snakemake --version
3.7.1

bowtie2 -version
Bowtie 2 version 2.2.9 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)

samtools --version
samtools 1.6

megahit -version
megahit: MEGAHIT v1.1.3

java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-0ubuntu0.16.04.1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

plenv versions
  system
* 5.18.0 (set by /home/talex/.plenv/version)

I've also tried installing kmer-mask via the meryl installation from https://sourceforge.net/p/kmer/code/HEAD/tree/trunk/ and also tried adding the path to the MetaCompass installation. The errors above were identical. Any help getting this to work would be appreciated.

Thanks!

vcepeda commented 6 years ago

Hi Alex, I recommend running the stable release 1.1 (https://github.com/marbl/MetaCompass/releases/tag/1.1). The new release will be available next week.

alexdthomas commented 6 years ago

Thank you for your prompt reply! I downloaded and installed the stable version. Running python go_metacompass.py -r tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta -P tutorial/thao2000.1.fq,tutorial/thao2000.2.fq -o example1_output -m 1 -t 4

The output is:

confirming file containing reference genomes exists..
[OK]
checking for dependencies (Bowtie2, Blast, kmermask, Snakemake, etc)
Bowtie2--->[OK]
/home/talex/.pyenv/versions/miniconda3-latest/bin/blastn
Blast+--->[OK]
/home/talex/apps/MetaCompass-1.1/bin/kmer-mask
kmer-mask--->[OK]
/home/talex/.pyenv/versions/miniconda3-latest/bin/snakemake
Snakemake--->[OK]
Provided cores: 4
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       all
        1       assemble_unmapped
        1       bam_sort
        1       bowtie2_map
        1       build_contigs
        1       create_tsv
        1       join_contigs
        1       merge_reads
        1       pilon_contigs
        1       pilon_map
        1       sam_to_bam
        11
Resources before job selection: {'_cores': 4, '_nodes': 9223372036854775807}
Ready jobs (1):
        merge_reads
Selected jobs (1):
        merge_reads
Resources after job selection: {'_cores': 3, '_nodes': 9223372036854775806}
---merge fastq reads
Releasing 1 _cores (now 4).
Releasing 1 _nodes (now 9223372036854775807).
1 of 11 steps (9%) done
Resources before job selection: {'_cores': 4, '_nodes': 9223372036854775807}
Ready jobs (1):
        bowtie2_map
Selected jobs (1):
        bowtie2_map
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}
---Build index .
Full Traceback (most recent call last):
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/executors.py", line 784, in run_wrapper
    version)
  File "/home/talex/apps/MetaCompass-1.1/snakemake/metacompass.iter0.ref.py", line 86, in __rule_bowtie2_map
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/shell.py", line 74, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'bowtie2-build -o 3 --threads 4 -q tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta example1_output/thao2000.0.assembly.out/thao2000.index 1>> example1_output/thao2000.0.assembly.out/thao2000.index 2>&1;bowtie2 -a --end-to-end --sensitive --no-unal -p 4 -x example1_output/thao2000.0.assembly.out/thao2000.index -q -U example1_output/thao2000.merged.fq -S example1_output/thao2000.0.assembly.out/thao2000.sam.all > example1_output/thao2000.0.bowtie2map.log 2>&1; /home/talex/apps/MetaCompass-1.1/bin/best_strata.py example1_output/thao2000.0.assembly.out/thao2000.sam.all example1_output/thao2000.0.assembly.out/thao2000.sam; rm example1_output/thao2000.0.assembly.out/thao2000.sam.all' returned non-zero exit status 1

Error in job bowtie2_map while creating output files example1_output/thao2000.0.assembly.out/thao2000.index, example1_output/thao2000.0.assembly.out/thao2000.index, example1_output/thao2000.0.assembly.out/thao2000.sam.
Full Traceback (most recent call last):
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/executors.py", line 784, in run_wrapper
    version)
  File "/home/talex/apps/MetaCompass-1.1/snakemake/metacompass.iter0.ref.py", line 86, in __rule_bowtie2_map
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/shell.py", line 74, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'bowtie2-build -o 3 --threads 4 -q tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta example1_output/thao2000.0.assembly.out/thao2000.index 1>> example1_output/thao2000.0.assembly.out/thao2000.index 2>&1;bowtie2 -a --end-to-end --sensitive --no-unal -p 4 -x example1_output/thao2000.0.assembly.out/thao2000.index -q -U example1_output/thao2000.merged.fq -S example1_output/thao2000.0.assembly.out/thao2000.sam.all > example1_output/thao2000.0.bowtie2map.log 2>&1; /home/talex/apps/MetaCompass-1.1/bin/best_strata.py example1_output/thao2000.0.assembly.out/thao2000.sam.all example1_output/thao2000.0.assembly.out/thao2000.sam; rm example1_output/thao2000.0.assembly.out/thao2000.sam.all' returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/executors.py", line 247, in _callback
    raise ex
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/concurrent/futures/thread.py", line 55, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/site-packages/snakemake/executors.py", line 798, in run_wrapper
    show_traceback=True))
snakemake.exceptions.RuleException: CalledProcessError in line 50 of /home/talex/apps/MetaCompass-1.1/snakemake/metacompass.iter0.ref.py:
Command 'bowtie2-build -o 3 --threads 4 -q tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta example1_output/thao2000.0.assembly.out/thao2000.index 1>> example1_output/thao2000.0.assembly.out/thao2000.index 2>&1;bowtie2 -a --end-to-end --sensitive --no-unal -p 4 -x example1_output/thao2000.0.assembly.out/thao2000.index -q -U example1_output/thao2000.merged.fq -S example1_output/thao2000.0.assembly.out/thao2000.sam.all > example1_output/thao2000.0.bowtie2map.log 2>&1; /home/talex/apps/MetaCompass-1.1/bin/best_strata.py example1_output/thao2000.0.assembly.out/thao2000.sam.all example1_output/thao2000.0.assembly.out/thao2000.sam; rm example1_output/thao2000.0.assembly.out/thao2000.sam.all' returned non-zero exit status 1
  File "/home/talex/apps/MetaCompass-1.1/snakemake/metacompass.iter0.ref.py", line 50, in __rule_bowtie2_map
RuleException:
CalledProcessError in line 50 of /home/talex/apps/MetaCompass-1.1/snakemake/metacompass.iter0.ref.py:
Command 'bowtie2-build -o 3 --threads 4 -q tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta example1_output/thao2000.0.assembly.out/thao2000.index 1>> example1_output/thao2000.0.assembly.out/thao2000.index 2>&1;bowtie2 -a --end-to-end --sensitive --no-unal -p 4 -x example1_output/thao2000.0.assembly.out/thao2000.index -q -U example1_output/thao2000.merged.fq -S example1_output/thao2000.0.assembly.out/thao2000.sam.all > example1_output/thao2000.0.bowtie2map.log 2>&1; /home/talex/apps/MetaCompass-1.1/bin/best_strata.py example1_output/thao2000.0.assembly.out/thao2000.sam.all example1_output/thao2000.0.assembly.out/thao2000.sam; rm example1_output/thao2000.0.assembly.out/thao2000.sam.all' returned non-zero exit status 1
  File "/home/talex/apps/MetaCompass-1.1/snakemake/metacompass.iter0.ref.py", line 50, in __rule_bowtie2_map
  File "/home/talex/.pyenv/versions/miniconda3-latest/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job bowtie2_map since they might be corrupted:
example1_output/thao2000.0.assembly.out/thao2000.index, example1_output/thao2000.0.assembly.out/thao2000.index
Releasing 4 _cores (now 4).
Releasing 1 _nodes (now 9223372036854775807).
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
unlocking
removing lock
removing lock
removed all locks
ERROR: snakemake command failed; exiting..
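The traceback above reports only that the whole chained shell command returned exit status 1, not which step (bowtie2-build, bowtie2, best_strata.py or rm) actually failed. One way to narrow it down might be to run the steps one at a time and stop at the first non-zero exit code. The following is a rough sketch using only the standard subprocess module; first_failure is a hypothetical helper, and the demo commands are harmless stand-ins rather than the real MetaCompass steps:

```python
import subprocess
import sys


def first_failure(commands):
    """Run commands in order; return (index, returncode, output) of the
    first one that exits non-zero, or None if all succeed."""
    for index, cmd in enumerate(commands):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so the error text is kept
            universal_newlines=True,
        )
        if proc.returncode != 0:
            return index, proc.returncode, proc.stdout
    return None


# Stand-in steps: the first succeeds, the second fails with exit code 3.
demo = [
    [sys.executable, "-c", "print('step ok')"],
    [sys.executable, "-c", "import sys; print('step failed'); sys.exit(3)"],
]
print(first_failure(demo))
```

For the real run, each entry would be one of the sub-commands quoted in the traceback, written as an argument list, for example ["bowtie2-build", "-o", "3", "--threads", "4", "-q", "tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta", "example1_output/thao2000.0.assembly.out/thao2000.index"]. The shell redirections in the original command are dropped here because the helper captures the output itself.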

The example1_output directory contains: Candidatus_Carsonella_ruddii_HT_Thao2000.fasta thao2000.0.assembly.out thao2000.fasta thao2000.marker.match.1.fastq thao2000.merged.fq

I suppose I should wait until next week or so?

alexdthomas commented 5 years ago

I'm trying to run the tutorial example with the new release (https://github.com/marbl/MetaCompass/releases/tag/paper-v1.0). After checking dependencies and installing, I ran python go_metacompass.py -r tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta -P tutorial/thao2000.1.fq,tutorial/thao2000.2.fq -o example1_output -m 1 -t 4. I am no longer getting these errors; instead I am hitting Issue #7. I'll mark this as closed.