vlasmirnov / MAGUS

Graph Clustering Merger

StopIteration error #15

Open joelnitta opened 2 years ago

joelnitta commented 2 years ago

I am trying to run the example python3 ../magus.py -d outputs -i unaligned_sequences.txt -o magus_result.txt and getting this error:

(magus-env) root@39db7c458bba:/wd/MAGUS/example# python3 ../magus.py -d outputs -i unaligned_sequences.txt -o magus_result.txt
MAGUS was run with: ../magus.py -d outputs -i unaligned_sequences.txt -o magus_result.txt
Running a task, output file: /wd/MAGUS/example/magus_result.txt
Aligning sequences /wd/MAGUS/example/unaligned_sequences.txt
Read 1000 sequences from /wd/MAGUS/example/unaligned_sequences.txt ..
Building PASTA-style FastTree initial tree on /wd/MAGUS/example/unaligned_sequences.txt with skeleton size 300..
Running a task, output file: /wd/MAGUS/example/outputs/decomposition/initial_tree/initial_align.txt
Running an external tool, command: /miniconda3/envs/magus-env/bin/mafft --localpair --maxiterate 1000 --ep 0.123 --quiet --thread 128 --anysymbol /wd/MAGUS/example/outputs/decomposition/initial_tree/skeleton_sequences.txt > /wd/MAGUS/example/outputs/decomposition/initial_tree/temp_initial_align.txt
Completed a task, output file: /wd/MAGUS/example/outputs/decomposition/initial_tree/initial_align.txt
Running a task, output file: /wd/MAGUS/example/outputs/decomposition/initial_tree/skeleton_hmm/hmm_model.txt
Running an external tool, command: /miniconda3/envs/magus-env/bin/hmmbuild --ere 0.59 --cpu 1 --symfrac 0.0 --informat afa /wd/MAGUS/example/outputs/decomposition/initial_tree/skeleton_hmm/temp_hmm_model.txt /wd/MAGUS/example/outputs/decomposition/initial_tree/initial_align.txt
Completed a task, output file: /wd/MAGUS/example/outputs/decomposition/initial_tree/skeleton_hmm/hmm_model.txt
Read 700 sequences from /wd/MAGUS/example/outputs/decomposition/initial_tree/queries.txt ..
Running a task, output file: /wd/MAGUS/example/outputs/decomposition/initial_tree/chunks_queries/queries_chunk_1_aligned.txt
Running an external tool, command: /miniconda3/envs/magus-env/bin/hmmalign -o /wd/MAGUS/example/outputs/decomposition/initial_tree/chunks_queries/temp_queries_chunk_1_aligned.txt /wd/MAGUS/example/outputs/decomposition/initial_tree/skeleton_hmm/hmm_model.txt /wd/MAGUS/example/outputs/decomposition/initial_tree/chunks_queries/queries_chunk_1.txt
Completed a task, output file: /wd/MAGUS/example/outputs/decomposition/initial_tree/chunks_queries/queries_chunk_1_aligned.txt
Read 1000 sequences from /wd/MAGUS/example/outputs/decomposition/initial_tree/initial_align.txt ..
Found 100% ACGT-N, assuming DNA..
Data type wasn't specified. Inferred data type DNA from /wd/MAGUS/example/outputs/decomposition/initial_tree/initial_align.txt
Running a task, output file: /wd/MAGUS/example/outputs/decomposition/initial_tree/initial_tree.tre
Running an external tool, command: /miniconda3/envs/magus-env/bin/fasttree -nt -gtr -fastest -nosupport /wd/MAGUS/example/outputs/decomposition/initial_tree/initial_align.txt > /wd/MAGUS/example/outputs/decomposition/initial_tree/temp_initial_tree.tre
Completed a task, output file: /wd/MAGUS/example/outputs/decomposition/initial_tree/initial_tree.tre
Built initial tree on /wd/MAGUS/example/unaligned_sequences.txt in 183.0174605846405 sec..
Using target subset size of 50, and maximum number of subsets 25..
Read 1000 sequences from /wd/MAGUS/example/unaligned_sequences.txt ..
Task for /wd/MAGUS/example/magus_result.txt threw an exception:
generator raised StopIteration
Traceback (most recent call last):
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/dataio/newickreader.py", line 306, in tree_iter
    raise StopIteration
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/wd/MAGUS/tasks/task.py", line 59, in run
    func(**self.taskArgs)
  File "/wd/MAGUS/align/aligner.py", line 45, in runAlignmentTask
    decomposeSequences(context)
  File "/wd/MAGUS/align/decompose/decomposer.py", line 44, in decomposeSequences
    buildDecomposition(context, subsetsDir)
  File "/wd/MAGUS/align/decompose/decomposer.py", line 66, in buildDecomposition
    context.subsetPaths = treeutils.decomposeGuideTree(subsetsDir, context.sequencesPath, guideTreePath,
  File "/wd/MAGUS/helpers/treeutils.py", line 96, in decomposeGuideTree
    guideTree = dendropy.Tree.get(path=guideTreePath, schema="newick", preserve_underscores=True)
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/datamodel/treemodel.py", line 2732, in get
    return cls._get_from(**kwargs)
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/datamodel/basemodel.py", line 155, in _get_from
    return cls.get_from_path(src=src, schema=schema, **kwargs)
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/datamodel/basemodel.py", line 216, in get_from_path
    return cls._parse_and_create_from_stream(stream=fsrc,
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/datamodel/treemodel.py", line 2633, in _parse_and_create_from_stream
    tree_lists = reader.read_tree_lists(
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/dataio/ioservice.py", line 357, in read_tree_lists
    product = self._read(stream=stream,
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/dataio/newickreader.py", line 322, in _read
    for tree in self.tree_iter(stream=stream,
RuntimeError: generator raised StopIteration

MAGUS aborted with an exception..
Task manager found a failed task: /wd/MAGUS/example/magus_result.txt
Traceback (most recent call last):
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/dataio/newickreader.py", line 306, in tree_iter
    raise StopIteration
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "../magus.py", line 29, in main
    mainAlignmentTask()
  File "/wd/MAGUS/align/aligner.py", line 30, in mainAlignmentTask
    task.awaitTask()
  File "/wd/MAGUS/tasks/task.py", line 47, in awaitTask
    awaitTasks([self])
  File "/wd/MAGUS/tasks/task.py", line 94, in awaitTasks
    controller.awaitTasks(tasks)
  File "/wd/MAGUS/tasks/controller.py", line 34, in awaitTasks
    observeTaskManager()
  File "/wd/MAGUS/tasks/controller.py", line 53, in observeTaskManager
    runTask(task)
  File "/wd/MAGUS/tasks/manager.py", line 219, in runTask
    task.run()
  File "/wd/MAGUS/tasks/task.py", line 59, in run
    func(**self.taskArgs)
  File "/wd/MAGUS/align/aligner.py", line 45, in runAlignmentTask
    decomposeSequences(context)
  File "/wd/MAGUS/align/decompose/decomposer.py", line 44, in decomposeSequences
    buildDecomposition(context, subsetsDir)
  File "/wd/MAGUS/align/decompose/decomposer.py", line 66, in buildDecomposition
    context.subsetPaths = treeutils.decomposeGuideTree(subsetsDir, context.sequencesPath, guideTreePath,
  File "/wd/MAGUS/helpers/treeutils.py", line 96, in decomposeGuideTree
    guideTree = dendropy.Tree.get(path=guideTreePath, schema="newick", preserve_underscores=True)
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/datamodel/treemodel.py", line 2732, in get
    return cls._get_from(**kwargs)
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/datamodel/basemodel.py", line 155, in _get_from
    return cls.get_from_path(src=src, schema=schema, **kwargs)
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/datamodel/basemodel.py", line 216, in get_from_path
    return cls._parse_and_create_from_stream(stream=fsrc,
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/datamodel/treemodel.py", line 2633, in _parse_and_create_from_stream
    tree_lists = reader.read_tree_lists(
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/dataio/ioservice.py", line 357, in read_tree_lists
    product = self._read(stream=stream,
  File "/miniconda3/envs/magus-env/lib/python3.8/site-packages/dendropy/dataio/newickreader.py", line 322, in _read
    for tree in self.tree_iter(stream=stream,
RuntimeError: generator raised StopIteration

Waiting for 0 tasks to finish..
MAGUS finished in 183.12643766403198 seconds..

I had some trouble with dependencies and python versions, so I am running MAGUS in a conda environment, which is specified with the following environment.yml:

name: magus-env
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - python=3.8.0
  - dendropy=4.2.0
  - clustalo=1.2.4
  - mafft=7.490
  - mcl=14.137
  - hmmer=3.3.2
  - fasttree=2.1.10
  - raxml-ng=1.0.3

I also modified the paths in configuration.py as follows:

    clustalPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/miniconda3/envs/magus-env/bin/clustalo")
    mafftPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/miniconda3/envs/magus-env/bin/mafft")
    mclPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/miniconda3/envs/magus-env/bin/mcl")
    mlrmclPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "tools/mlrmcl/mlrmcl")
    hmmalignPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/miniconda3/envs/magus-env/bin/hmmalign")
    hmmbuildPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/miniconda3/envs/magus-env/bin/hmmbuild")
    hmmsearchPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/miniconda3/envs/magus-env/bin/hmmsearch")
    fasttreePath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/miniconda3/envs/magus-env/bin/fasttree")
    raxmlPath = os.path.join(os.path.dirname(os.path.abspath(__file__)), "/miniconda3/envs/magus-env/bin/raxml-ng")

(I couldn't find a package for mlrmcl, but that doesn't seem to have anything to do with the error, as far as I can tell).
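
A note on why the edited paths above still resolve to the conda binaries: when a later argument to os.path.join is an absolute path, the earlier components are discarded. The sketch below illustrates this with posixpath (which is what os.path is on Linux and macOS); the base directory is a stand-in for os.path.dirname(os.path.abspath(__file__)) taken from the log, not something read from the real configuration.py.

```python
import posixpath  # os.path is posixpath on Linux/macOS; used directly so the result is platform-independent

# Stand-in for os.path.dirname(os.path.abspath(__file__)); assumed from the log paths.
base = "/wd/MAGUS"

# An absolute second argument discards everything before it.
conda_mafft = posixpath.join(base, "/miniconda3/envs/magus-env/bin/mafft")

# A relative second argument is appended to the base, as with the unchanged mlrmcl entry.
bundled_mlrmcl = posixpath.join(base, "tools/mlrmcl/mlrmcl")

print(conda_mafft)
print(bundled_mlrmcl)
```

So the join with the MAGUS directory is effectively a no-op for the conda entries, and only the mlrmcl entry still points inside the MAGUS checkout.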

vlasmirnov commented 2 years ago

Hey, thanks a lot for reaching out. There are two things that come to mind. The first is that the decomposition tree file (decomposition\initial_tree\initial_tree.tre) has somehow been malformed. Since you seem to be using the same version of FastTree, this is probably not the case, but could I take a look at it, just to be sure?
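
For a quick look at the tree file before sharing it, a rough sanity check along these lines might help. This is only a sketch, not a Newick parser: the function name looks_like_newick is hypothetical, and it assumes labels are unquoted (quoted labels containing parentheses would confuse it).

```python
def looks_like_newick(text: str) -> bool:
    """Rough check: ends with ';' and parentheses are balanced.

    Assumes unquoted labels; this is not a full Newick validator.
    """
    stripped = text.strip()
    if not stripped.endswith(";"):
        return False
    depth = 0
    for ch in stripped:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closing parenthesis with no matching opener
                return False
    return depth == 0


# Hypothetical usage with the path from the log:
# with open("outputs/decomposition/initial_tree/initial_tree.tre") as handle:
#     print(looks_like_newick(handle.read()))
```

An empty or truncated file fails this check, which would point to the first possibility rather than to the parser.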

The second possibility is that different versions of dendropy read newick files differently. Looks like you're using 4.2; mine is 4.4. It's possible they changed something between the two.
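
The traceback is consistent with this second possibility. It shows dendropy's newickreader.py executing "raise StopIteration" inside the tree_iter generator, and since Python 3.7 (PEP 479) a StopIteration raised inside a generator body is converted to "RuntimeError: generator raised StopIteration". The sketch below reproduces that behavior in isolation; the two generator functions are hypothetical stand-ins, not dendropy code, and it assumes Python 3.7 or later.

```python
def old_style_iter(items):
    """Mimics a generator that signals exhaustion by raising StopIteration."""
    for item in items:
        yield item
    raise StopIteration  # converted to RuntimeError on Python 3.7+ (PEP 479)


def new_style_iter(items):
    """The PEP 479-compatible form: simply return when done."""
    for item in items:
        yield item
    return


def consume(gen):
    """Return (values, None) on success or (None, error message) on RuntimeError."""
    try:
        return list(gen), None
    except RuntimeError as exc:
        return None, str(exc)


print(consume(old_style_iter([1, 2])))
print(consume(new_style_iter([1, 2])))
```

Presumably a later dendropy release replaced the raise with a plain return, which would explain why the same tree file parses under a newer version.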

joelnitta commented 2 years ago

Thanks for the quick reply! When I upgraded dendropy to the latest version (4.5.2), it worked. So that seems to be the problem, not the decomposition tree file. Let me know if you still want to see the tree file though.
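
In case it is useful to others, here is a minimal sketch of a version guard. The threshold of 4.4.0 is an assumption based only on this thread (4.2.0 failed, the maintainer uses 4.4, and 4.5.2 worked); the exact release where the behavior changed is not established here, and the helper names are hypothetical.

```python
import re

# Assumed minimum, taken from the version the maintainer reports using.
MIN_DENDROPY = (4, 4, 0)


def version_tuple(version: str) -> tuple:
    """Turn '4.5.2' into (4, 5, 2), ignoring suffixes and padding with zeros."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def dendropy_is_recent_enough(installed: str) -> bool:
    """Compare an installed version string against the assumed minimum."""
    return version_tuple(installed) >= MIN_DENDROPY


# Hypothetical usage inside the environment:
# import dendropy
# print(dendropy_is_recent_enough(dendropy.__version__))
```

In the environment.yml above, the equivalent change is to replace the dendropy=4.2.0 pin with the version that worked for me, 4.5.2.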

BTW I think it would help others use your code if you did one or more of the following:

I know that you bundled some of the dependencies, but I had the same problem as #12 when I tried to use e.g., mafft bundled with MAGUS.

Just my 2 cents.

vlasmirnov commented 2 years ago

Awesome, thanks a lot for letting me know. Thanks for the tips as well - I'm definitely setting aside some time this year to containerize MAGUS and make this repo a bit more presentable than it currently is.

joelnitta commented 2 years ago

Sounds good!

Feel free to close this since it solved the problem I posted, or if you want to leave it open as a reminder for the suggestions, that's fine too.