iqbal-lab-org / make_prg

Code to create a PRG from a Multiple Sequence Alignment file

AssertionError: Each sequence should be in a cluster #56

Closed by Danderson123 1 year ago

Danderson123 commented 1 year ago

Hey @leoisl! I am running make_prg v0.4.0 on gene alignments for ~1400 E. coli and consistently hit the bug below when make_prg processes the attached alignment.

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.9.5-jtayjftvmku5dcg53v74ilyhipv6kvxi/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.9.5-jtayjftvmku5dcg53v74ilyhipv6kvxi/lib/python3.9/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/subcommands/from_msa.py", line 114, in process_MSA
    builder = prg_builder.PrgBuilder(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/prg_builder.py", line 42, in __init__
    self.root: RecursiveTreeNode = NodeFactory.build(alignment, self, None)
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 445, in build
    return MultiIntervalNode(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 189, in __init__
    super().__init__(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 53, in __init__
    self._children: List["RecursiveTreeNode"] = self._get_children(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 130, in _get_children
    return [
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 131, in <listcomp>
    NodeFactory.build(alignment, self.prg_builder, self)
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 463, in build
    return MultiClusterNode(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 217, in __init__
    super().__init__(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 53, in __init__
    self._children: List["RecursiveTreeNode"] = self._get_children(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 130, in _get_children
    return [
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 131, in <listcomp>
    NodeFactory.build(alignment, self.prg_builder, self)
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 463, in build
    return MultiClusterNode(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 217, in __init__
    super().__init__(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 53, in __init__
    self._children: List["RecursiveTreeNode"] = self._get_children(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 130, in _get_children
    return [
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 131, in <listcomp>
    NodeFactory.build(alignment, self.prg_builder, self)
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 463, in build
    return MultiClusterNode(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 217, in __init__
    super().__init__(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 53, in __init__
    self._children: List["RecursiveTreeNode"] = self._get_children(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 130, in _get_children
    return [
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 131, in <listcomp>
    NodeFactory.build(alignment, self.prg_builder, self)
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 463, in build
    return MultiClusterNode(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 217, in __init__
    super().__init__(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 53, in __init__
    self._children: List["RecursiveTreeNode"] = self._get_children(
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 130, in _get_children
    return [
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 131, in <listcomp>
    NodeFactory.build(alignment, self.prg_builder, self)
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/recursion_tree.py", line 453, in build
    clustering_result = kmeans_cluster_seqs(alignment, min_match_length)
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/from_msa/cluster_sequences.py", line 291, in kmeans_cluster_seqs
    assert len(alignment) == sum(
AssertionError: Each input sequence should be in a cluster
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/bin/make_prg", line 8, in <module>
    sys.exit(main())
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/__main__.py", line 94, in main
    args.func(args)
  File "/hps/nobackup/iqbal/dander/amira_panRG/venv/lib/python3.9/site-packages/make_prg/subcommands/from_msa.py", line 183, in run
    pool.starmap(process_MSA, args, chunksize=1)
  File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.9.5-jtayjftvmku5dcg53v74ilyhipv6kvxi/lib/python3.9/multiprocessing/pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.9.5-jtayjftvmku5dcg53v74ilyhipv6kvxi/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
AssertionError: Each input sequence should be in a cluster
Traceback (most recent call last):
  File "/hps/nobackup/iqbal/dander/amira_panRG/panaroo_qc_panRGs/make_panRG_from_panaroo_qced.py", line 99, in <module>
    subprocess.run("make_prg from_msa -t 64 -i " + alignment_path + " -o horesh.card.panRG.qc.cov.mode.0.card.included", shell=True, check=True)
  File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.9.5-jtayjftvmku5dcg53v74ilyhipv6kvxi/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'make_prg from_msa -t 64 -i mmseq2_out_cov_mode_0_card_supplemented_aligned -o horesh.card.panRG.qc.cov.mode.0.card.included' returned non-zero exit status 1.

evgS.aln.fas.gz
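
For context, the failing assertion appears to require that the cluster sizes add up to the number of input sequences. The snippet below is a minimal sketch of that kind of invariant check, not make_prg's actual code: `find_unclustered` and `check_all_clustered` are hypothetical helpers, and the inputs are assumed to be plain lists of sequence IDs. Unlike a bare count comparison, it also reports which IDs are missing.

```python
from typing import List, Sequence


def find_unclustered(
    seq_ids: Sequence[str], clusters: Sequence[Sequence[str]]
) -> List[str]:
    """Return the IDs that appear in no cluster, in input order."""
    clustered = {seq_id for cluster in clusters for seq_id in cluster}
    return [seq_id for seq_id in seq_ids if seq_id not in clustered]


def check_all_clustered(
    seq_ids: Sequence[str], clusters: Sequence[Sequence[str]]
) -> None:
    """Raise AssertionError unless every ID is in exactly one cluster.

    Hypothetical helper: it mirrors the idea behind the failing assertion
    (number of sequences == sum of cluster sizes) and names missing IDs.
    """
    total = sum(len(cluster) for cluster in clusters)
    missing = find_unclustered(seq_ids, clusters)
    assert len(seq_ids) == total and not missing, (
        f"Each input sequence should be in a cluster; missing: {missing}"
    )
```

Calling `check_all_clustered(["s1", "s2"], [["s1"]])` raises an `AssertionError` that names `s2`, which would make it easier to see which sequence in the alignment was dropped.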

Danderson123 commented 1 year ago

I could not reproduce this when I ran it again, so I will close this issue and open a new one if I encounter it again.
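
Since the failure did not recur, one way to check whether it is intermittent is to re-run make_prg on the single alignment several times and count the non-zero exits. This is only a sketch: the `from_msa`, `-t`, `-i` and `-o` arguments are the ones from the command in the traceback, while `build_command`, `count_failures`, the paths, and the assumption that `-i` accepts a single alignment file are mine.

```python
import subprocess
from typing import Callable, List, Sequence


def build_command(
    alignment_path: str, output_prefix: str, threads: int = 1
) -> List[str]:
    """Build the make_prg call as an argument list (no shell=True needed)."""
    return [
        "make_prg", "from_msa",
        "-t", str(threads),
        "-i", alignment_path,   # assumption: a single file is accepted here
        "-o", output_prefix,
    ]


def count_failures(
    command: Sequence[str],
    attempts: int,
    runner: Callable = subprocess.run,
) -> int:
    """Run the command `attempts` times and count non-zero exit codes."""
    failures = 0
    for _ in range(attempts):
        completed = runner(command, capture_output=True, text=True)
        if completed.returncode != 0:
            failures += 1
    return failures


# Example usage (hypothetical paths, not executed here):
# cmd = build_command("path/to/alignment", "rerun_test", threads=1)
# print(count_failures(cmd, attempts=10))
```

Running with a single thread keeps the traceback out of the multiprocessing pool, which may make any failure easier to read.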