Closed dvdylus closed 7 years ago
It's hard to say what is causing the error, which appears to be in pyparsing rather than treeCl. Which versions of treeCl and pyparsing do you have? Are you running sequentially or in parallel? If in parallel, could the same alignment be throwing the error each time, just at a different point in execution because of the parallel dispatch?
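A minimal sketch for collecting the version information asked for above. It assumes both packages were installed with pip under the distribution names `treeCl` and `pyparsing`; the helper names `installed_version` and `version_report` are made up for this example and are not part of either library.

```python
import platform

try:
    # Python 3.8+
    from importlib.metadata import PackageNotFoundError
    from importlib.metadata import version as _dist_version
except ImportError:
    # Older interpreters such as Python 2.7: fall back to setuptools.
    import pkg_resources

    PackageNotFoundError = pkg_resources.DistributionNotFound

    def _dist_version(name):
        return pkg_resources.get_distribution(name).version


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return _dist_version(name)
    except PackageNotFoundError:
        return None


def version_report(names=("treeCl", "pyparsing")):
    """Map each distribution name (assumed names) to its version or None."""
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    print("python %s" % platform.python_version())
    for name, version in sorted(version_report().items()):
        print("%s %s" % (name, version or "not installed"))
```

Run it inside the same environment that runs the analysis (here, the pyenv environment shown in the traceback paths) so the reported versions match what was actually imported.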
Hey,
The gene tree calculation stops randomly after X trees. Every time, it stops at a different number of finished trees and produces the following error message:
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 3359, in parseImpl
    loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1514, in _parseCache
    value = self._parseNoCache(instring, loc, doActions, callPreParse)
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1383, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 3698, in parseImpl
    return self.expr._parse( instring, loc, doActions, callPreParse=False )
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1514, in _parseCache
    value = self._parseNoCache(instring, loc, doActions, callPreParse)
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1383, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 4077, in parseImpl
    expr_parse(instring, tmploc, doActions=False, callPreParse=False)
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1514, in _parseCache
    value = self._parseNoCache(instring, loc, doActions, callPreParse)
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 3430, in parseImpl
    loc2 = e.tryParse( instring, loc )
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1421, in tryParse
    return self._parse( instring, loc, doActions=False )[0]
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1517, in _parseCache
    cache.set(lookup, pe.__class__(*pe.args))
  File "/home/ddylus/.pyenv/versions/env2.7.11/lib/python2.7/site-packages/pyparsing.py", line 1464, in set
    cache.popitem(False)
  File "/home/ddylus/.pyenv/versions/2.7.11/lib/python2.7/collections.py", line 166, in popitem
    value = self.pop(key)
  File "/home/ddylus/.pyenv/versions/2.7.11/lib/python2.7/collections.py", line 145, in pop
    del self[key]
  File "/home/ddylus/.pyenv/versions/2.7.11/lib/python2.7/collections.py", line 74, in __delitem__
    link_prev, link_next, _ = self.__map.pop(key)
KeyError: (Re:('ML estimate base freqs[\d+]:'), 'IMPORTANT WARNING: Alignment column 941 contains only undetermined values which will be treated as missing data\n\n\nIMPORTANT WARNING: Sequences Haliaeetus_albicilla and Haliaeetus_leucocephalus are exactly identical\n\nIMPORTANT WARNING\nFound 1 sequence that is exactly identical to other sequences in the alignment.\nNormally they should be excluded from the analysis.\n\n\nIMPORTANT WARNING\nFound 1 column that contains only undetermined values which will be treated as missing data.\nNormally these columns should be excluded from the analysis.\n\n\nJust in case you might need it, a mixed model file with \nmodel assignments for undetermined columns removed is printed to file /tmp/tmpJNg6cK.reduced\nAn alignment file with undetermined columns and sequence duplicates removed has already\nbeen printed to file /scratch/beegfs/monthly/ddylus/barn_owl/treeCl_analysis_v2/min17/align/OG3536_17.phy.reduced\n\nAlignment has 1 completely undetermined sites that will be automatically removed from the input data\n\n\n\nThis is RAxML version 8.2.0 released by Alexandros Stamatakis on July 9 2015.\n\nWith greatly appreciated code contributions by:\nAndre Aberer (HITS)\nSimon Berger (HITS)\nAlexey Kozlov (HITS)\nKassian Kobert (HITS)\nDavid Dao (KIT and HITS)\nNick Pattengale (Sandia)\nWayne Pfeiffer (SDSC)\nAkifumi S. Tanabe (NRIFS)\n\nAlignment has 642 distinct alignment patterns\n\nProportion of gaps and completely undetermined characters in this alignment: 18.63%\n\nRAxML rapid hill-climbing mode\n\nUsing 1 distinct models/data partitions with joint branch length optimization\n\n\nExecuting 1 inferences on the original alignment using 1 distinct randomized MP trees\n\nAll free model parameters will be estimated by RAxML\nGAMMA model of rate heteorgeneity, ML estimate of alpha-parameter\n\nGAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units\n\nPartition: 0\nAlignment Patterns: 642\nName: OG3536_17\nDataType: AA\nSubstitution Matrix: WAG\nUsing ML estimate of base frequencies\n\n\n\n\nRAxML was called as follows:\n\n/software/Phylogeny/raxml/8.1.2/bin/raxmlHPC-PTHREADS -T 2 -m PROTGAMMAWAGX -n tmpZnSLBL -s /scratch/beegfs/monthly/ddylus/barn_owl/treeCl_analysis_v2/min17/align/OG3536_17.phy -p 8930 -O -w /tmp/tmpdc731B -q /tmp/tmpJNg6cK \n\n\nPartition: 0 with name: OG3536_17\nInitial base frequencies, prior to ML estimate: 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 0.050 \n\nInference[0]: Time 59.201094 GAMMA-based likelihood -11706.701713, best rearrangement setting 5\nalpha[0]: 0.718289 ML estimate base freqs[0]: 0.070645 0.058650 0.043437 0.058193 0.013148 0.034020 0.061266 0.076988 0.018539 0.050721 0.094833 0.063136 0.025173 0.048369 0.040978 0.073877 0.054362 0.009971 0.027564 0.076129 \n\n\nConducting final model optimizations on all 1 trees under GAMMA-based models ....\n\nInference[0] final GAMMA-based Likelihood: -11706.701713 tree written to file /tmp/tmpdc731B/RAxML_result.tmpZnSLBL\n\n\nStarting final GAMMA-based thorough Optimization on tree 0 likelihood -11706.701713 .... \n\nFinal GAMMA-based Score of best tree -11706.701713\n\nProgram execution info written to /tmp/tmpdc731B/RAxML_info.tmpZnSLBL\nBest-scoring ML tree written to: /tmp/tmpdc731B/RAxML_bestTree.tmpZnSLBL\n\nOverall execution time: 79.100595 secs or 0.021972 hours or 0.000916 days\n\n', 2953, True, False)
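Because the run stops after a different number of trees each time, one way to narrow this down is to process the alignments one at a time and record which inputs fail. The sketch below is only an illustration of that idea: `build_tree` is a hypothetical callable standing in for whatever computes a single gene tree (it is not a treeCl function), and `run_sequentially` and `stable_failures` are names invented here.

```python
import traceback


def run_sequentially(paths, build_tree):
    """Call build_tree(path) for each path in order, collecting failures.

    build_tree is a hypothetical stand-in for the per-alignment tree
    computation. Returns a dict mapping each failing path to its
    formatted traceback.
    """
    failures = {}
    for path in paths:
        try:
            build_tree(path)
        except Exception:
            # Keep going so that every failing input is recorded.
            failures[path] = traceback.format_exc()
    return failures


def stable_failures(paths, build_tree, repeats=3):
    """Return the paths that fail in every one of `repeats` sequential passes."""
    failing = None
    for _ in range(repeats):
        failed_now = set(run_sequentially(paths, build_tree))
        failing = failed_now if failing is None else failing & failed_now
    return sorted(failing or [])
```

If the same alignment fails on every sequential pass, the input itself is the likely trigger. If nothing fails sequentially but the parallel run still stops, that would point toward the parallel dispatch instead.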