Closed by EricR86 8 years ago
Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).
From hoffman...@gmail.com on April 12, 2011 13:46:16
There is currently no foolproof way to tell that GMTK ran out of memory, which I need for this fix and others like it. I am discussing this with the GMTK people, and it looks like one might exist soon.
Summary: Segway should not requeue jobs that failed for reasons other than out-of-memory
Status: Accepted
Labels: -Priority-Medium Priority-Low
Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).
Resolved in Pull Request #36. Currently, a job is resubmitted once more if its failure is not a memory error.
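For illustration, the policy described above could be sketched roughly as follows. This is not the code from Pull Request #36: `Failure`, `RetryPolicy`, `run_with_policy`, the growth factor, and the memory ceiling are all hypothetical names and values. How a failure gets classified as out-of-memory is left to the caller, since that detection is the hard part discussed in this thread.

```python
from dataclasses import dataclass
from enum import Enum


class Failure(Enum):
    """Hypothetical failure categories; the caller decides which one applies."""
    OUT_OF_MEMORY = "out_of_memory"
    OTHER = "other"


@dataclass
class RetryPolicy:
    max_other_retries: int = 1  # one extra attempt for non-memory failures
    mem_growth: float = 2.0     # assumed growth factor after a memory failure
    max_mem_mb: int = 16384     # assumed ceiling; give up beyond this

    def next_attempt(self, failure, mem_mb, other_retries_used):
        """Return (mem_mb, other_retries_used) for the next try, or None to stop."""
        if failure is Failure.OUT_OF_MEMORY:
            new_mem = int(mem_mb * self.mem_growth)
            if new_mem > self.max_mem_mb:
                return None
            return new_mem, other_retries_used
        # Non-memory failure: retry with unchanged memory, a limited number of times.
        if other_retries_used < self.max_other_retries:
            return mem_mb, other_retries_used + 1
        return None


def run_with_policy(run_job, mem_mb, policy=None):
    """Call run_job(mem_mb) until it succeeds or the policy gives up.

    run_job returns None on success or a Failure member on failure.
    Returns (succeeded, memory request used for each attempt).
    """
    policy = policy or RetryPolicy()
    other_used = 0
    attempts = []
    while True:
        attempts.append(mem_mb)
        failure = run_job(mem_mb)
        if failure is None:
            return True, attempts
        nxt = policy.next_attempt(failure, mem_mb, other_used)
        if nxt is None:
            return False, attempts
        mem_mb, other_used = nxt
```

With these assumed defaults, a job that always fails for a non-memory reason (such as the parse error in the original report) runs twice with the same memory request and then stops, instead of being requeued indefinitely with larger requests.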
Original report (BitBucket issue) by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).
From Google Code Issue #13
Imported Labels: bug, imported, Priority-Low
From jay.hesselberth on August 18, 2010 12:29:09
I migrated some Segway training runs from one file system to another; this changed the root path, so the paths of the .inc files referenced by segway.inc and input.master changed. When I start an identify run, the gmtkTriangulate task raises an exception and exits the segway run cleanly:
/segway-projects/results/20100812/exp1/segway.str:1:70: error: /segway-results/results/20100812/exp1/auxiliary/segway.inc: No such file or directory
Parse Error in file '/segway-projects/results/20100812/exp1/segway.str': expecting variable type at or before line 7, near (TYPE_SEGCOUNTDOWN)
Exiting Program
Traceback (most recent call last):
File "/segway-build/arch/Linux-x86_64/bin/segway", line 8, in
load_entry_point('segway==0.2.0', 'console_scripts', 'segway')()
File "/segway-build/arch/Linux-x86_64/lib/python2.6/segway-0.2.0-py2.6.egg/segway/run.py", line 3694, in main
return runner()
File "/segway-build/arch/Linux-x86_64/lib/python2.6/segway-0.2.0-py2.6.egg/segway/run.py", line 3514, in __call__
self.run(*args, **kwargs)
File "/segway-build/arch/Linux-x86_64/lib/python2.6/segway-0.2.0-py2.6.egg/segway/run.py", line 3487, in run
self.run_triangulate()
File "/segway-build/arch/Linux-x86_64/lib/python2.6/segway-0.2.0-py2.6.egg/segway/run.py", line 3084, in run_triangulate
self.run_triangulate_single(num_segs)
File "/segway-build/arch/Linux-x86_64/lib/python2.6/segway-0.2.0-py2.6.egg/segway/run.py", line 3077, in run_triangulate_single
prog(**kwargs)
File "build/bdist.linux-x86_64/egg/optbuild.py", line 78, in __call__
File "build/bdist.linux-x86_64/egg/optbuild.py", line 209, in run
File "build/bdist.linux-x86_64/egg/optbuild.py", line 191, in _getoutput
File "build/bdist.linux-x86_64/egg/optbuild.py", line 154, in _popen
File "build/bdist.linux-x86_64/egg/optbuild.py", line 54, in _returncode_error_factory
optbuild.ReturncodeError: /segway-build/arch/Linux-x86_64/bin/gmtkTriangulate returned 1
However, the gmtkViterbi task that uses input.master recognizes the error and raises an exception, but then (apparently) assumes it exited because it ran out of memory and launches another job with a larger memory requirement (below; several of these errors appear in a row as it keeps relaunching the same job). Instead, it should exit cleanly as above, because the model cannot be parsed.
root@ip-10-204-151-58 /segway-projects/results/20100818/exp1/20100818.identify.20100818/output/e/identify
Original issue: http://code.google.com/p/segway-genome/issues/detail?id=13