I'm running a series of experiments with SCOOP on a Slurm cluster.
Tonight some of my tasks seem to have run out of memory:
Traceback (most recent call last):
File "/software/python/2.7.12/lib/python2.7/logging/__init__.py", line 872, in emit
Bad address (bundled/zeromq/src/tcp.cpp:244)
stream.write(ufs % msg)
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/codecs.py", line 706, in write
return self.writer.write(data)
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/codecs.py", line 370, in write
self.stream.write(data)
IOError: [Errno 12] Cannot allocate memory
...
Traceback (most recent call last):
File "/software/python/2.7.12/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/software/python/2.7.12/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
b.main()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
futures_startup()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
run_name="__main__"
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_control.py", line 231, in runController
future = execQueue.pop()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_types.py", line 320, in pop
self.updateQueue()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_types.py", line 343, in updateQueue
for future in self.socket.recvFuture():
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 279, in recvFuture
received = self._recv()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 188, in _recv
thisFuture = pickle.loads(msg[1])
IndexError: list index out of range
The main issue is that SCOOP does not seem to terminate completely after these errors. Instead, it keeps running in a locked-up state (0 load) for hours.
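A note on the second traceback: the `IndexError` at `pickle.loads(msg[1])` means the received multipart message had fewer than two parts. Below is a minimal sketch of a guard that checks the part count before unpickling. The helper name `decode_future_frames` and the `b"TASK"` header are hypothetical and used only for illustration; this is not SCOOP's actual code, and it does not by itself explain or fix the hang.

```python
import pickle


def decode_future_frames(frames):
    """Hypothetical helper (not part of SCOOP): unpickle the payload frame
    of a multipart message after checking that the frame exists."""
    # The traceback fails on msg[1], so a payload is expected at index 1.
    if len(frames) < 2:
        raise ValueError(
            "expected at least 2 frames, got {0}".format(len(frames))
        )
    return pickle.loads(frames[1])


# A well-formed two-frame message decodes normally.
good = [b"TASK", pickle.dumps({"id": 1})]
decoded = decode_future_frames(good)

# A truncated message yields a descriptive error instead of a bare IndexError.
try:
    decode_future_frames([b"TASK"])
    truncated_error = None
except ValueError as exc:
    truncated_error = str(exc)
```

With a check like this, a short message would at least show up as a clear error that the caller could catch to shut down cleanly, rather than as an unhandled exception in the receive loop.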