ArtPoon / kamphir

Phylogenetic inference using a tree-shape kernel in an Approximate Bayesian Computation framework
BSD 3-Clause "New" or "Revised" License

Kamphir won't run on cluster #39

Open rmcclosk opened 9 years ago

rmcclosk commented 9 years ago

Unable to dispatch a Kamphir run with bpsh. The command was:

bpsh 8 python kamphir.py DiffRisk \
        settings.DiffRisk.json \
        bc-subtypeB.timetree.nwk \
        results/DiffRisk.log \
        -ncores 8 \
        -nthreads 8 \
        -nreps 20 \
        -kdecay 0.3 \
        -tol0 0.005 \
        -mintol 0.0025 \
        -treenum 0 \
        -seed 0 \
        -tau 2.0

Output, including traceback:

Starting kamphir
./run_example.sh: line 15: 64745 Killed                  python kamphir.py DiffRisk settings.DiffRisk.json bc-subtypeB.timetree.nwk results/DiffRisk.log -ncores 8 -nthreads 8 -nreps 20 -kdecay 0.3 -tol0 0.005 -mintol 0.0025 -treenum 0 -seed 0 -tau 2.0
Process PoolWorker-2:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 99, in worker
    put((job, i, result))
  File "/usr/local/lib/python2.7/multiprocessing/queues.py", line 390, in put
    return send(obj)
IOError: [Errno 32] Broken pipe

The "Broken pipe" error was repeated 6 times for different PoolWorkers. The problem does not occur on the head node.

Possibly related: disk access from the compute nodes is extremely slow. Running a Python script containing only "import multiprocessing" takes 12-20 seconds on a node, vs. less than 1 second on the head node.
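
For anyone who wants to repeat the import timing, here is a minimal sketch of how it could be measured. It is not the script used for the numbers above; the helper name time_import is hypothetical. It launches a fresh interpreter so that the import is not served from modules already loaded in the current process, and it uses only the standard library so it should run under Python 2.7 as well as Python 3.

```python
import subprocess
import sys
import time


def time_import(module_name):
    """Return wall-clock seconds for a fresh interpreter to import module_name.

    time_import is a hypothetical helper for illustration only.
    The measurement includes interpreter start-up, which is also
    affected by slow disk access.
    """
    start = time.time()
    # Run the import in a child interpreter; raises CalledProcessError on failure.
    subprocess.check_call([sys.executable, "-c", "import " + module_name])
    return time.time() - start


if __name__ == "__main__":
    elapsed = time_import("multiprocessing")
    print("import multiprocessing took %.2f s" % elapsed)
```

Saved under a file name of your choice, the script can be run once directly on the head node and once through bpsh on a compute node to compare the two timings.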