phockett / epsman

Management tools for ePolyScat job generation & execution.

Running and managing ePS remotely #17

Open phockett opened 3 years ago

phockett commented 3 years ago

The current implementation tries to run the shell script via Fabric, but this sometimes has issues with detached/background processes (see, e.g., "start a background process with nohup using fabric" and the ssh TTY behaviour notes).

Todo:

Also/to debug:

phockett commented 3 years ago

Commit c7e91e4265619151c5f68f53eb966dadb7c54ec4 fixes the ePS runner scripts and the .runJobs() class method.

This implements the runner as:

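# Job file argument plus output redirection; the trailing "&" backgrounds the runner script.
cmd = f" {self.hostDefn[self.host]['genFile'].as_posix()} &> /dev/null &"
# pty=False: no pseudo-terminal is allocated; warn=True: a non-zero exit does not raise.
result = self.c.run(Path(self.hostDefn[self.host]['scpdir'], 'ePS_batch_nohup.sh').as_posix() + cmd,
                    warn=True, timeout=10, pty=False)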

For testing, set self.hostDefn[self.host]['genFile'] = Path(em.__path__[0]).parent/'notes'/'job-test.conf'.
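As a rough sketch only (not the code in the commit), the command assembly above could be split out into a small helper so the string can be checked without a live connection. The function names `build_detached_cmd` and `run_detached`, the default runner name argument, and the example paths are hypothetical; `conn` is assumed to be a `fabric.Connection` (anything with a compatible `.run()` method works).

```python
import shlex
from pathlib import PurePosixPath


def build_detached_cmd(script_dir, job_file, runner="ePS_batch_nohup.sh"):
    """Build the shell command: runner script + job file, output discarded, backgrounded.

    Hypothetical helper; paths are treated as POSIX paths on the remote host.
    """
    script = PurePosixPath(script_dir) / runner
    job = PurePosixPath(job_file)
    # shlex.quote leaves plain paths unchanged and protects any with spaces.
    return f"{shlex.quote(script.as_posix())} {shlex.quote(job.as_posix())} &> /dev/null &"


def run_detached(conn, script_dir, job_file):
    """Run the command via conn.run() with the same options as in the issue.

    conn is assumed to be a fabric.Connection; it is passed in, not created here.
    """
    cmd = build_detached_cmd(script_dir, job_file)
    return conn.run(cmd, warn=True, timeout=10, pty=False)


if __name__ == "__main__":
    # Example paths are placeholders, not the real host configuration.
    print(build_detached_cmd("/home/user/scripts", "/home/user/jobs/job-test.conf"))
```

Note that `&>` is a Bash redirection; if the remote login shell is a plain POSIX `sh`, `> /dev/null 2>&1 &` is the portable spelling and may be worth testing when debugging detached runs.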

Notes:

To do:

UPDATE March 2022: see 5bdeecc25fff6b42dc1f5a56db848994ac58eb5a

phockett commented 1 year ago

Testing manual Docker runs on Fock cluster via Slurm.

e.g. srun -w fock-25 -l docker run -d --rm -v /software/docker/ePS_dockerBuildTests_2022:/data --env NCPUS=20 --name eps25-080523 fock:5000/eps:v040523 /data/runTests-fock-23.sh /data/tests-Fock-25-Slurm_test_080523
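For scripting runs like the one above, the command line could be assembled from its parts, as in the minimal sketch below. The helper name `build_srun_docker_cmd` and its parameter names are hypothetical; the flag values are copied from the example command, and nothing here is part of epsman itself.

```python
import shlex


def build_srun_docker_cmd(node, volume, ncpus, name, image, script, test_dir):
    """Return the srun + docker run command as an argument list.

    Hypothetical helper mirroring the manual command in this comment.
    """
    return [
        "srun", "-w", node, "-l",          # Slurm: target node, label output lines
        "docker", "run", "-d", "--rm",     # detached container, removed on exit
        "-v", volume,                      # host_dir:container_dir bind mount
        "--env", f"NCPUS={ncpus}",
        "--name", name,
        image, script, test_dir,           # image, then the script and its argument
    ]


if __name__ == "__main__":
    args = build_srun_docker_cmd(
        node="fock-25",
        volume="/software/docker/ePS_dockerBuildTests_2022:/data",
        ncpus=20,
        name="eps25-080523",
        image="fock:5000/eps:v040523",
        script="/data/runTests-fock-23.sh",
        test_dir="/data/tests-Fock-25-Slurm_test_080523",
    )
    # shlex.join (Python 3.8+) gives a copy-pasteable shell string.
    print(shlex.join(args))
```

The list form can be passed directly to `subprocess.run(args)` on a login node, while the joined string suits a remote `run()` call.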

TODO: