radical-cybertools / radical.saga

A Light-Weight Access Layer for Distributed Computing Infrastructure and Reference Implementation of the SAGA Python Language Bindings.
http://radical-cybertools.github.io/saga-python/

SAGA does not submit correct scripts to Supermic #742

Closed iparask closed 5 years ago

iparask commented 5 years ago

I tried to run on Supermic and got this error:

    jc.run()
  File "/home/iparask/test_rp/lib/python3.5/site-packages/radical/saga/task.py", line 411, in run
    raise se.NoSuccess ("future exception: %s" % (future.exception))
radical.saga.exceptions.NoSuccess: future exception: Error running job via 'qsub': 500774.smic3
** Job deleted, use of :ppn=20 or equivalent required on smic nodes resource to reserve entire node **
. Commandline was:
        SCRIPTFILE=`mktemp -t rs.jobscript.XXXXXX` \
            &&  echo "
#!/bin/bash
#PBS -N pilot.0000
#PBS -V
#PBS -v RADICAL_PROFILE=TRUE
#PBS -o /work/iparaske/radical.pilot.sandbox/rp.session.js-169-248.jetstream-cloud.org.iparask.018184.0003/pilot.0000/bootstrap_0.out
#PBS -e /work/iparaske/radical.pilot.sandbox/rp.session.js-169-248.jetstream-cloud.org.iparask.018184.0003/pilot.0000/bootstrap_0.err
#PBS -l walltime=0.5:30:00
#PBS -q workq
#PBS -A TG-MCB090174
#PBS -l nodes=4
export    PBS_O_WORKDIR=/work/iparaske/radical.pilot.sandbox/rp.session.js-169-248.jetstream-cloud.org.iparask.018184.0003/pilot.0000
mkdir -p  /work/iparaske/radical.pilot.sandbox/rp.session.js-169-248.jetstream-cloud.org.iparask.018184.0003/pilot.0000

There are two issues. The first is that the node request does not use :ppn=20 (the script only has #PBS -l nodes=4). The other is the wall time: #PBS -l walltime=0.5:30:00 should have been #PBS -l walltime=0:30:00

The second is a side effect of the Python 2 to Python 3 conversion. In Python 3 the / operator always does floating-point division, while in Python 2 it does integer division when both operands are integers. I think we need to check all divisions to make sure there are no other errors of this kind.
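
To illustrate, here is a minimal sketch of how a value like 0.5:30:00 can come out of a / division under Python 3, and how integer division avoids it. This is not the radical.saga code: the function names (format_walltime_buggy, format_walltime, format_nodes) and the assumption that the wall time is passed in whole minutes are hypothetical and only serve the example.

```python
# Hypothetical helpers, NOT the actual radical.saga implementation.
# Assumption: the wall time is given in whole minutes.

def format_walltime_buggy(minutes):
    # Under Python 3, '/' is true division, so 30 / 60 gives 0.5.
    return "%s:%02d:00" % (minutes / 60, minutes % 60)


def format_walltime(minutes):
    # divmod uses integer division, which gives the same result
    # under Python 2 and Python 3 for integer inputs.
    hours, mins = divmod(int(minutes), 60)
    return "%d:%02d:00" % (hours, mins)


def format_nodes(nodes, ppn):
    # Builds the value for '#PBS -l', e.g. nodes=4:ppn=20.
    return "nodes=%d:ppn=%d" % (nodes, ppn)


print(format_walltime_buggy(30))  # 0.5:30:00
print(format_walltime(30))        # 0:30:00
print(format_nodes(4, 20))        # nodes=4:ppn=20
```

When checking the other divisions, replacing / with // (or divmod) is only right where an integer result is intended. Divisions that are meant to produce fractions should stay as they are.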

iparask commented 5 years ago

Hello @andre-merzky, any ideas how to fix this?

iparask commented 5 years ago

Partly fixed. Supermic will no longer be supported.