saga-project / BigJob

SAGA-based Pilot-Job Implementation for Compute and Data
http://saga-project.github.com/BigJob/

BigJob Doesn't Quit when it fails on agent directory #47

Closed: melrom closed this issue 11 years ago

melrom commented 11 years ago

10/25/2012 10:15:14 AM - bigjob - DEBUG - BigJob working directory: ssh://melrom@lonestar.tacc.utexas.edu/N/u/melrom/agent/bj-c6478f36-1eb6-11e2-8eb7-a4badb0c3696
10/25/2012 10:15:15 AM - bigjob - DEBUG - Directory not found: /N/u/melrom/agent/bj-c6478f36-1eb6-11e2-8eb7-a4badb0c3696
10/25/2012 10:15:15 AM - bigjob - DEBUG - Create directory at: lonestar.tacc.utexas.edu
10/25/2012 10:15:16 AM - bigjob - ERROR - Error creating directory: /N/u/melrom/agent/bj-c6478f36-1eb6-11e2-8eb7-a4badb0c3696 at: lonestar.tacc.utexas.edu

If the agent directory cannot be created, the script is still submitted to the queue, where it ends up in the error state Eqw. This causes the script to run in an infinite loop. BigJob should instead quit as soon as it fails to access the agent directory and tell the user that they have specified an invalid agent directory.

oleweidner commented 11 years ago

This has been implemented in saga-python-integration. However, it currently does a 'hard' sys.exit; it should really raise an exception instead.
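
For illustration, here is a minimal sketch of the "raise instead of sys.exit" idea. It is not the saga-python-integration code: the names AgentDirectoryError and ensure_agent_directory are hypothetical, and a local os.makedirs call stands in for the remote directory creation over SSH.

```python
import os


class AgentDirectoryError(Exception):
    """Hypothetical exception: the agent directory could not be created."""


def ensure_agent_directory(path):
    """Create the agent directory, raising instead of calling sys.exit().

    Assumption: a local directory stands in for the remote one that
    BigJob would create over SSH.
    """
    try:
        os.makedirs(path, exist_ok=True)
    except OSError as exc:
        # Fail fast with a catchable error so nothing is submitted
        # to the queue when the directory is unusable.
        raise AgentDirectoryError(
            "Invalid agent directory %r: %s" % (path, exc)
        ) from exc
    return path
```

A caller can then catch AgentDirectoryError, report the invalid directory to the user, and decide whether to abort, rather than having the whole interpreter terminated from inside the library.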