Closed by melrom 11 years ago
It works on Stampede as well:
03/28/2013 02:42:48 PM - bigjob - INFO - Loading BigJob version: 0.4.128 on login1.stampede.tacc.utexas.edu
03/28/2013 02:42:53 PM - bigjob - INFO - Using SAGA Bliss.
Start Pilot Job/BigJob at: fork://localhost
03/28/2013 02:42:53 PM - bigjob - DEBUG - Utilizing Redis Backend
03/28/2013 02:42:53 PM - bigjob - DEBUG - Parsing URL: redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379
03/28/2013 02:42:55 PM - bigjob - DEBUG - redis:// redis01.tacc.utexas.edu 6379
03/28/2013 02:42:55 PM - bigjob - DEBUG - Connect to Redis: redis01.tacc.utexas.edu Port: 6379
03/28/2013 02:42:55 PM - bigjob - DEBUG - init BigJob w/: redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379
03/28/2013 02:42:55 PM - bigjob - DEBUG - initialized BigJob: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f
03/28/2013 02:42:55 PM - bigjob - DEBUG - create pilot job entry on backend server: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost
03/28/2013 02:42:55 PM - bigjob - DEBUG - update state of pilot job to: Unknown stopped: False
03/28/2013 02:42:55 PM - bigjob - DEBUG - update description of pilot job to: None
03/28/2013 02:42:55 PM - bigjob - DEBUG - set pilot state to: Unknown
03/28/2013 02:42:55 PM - bigjob - DEBUG - setting walltime to: 10
03/28/2013 02:42:55 PM - bigjob - DEBUG - Use SSH backend
03/28/2013 02:42:56 PM - bigjob - DEBUG - ['/home1/01131/tg804093/src/BigJob/pilot/filemanagement/../../../webhdfs-py/', '/home1/01131/tg804093/src/BigJob/examples/../', '/home1/01131/tg804093/src/BigJob/examples', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/distribute-0.6.34-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/pip-1.2.1-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/pexpect-2.4-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/simplejson-2.0.9-py2.7-linux-x86_64.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/boto-2.2.2-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/globusonline_transfer_api_client-0.10.13-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/google_api_python_client-1.0-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/bliss-0.2.7-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/redis-2.2.4-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/virtualenv-1.8.4-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/threadpool-1.2.7-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/uuid-1.30-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/python_gflags-2.0-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/httplib2-0.7.7-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/paramiko_on_pypi-1.7.6-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/pycrypto_on_pypi-2.3-py2.7-linux-x86_64.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/python_hostlist-1.14-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/BigJob-0.4.128-py2.7.egg', '/opt/apps/python/epd/7.3.2/modules/lib/python', '/opt/apps/python/epd/7.3.2/lib', '/home1/01131/tg804093/.bigjob/python/lib/python27.zip', '/home1/01131/tg804093/.bigjob/python/lib/python2.7', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/plat-linux2', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/lib-tk', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/lib-old', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/lib-dynload', '/opt/apps/python/epd/7.3.2/lib/python2.7', '/opt/apps/python/epd/7.3.2/lib/python2.7/plat-linux2', '/opt/apps/python/epd/7.3.2/lib/python2.7/lib-tk', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages', '/opt/apps/python/epd/7.3.2/lib/python2.7/site-packages', '/opt/apps/python/epd/7.3.2/lib/python2.7/site-packages/PIL', '/home1/01131/tg804093/src/BigJob/examples/../bigjob', '/home1/01131/tg804093/src/BigJob/examples/../pilot/impl/../..', '/home1/01131/tg804093/src/BigJob/examples/../pilot/filemanagement/../..']
03/28/2013 02:42:56 PM - bigjob - WARNING - WebHDFS package not found.
03/28/2013 02:42:56 PM - bigjob - DEBUG - Security Context: None
03/28/2013 02:42:56 PM - bigjob - DEBUG - BigJob working directory: ssh://localhost//tmp/bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f
03/28/2013 02:42:56 PM - bigjob - DEBUG - Create directory: //tmp/bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f
03/28/2013 02:42:56 PM - bigjob - DEBUG - ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o NumberOfPasswordPrompts=0 localhost mkdir //tmp/bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f
03/28/2013 02:42:57 PM - bigjob - DEBUG - Run ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o NumberOfPasswordPrompts=0 localhost mkdir //tmp/bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f Output: ["Warning: Permanently added 'localhost' (RSA) to the list of known hosts.\r\r\n"]
03/28/2013 02:42:57 PM - bigjob - WARNING - No file staging adaptor found.
03/28/2013 02:42:57 PM - bigjob - DEBUG - BJ Working Directory: /tmp
03/28/2013 02:42:57 PM - bigjob - DEBUG - Adaptor specific modifications: fork
03/28/2013 02:42:57 PM - bigjob - DEBUG - Escape Bliss
03/28/2013 02:42:57 PM - bigjob - DEBUG - "import sys
import os
import urllib
import sys
import time
start_time = time.time()
home = os.environ.get(\"HOME\")
#print \"Home: \" + home
if home==None: home = os.getcwd()
BIGJOB_AGENT_DIR= os.path.join(home, \".bigjob\")
if not os.path.exists(BIGJOB_AGENT_DIR): os.mkdir (BIGJOB_AGENT_DIR)
BIGJOB_PYTHON_DIR=BIGJOB_AGENT_DIR+\"/python/\"
if not os.path.exists(BIGJOB_PYTHON_DIR): os.mkdir(BIGJOB_PYTHON_DIR)
BOOTSTRAP_URL=\"https://raw.github.com/saga-project/BigJob/master/bootstrap/bigjob-bootstrap.py\"
BOOTSTRAP_FILE=BIGJOB_AGENT_DIR+\"/bigjob-bootstrap.py\"
#ensure that BJ in .bigjob is upfront in sys.path
sys.path.insert(0, os.getcwd() + \"/../\")
p = list()
for i in sys.path:
    if i.find(\".bigjob/python\")>1:
        p.insert(0, i)
for i in p: sys.path.insert(0, i)
print \"Python path: \" + str(sys.path)
print \"Python version: \" + str(sys.version_info)
try: import saga
except: print \"SAGA and SAGA Python Bindings not found.\";
try: import bigjob.bigjob_agent
except:
    print \"BigJob not installed. Attempt to install it.\";
    opener = urllib.FancyURLopener({});
    opener.retrieve(BOOTSTRAP_URL, BOOTSTRAP_FILE);
    print \"Execute: \" + \"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR
    os.system(\"/usr/bin/env\")
    try:
        os.system(\"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR);
        activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\");
        execfile(activate_this, dict(__file__=activate_this))
    except:
        print \"BJ installation failed. Trying system-level python (/usr/bin/python)\";
        os.system(\"/usr/bin/python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR);
        activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\");
        execfile(activate_this, dict(__file__=activate_this))
#try to import BJ once again
import bigjob.bigjob_agent
# execute bj agent
args = list()
args.append(\"bigjob_agent.py\")
args.append(\"redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379\")
args.append(\"bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost\")
args.append(\"\")
print \"Bootstrap time: \" + str(time.time()-start_time)
print \"Starting BigJob Agents with following args: \" + str(args)
bigjob_agent = bigjob.bigjob_agent.bigjob_agent(args)
"
03/28/2013 02:42:57 PM - bigjob - DEBUG - Working directory: /tmp
Job Description: {'Executable' : '/usr/bin/env','WorkingDirectory' : '/tmp','SPMDVariation' : 'single','Queue' : 'normal','WallTimeLimit' : '10','Arguments' : '['python', '-c', '"import sys\nimport os\nimport urllib\nimport sys\nimport time\nstart_time = time.time()\nhome = os.environ.get(\"HOME\")\n#print \"Home: \" + home\nif home==None: home = os.getcwd()\nBIGJOB_AGENT_DIR= os.path.join(home, \".bigjob\")\nif not os.path.exists(BIGJOB_AGENT_DIR): os.mkdir (BIGJOB_AGENT_DIR)\nBIGJOB_PYTHON_DIR=BIGJOB_AGENT_DIR+\"/python/\"\nif not os.path.exists(BIGJOB_PYTHON_DIR): os.mkdir(BIGJOB_PYTHON_DIR)\nBOOTSTRAP_URL=\"https://raw.github.com/saga-project/BigJob/master/bootstrap/bigjob-bootstrap.py\\"\nBOOTSTRAP_FILE=BIGJOB_AGENT_DIR+\\"/bigjob-bootstrap.py\\"\n#ensure that BJ in .bigjob is upfront in sys.path\nsys.path.insert(0, os.getcwd() + \"/../\")\np = list()\nfor i in sys.path:\n if i.find(\".bigjob/python\")>1:\n p.insert(0, i)\nfor i in p: sys.path.insert(0, i)\nprint \"Python path: \" + str(sys.path)\nprint \"Python version: \" + str(sys.version_info)\ntry: import saga\nexcept: print \"SAGA and SAGA Python Bindings not found.\";\ntry: import bigjob.bigjob_agent\nexcept: \n print \"BigJob not installed. Attempt to install it.\"; \n opener = urllib.FancyURLopener({}); \n opener.retrieve(BOOTSTRAP_URL, BOOTSTRAP_FILE); \n print \"Execute: \" + \"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR\n os.system(\"/usr/bin/env\")\n try:\n os.system(\"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); \n activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); \n execfile(activate_this, dict(__file__=activate_this))\n except:\n print \"BJ installation failed. Trying system-level python (/usr/bin/python)\";\n os.system(\"/usr/bin/python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); \n activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); \n execfile(activate_this, dict(__file__=activate_this))\n#try to import BJ once again\nimport bigjob.bigjob_agent\n# execute bj agent\nargs = list()\nargs.append(\"bigjob_agent.py\")\nargs.append(\"redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379\")\nargs.append(\"bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost\")\nargs.append(\"\")\nprint \"Bootstrap time: \" + str(time.time()-start_time)\nprint \"Starting BigJob Agents with following args: \" + str(args)\nbigjob_agent = bigjob.bigjob_agent.bigjob_agent(args)\n"']','TotalCPUCount' : '8',}
03/28/2013 02:42:57 PM - bigjob - DEBUG - Creating pilot job with description: {'Executable' : '/usr/bin/env','WorkingDirectory' : '/tmp','SPMDVariation' : 'single','Queue' : 'normal','WallTimeLimit' : '10','Arguments' : '['python', '-c', '"import sys\nimport os\nimport urllib\nimport sys\nimport time\nstart_time = time.time()\nhome = os.environ.get(\"HOME\")\n#print \"Home: \" + home\nif home==None: home = os.getcwd()\nBIGJOB_AGENT_DIR= os.path.join(home, \".bigjob\")\nif not os.path.exists(BIGJOB_AGENT_DIR): os.mkdir (BIGJOB_AGENT_DIR)\nBIGJOB_PYTHON_DIR=BIGJOB_AGENT_DIR+\"/python/\"\nif not os.path.exists(BIGJOB_PYTHON_DIR): os.mkdir(BIGJOB_PYTHON_DIR)\nBOOTSTRAP_URL=\"https://raw.github.com/saga-project/BigJob/master/bootstrap/bigjob-bootstrap.py\\"\nBOOTSTRAP_FILE=BIGJOB_AGENT_DIR+\\"/bigjob-bootstrap.py\\"\n#ensure that BJ in .bigjob is upfront in sys.path\nsys.path.insert(0, os.getcwd() + \"/../\")\np = list()\nfor i in sys.path:\n if i.find(\".bigjob/python\")>1:\n p.insert(0, i)\nfor i in p: sys.path.insert(0, i)\nprint \"Python path: \" + str(sys.path)\nprint \"Python version: \" + str(sys.version_info)\ntry: import saga\nexcept: print \"SAGA and SAGA Python Bindings not found.\";\ntry: import bigjob.bigjob_agent\nexcept: \n print \"BigJob not installed. Attempt to install it.\"; \n opener = urllib.FancyURLopener({}); \n opener.retrieve(BOOTSTRAP_URL, BOOTSTRAP_FILE); \n print \"Execute: \" + \"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR\n os.system(\"/usr/bin/env\")\n try:\n os.system(\"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); \n activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); \n execfile(activate_this, dict(__file__=activate_this))\n except:\n print \"BJ installation failed. Trying system-level python (/usr/bin/python)\";\n os.system(\"/usr/bin/python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); \n activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); \n execfile(activate_this, dict(__file__=activate_this))\n#try to import BJ once again\nimport bigjob.bigjob_agent\n# execute bj agent\nargs = list()\nargs.append(\"bigjob_agent.py\")\nargs.append(\"redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379\")\nargs.append(\"bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost\")\nargs.append(\"\")\nprint \"Bootstrap time: \" + str(time.time()-start_time)\nprint \"Starting BigJob Agents with following args: \" + str(args)\nbigjob_agent = bigjob.bigjob_agent.bigjob_agent(args)\n"']','Error' : '/tmp/stderr-bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f-agent.txt','Output' : '/tmp/stdout-bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f-agent.txt','TotalCPUCount' : '8',}
03/28/2013 02:42:57 PM - bigjob - DEBUG - Submit pilot job to: fork://localhost
Pilot Job/BigJob URL: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost State: Unknown
03/28/2013 02:42:57 PM - bigjob - DEBUG - add subjob to queue of PJ: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost
03/28/2013 02:42:57 PM - bigjob - DEBUG - create dictionary for job description. Job-URL: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f
03/28/2013 02:42:57 PM - bigjob - DEBUG - SJ Attributes: {'Executable' : '/bin/echo','Environment' : '['HELLOWORLD=hello_world']','Arguments' : '['$HELLOWORLD']','NumberOfProcesses' : '1','Error' : 'stderr.txt','Output' : 'stdout.txt',}
03/28/2013 02:42:57 PM - bigjob - DEBUG - job dict: {'Executable': '/bin/echo', 'NumberOfProcesses': 1, 'Environment': ['HELLOWORLD=hello_world'], 'state': 'Unknown', 'Arguments': ['$HELLOWORLD'], 'Error': 'stderr.txt', 'Output': 'stdout.txt', 'job-id': 'sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f'}
03/28/2013 02:42:57 PM - bigjob - DEBUG - set job state to: Unknown
03/28/2013 02:42:57 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Unknown
03/28/2013 02:42:59 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Unknown
03/28/2013 02:43:01 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Running
03/28/2013 02:43:03 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Running
03/28/2013 02:43:05 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Running
03/28/2013 02:43:07 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Done
03/28/2013 02:43:07 PM - bigjob - DEBUG - Cancel Pilot Job
03/28/2013 02:43:07 PM - bigjob - DEBUG - stop pilot job: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost
03/28/2013 02:43:07 PM - bigjob - DEBUG - update state of pilot job to: Done stopped: True
03/28/2013 02:43:07 PM - bigjob - DEBUG - delete pilot job: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost
03/28/2013 02:43:07 PM - bigjob - DEBUG - Cancel Pilot Job finished
On Wed, Mar 27, 2013 at 2:17 PM, Melissa notifications@github.com wrote:
This is more of a reminder to myself...
BigJob is not working with the new Redis server at TACC, even though the server is ping-able and works from within SAGA. This COULD be an issue with how the SAGA URL class parses the Redis URL, but I still have to diagnose what is really going on.
If anyone who works these tickets would like to help me diagnose, please ping me for the new password.
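One quick way to rule URL parsing in or out is to split the coordination URL by hand and compare the pieces with what BigJob logs. The sketch below uses only the Python 3 standard library, not the SAGA URL class; the function name `parse_coordination_url`, the host `redis.example.org` and the secret `s3cret` are made up for illustration, and the `redis://<password>@host:port` form is assumed from the URLs seen in this thread.

```python
from urllib.parse import urlsplit


def parse_coordination_url(url):
    """Split a redis:// URL into (host, port, password).

    Hypothetical helper for illustration; not part of BigJob or SAGA.
    """
    parts = urlsplit(url)
    if parts.scheme != "redis":
        raise ValueError("expected a redis:// URL, got scheme %r" % parts.scheme)
    # In redis://<secret>@host:port the userinfo contains no ':', so urlsplit
    # reports the secret as `username` and leaves `password` set to None.
    password = parts.password if parts.password is not None else parts.username
    # Fall back to the default Redis port when the URL does not name one.
    return parts.hostname, parts.port or 6379, password


if __name__ == "__main__":
    # Placeholder credentials and host, not the real TACC server.
    print(parse_coordination_url("redis://s3cret@redis.example.org:6379"))
```

If the host, port and password printed here match what the client actually hands to Redis, parsing is probably not the culprit and the network path is the next thing to check.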
OK, now I see the issue: the Redis server is not reachable from the Stampede compute nodes:
(python)tg804093@c402-002.stampede ~$ telnet redis01.tacc.utexas.edu 6379
Trying 129.114.60.146...
@Yaakoub: Can you have a look at this?
Thanks, Andre
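The telnet check above can also be scripted, which is handy for running the same probe on a login node and on a compute node. This is a minimal sketch in Python 3 using only the standard `socket` module; the helper name `can_reach` and the 5-second default timeout are assumptions chosen for illustration.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the name and connects; closing the socket
        # right away is enough for a reachability probe.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Host and port as used in this thread; run on both node types to compare.
    print(can_reach("redis01.tacc.utexas.edu", 6379))
```

A `False` on the compute node together with a `True` on the login node would point at a firewall or routing restriction rather than at BigJob itself.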
On Thu, Mar 28, 2013 at 8:45 PM, Andre Luckow andre.luckow@gmail.com wrote:
It works on Stampede as well:
03/28/2013 02:42:48 PM - bigjob - INFO - Loading BigJob version: 0.4.128 on login1.stampede.tacc.utexas.edu 03/28/2013 02:42:53 PM - bigjob - INFO - Using SAGA Bliss. Start Pilot Job/BigJob at: fork://localhost 03/28/2013 02:42:53 PM - bigjob - DEBUG - Utilizing Redis Backend 03/28/2013 02:42:53 PM - bigjob - DEBUG - Parsing URL: redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379 03/28/2013 02:42:55 PM - bigjob - DEBUG - redis:// redis01.tacc.utexas.edu 6379 03/28/2013 02:42:55 PM - bigjob - DEBUG - Connect to Redis: redis01.tacc.utexas.edu Port: 6379 03/28/2013 02:42:55 PM - bigjob - DEBUG - init BigJob w/: redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379 03/28/2013 02:42:55 PM - bigjob - DEBUG - initialized BigJob: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f 03/28/2013 02:42:55 PM - bigjob - DEBUG - create pilot job entry on backend server: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost 03/28/2013 02:42:55 PM - bigjob - DEBUG - update state of pilot job to: Unknown stopped: False 03/28/2013 02:42:55 PM - bigjob - DEBUG - update description of pilot job to: None 03/28/2013 02:42:55 PM - bigjob - DEBUG - set pilot state to: Unknown 03/28/2013 02:42:55 PM - bigjob - DEBUG - setting walltime to: 10 03/28/2013 02:42:55 PM - bigjob - DEBUG - Use SSH backend 03/28/2013 02:42:56 PM - bigjob - DEBUG - ['/home1/01131/tg804093/src/BigJob/pilot/filemanagement/../../../webhdfs-py/', '/home1/01131/tg804093/src/BigJob/examples/../', '/home1/01131/tg804093/src/BigJob/examples', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/distribute-0.6.34-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/pip-1.2.1-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/pexpect-2.4-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/simplejson-2.0.9-py2.7-linux-x86_64.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/boto-2.2.2-py2.7.egg', 
'/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/globusonline_transfer_api_client-0.10.13-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/google_api_python_client-1.0-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/bliss-0.2.7-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/redis-2.2.4-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/virtualenv-1.8.4-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/threadpool-1.2.7-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/uuid-1.30-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/python_gflags-2.0-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/httplib2-0.7.7-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/paramiko_on_pypi-1.7.6-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/pycrypto_on_pypi-2.3-py2.7-linux-x86_64.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/python_hostlist-1.14-py2.7.egg', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages/BigJob-0.4.128-py2.7.egg', '/opt/apps/python/epd/7.3.2/modules/lib/python', '/opt/apps/python/epd/7.3.2/lib', '/home1/01131/tg804093/.bigjob/python/lib/python27.zip', '/home1/01131/tg804093/.bigjob/python/lib/python2.7', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/plat-linux2', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/lib-tk', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/lib-old', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/lib-dynload', '/opt/apps/python/epd/7.3.2/lib/python2.7', '/opt/apps/python/epd/7.3.2/lib/python2.7/plat-linux2', '/opt/apps/python/epd/7.3.2/lib/python2.7/lib-tk', '/home1/01131/tg804093/.bigjob/python/lib/python2.7/site-packages', 
'/opt/apps/python/epd/7.3.2/lib/python2.7/site-packages', '/opt/apps/python/epd/7.3.2/lib/python2.7/site-packages/PIL', '/home1/01131/tg804093/src/BigJob/examples/../bigjob', '/home1/01131/tg804093/src/BigJob/examples/../pilot/impl/../..', '/home1/01131/tg804093/src/BigJob/examples/../pilot/filemanagement/../..'] 03/28/2013 02:42:56 PM - bigjob - WARNING - WebHDFS package not found. 03/28/2013 02:42:56 PM - bigjob - DEBUG - Security Context: None 03/28/2013 02:42:56 PM - bigjob - DEBUG - BigJob working directory: ssh://localhost//tmp/bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f 03/28/2013 02:42:56 PM - bigjob - DEBUG - Create directory: //tmp/bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f 03/28/2013 02:42:56 PM - bigjob - DEBUG - ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o NumberOfPasswordPrompts=0 localhost mkdir //tmp/bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f 03/28/2013 02:42:57 PM - bigjob - DEBUG - Run ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o NumberOfPasswordPrompts=0 localhost mkdir //tmp/bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f Output: ["Warning: Permanently added 'localhost' (RSA) to the list of known hosts.\r\r\n"] 03/28/2013 02:42:57 PM - bigjob - WARNING - No file staging adaptor found. 03/28/2013 02:42:57 PM - bigjob - DEBUG - BJ Working Directory: /tmp 03/28/2013 02:42:57 PM - bigjob - DEBUG - Adaptor specific modifications: fork 03/28/2013 02:42:57 PM - bigjob - DEBUG - Escape Bliss 03/28/2013 02:42:57 PM - bigjob - DEBUG - "import sys import os import urllib import sys import time start_time = time.time() home = os.environ.get(\"HOME\")
print \"Home: \" + home
if home==None: home = os.getcwd() BIGJOB_AGENT_DIR= os.path.join(home, \".bigjob\") if not os.path.exists(BIGJOB_AGENT_DIR): os.mkdir (BIGJOB_AGENT_DIR) BIGJOB_PYTHON_DIR=BIGJOB_AGENT_DIR+\"/python/\" if not os.path.exists(BIGJOB_PYTHON_DIR): os.mkdir(BIGJOB_PYTHON_DIR) BOOTSTRAP_URL=\"https://raw.github.com/saga-project/BigJob/master/bootstrap/bigjob-bootstrap.py\" BOOTSTRAP_FILE=BIGJOB_AGENT_DIR+\"/bigjob-bootstrap.py\"
ensure that BJ in .bigjob is upfront in sys.path
sys.path.insert(0, os.getcwd() + \"/../\") p = list() for i in sys.path: if i.find(\".bigjob/python\")>1: p.insert(0, i) for i in p: sys.path.insert(0, i) print \"Python path: \" + str(sys.path) print \"Python version: \" + str(sys.version_info) try: import saga except: print \"SAGA and SAGA Python Bindings not found.\"; try: import bigjob.bigjob_agent except: print \"BigJob not installed. Attempt to install it.\"; opener = urllib.FancyURLopener({}); opener.retrieve(BOOTSTRAP_URL, BOOTSTRAP_FILE); print \"Execute: \" + \"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR os.system(\"/usr/bin/env\") try: os.system(\"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); execfile(activate_this, dict(file=activate_this)) except: print \"BJ installation failed. Trying system-level python (/usr/bin/python)\"; os.system(\"/usr/bin/python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); execfile(activate_this, dict(file=activate_this))
try to import BJ once again
import bigjob.bigjob_agent
execute bj agent
args = list() args.append(\"bigjob_agent.py\") args.append(\"redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379\") args.append(\"bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost\") args.append(\"\") print \"Bootstrap time: \" + str(time.time()-start_time) print \"Starting BigJob Agents with following args: \" + str(args) bigjob_agent = bigjob.bigjob_agent.bigjob_agent(args) " 03/28/2013 02:42:57 PM - bigjob - DEBUG - Working directory: /tmp Job Description: {'Executable' : '/usr/bin/env','WorkingDirectory' : '/tmp','SPMDVariation' : 'single','Queue' : 'normal','WallTimeLimit' : '10','Arguments' : '['python', '-c', '"import sys\nimport os\nimport urllib\nimport sys\nimport time\nstart_time = time.time()\nhome = os.environ.get(\"HOME\")\n#print \"Home: \" + home\nif home==None: home = os.getcwd()\nBIGJOB_AGENT_DIR= os.path.join(home, \".bigjob\")\nif not os.path.exists(BIGJOB_AGENT_DIR): os.mkdir (BIGJOB_AGENT_DIR)\nBIGJOB_PYTHON_DIR=BIGJOB_AGENT_DIR+\"/python/\"\nif not os.path.exists(BIGJOB_PYTHON_DIR): os.mkdir(BIGJOB_PYTHON_DIR)\nBOOTSTRAP_URL=\"https://raw.github.com/saga-project/BigJob/master/bootstrap/bigjob-bootstrap.py\\"\nBOOTSTRAP_FILE=BIGJOB_AGENT_DIR+\\"/bigjob-bootstrap.py\\"\n#ensure that BJ in .bigjob is upfront in sys.path\nsys.path.insert(0, os.getcwd() + \"/../\")\np = list()\nfor i in sys.path:\n if i.find(\".bigjob/python\")>1:\n p.insert(0, i)\nfor i in p: sys.path.insert(0, i)\nprint \"Python path: \" + str(sys.path)\nprint \"Python version: \" + str(sys.version_info)\ntry: import saga\nexcept: print \"SAGA and SAGA Python Bindings not found.\";\ntry: import bigjob.bigjob_agent\nexcept: \n print \"BigJob not installed. 
Attempt to install it.\"; \n opener = urllib.FancyURLopener({}); \n opener.retrieve(BOOTSTRAP_URL, BOOTSTRAP_FILE); \n print \"Execute: \" + \"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR\n os.system(\"/usr/bin/env\")\n try:\n os.system(\"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); \n activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); \n execfile(activate_this, dict(file=activate_this))\n except:\n print \"BJ installation failed. Trying system-level python (/usr/bin/python)\";\n os.system(\"/usr/bin/python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); \n activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); \n execfile(activate_this, dict(file=activate_this))\n#try to import BJ once again\nimport bigjob.bigjob_agent\n# execute bj agent\nargs = list()\nargs.append(\"bigjob_agent.py\")\nargs.append(\"redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379\")\nargs.append(\"bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost\")\nargs.append(\"\")\nprint \"Bootstrap time: \" + str(time.time()-start_time)\nprint \"Starting BigJob Agents with following args: \" + str(args)\nbigjob_agent = bigjob.bigjob_agent.bigjob_agent(args)\n"']','TotalCPUCount' : '8',} 03/28/2013 02:42:57 PM - bigjob - DEBUG - Creating pilot job with description: {'Executable' : '/usr/bin/env','WorkingDirectory' : '/tmp','SPMDVariation' : 'single','Queue' : 'normal','WallTimeLimit' : '10','Arguments' : '['python', '-c', '"import sys\nimport os\nimport urllib\nimport sys\nimport time\nstart_time = time.time()\nhome = os.environ.get(\"HOME\")\n#print \"Home: \" + home\nif home==None: home = os.getcwd()\nBIGJOB_AGENT_DIR= os.path.join(home, \".bigjob\")\nif not os.path.exists(BIGJOB_AGENT_DIR): os.mkdir (BIGJOB_AGENT_DIR)\nBIGJOB_PYTHON_DIR=BIGJOB_AGENT_DIR+\"/python/\"\nif not os.path.exists(BIGJOB_PYTHON_DIR): 
os.mkdir(BIGJOB_PYTHON_DIR)\nBOOTSTRAP_URL=\"https://raw.github.com/saga-project/BigJob/master/bootstrap/bigjob-bootstrap.py\\"\nBOOTSTRAP_FILE=BIGJOB_AGENT_DIR+\\"/bigjob-bootstrap.py\\"\n#ensure that BJ in .bigjob is upfront in sys.path\nsys.path.insert(0, os.getcwd() + \"/../\")\np = list()\nfor i in sys.path:\n if i.find(\".bigjob/python\")>1:\n p.insert(0, i)\nfor i in p: sys.path.insert(0, i)\nprint \"Python path: \" + str(sys.path)\nprint \"Python version: \" + str(sys.version_info)\ntry: import saga\nexcept: print \"SAGA and SAGA Python Bindings not found.\";\ntry: import bigjob.bigjob_agent\nexcept: \n print \"BigJob not installed. Attempt to install it.\"; \n opener = urllib.FancyURLopener({}); \n opener.retrieve(BOOTSTRAP_URL, BOOTSTRAP_FILE); \n print \"Execute: \" + \"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR\n os.system(\"/usr/bin/env\")\n try:\n os.system(\"python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); \n activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); \n execfile(activate_this, dict(file=activate_this))\n except:\n print \"BJ installation failed. 
Trying system-level python (/usr/bin/python)\";\n os.system(\"/usr/bin/python \" + BOOTSTRAP_FILE + \" \" + BIGJOB_PYTHON_DIR); \n activate_this = os.path.join(BIGJOB_PYTHON_DIR, \"bin/activate_this.py\"); \n execfile(activate_this, dict(file=activate_this))\n#try to import BJ once again\nimport bigjob.bigjob_agent\n# execute bj agent\nargs = list()\nargs.append(\"bigjob_agent.py\")\nargs.append(\"redis://Oily9tourSorenavyvault@redis01.tacc.utexas.edu:6379\")\nargs.append(\"bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost\")\nargs.append(\"\")\nprint \"Bootstrap time: \" + str(time.time()-start_time)\nprint \"Starting BigJob Agents with following args: \" + str(args)\nbigjob_agent = bigjob.bigjob_agent.bigjob_agent(args)\n"']','Error' : '/tmp/stderr-bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f-agent.txt','Output' : '/tmp/stdout-bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f-agent.txt','TotalCPUCount' : '8',} 03/28/2013 02:42:57 PM - bigjob - DEBUG - Submit pilot job to: fork://localhost Pilot Job/BigJob URL: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost State: Unknown 03/28/2013 02:42:57 PM - bigjob - DEBUG - add subjob to queue of PJ: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost 03/28/2013 02:42:57 PM - bigjob - DEBUG - create dictionary for job description. 
Job-URL: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f 03/28/2013 02:42:57 PM - bigjob - DEBUG - SJ Attributes: {'Executable' : '/bin/echo','Environment' : '['HELLOWORLD=hello_world']','Arguments' : '['$HELLOWORLD']','NumberOfProcesses' : '1','Error' : 'stderr.txt','Output' : 'stdout.txt',} 03/28/2013 02:42:57 PM - bigjob - DEBUG - job dict: {'Executable': '/bin/echo', 'NumberOfProcesses': 1, 'Environment': ['HELLOWORLD=hello_world'], 'state': 'Unknown', 'Arguments': ['$HELLOWORLD'], 'Error': 'stderr.txt', 'Output': 'stdout.txt', 'job-id': 'sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f'} 03/28/2013 02:42:57 PM - bigjob - DEBUG - set job state to: Unknown 03/28/2013 02:42:57 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Unknown 03/28/2013 02:42:59 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Unknown 03/28/2013 02:43:01 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Running 03/28/2013 02:43:03 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Running 03/28/2013 02:43:05 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Running 03/28/2013 02:43:07 PM - bigjob - DEBUG - Get subjob state: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost:jobs:sj-b0e5aeaa-97df-11e2-ab5e-d4ae52a0f02f state: Done 03/28/2013 02:43:07 PM - bigjob - DEBUG - Cancel Pilot Job 03/28/2013 02:43:07 PM - bigjob - DEBUG - stop pilot job: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost 03/28/2013 02:43:07 PM - bigjob - DEBUG - update 
state of pilot job to: Done stopped: True 03/28/2013 02:43:07 PM - bigjob - DEBUG - delete pilot job: bigjob:bj-afbad9ba-97df-11e2-ab5e-d4ae52a0f02f:localhost 03/28/2013 02:43:07 PM - bigjob - DEBUG - Cancel Pilot Job finished
On Wed, Mar 27, 2013 at 2:17 PM, Melissa notifications@github.com wrote:
This is more of a reminder to myself...
BigJob is not working with the new redis server at TACC, but it is ping-able and works within SAGA. This COULD be an issue with the parsing happening in SAGA URL, but I have to diagnose what is really going on.
If anyone who works these tickets would like to help me diagnose, please ping me for the new password.
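For anyone checking the URL-parsing theory: the coordination URL in the log above appears to have the form redis://<password>@host:port, so a standard URL parser reports the password in the username field. The sketch below uses only Python 3's standard library; `split_redis_url` is a hypothetical helper for illustration, not the SAGA or BigJob parser, and the fallback to port 6379 is an assumption based on the port used in this thread.

```python
from urllib.parse import urlparse


def split_redis_url(url):
    """Split a redis:// URL into (host, port, password).

    Hypothetical helper, not part of BigJob or SAGA. It accepts both
    redis://:password@host:port and redis://password@host:port.
    """
    parts = urlparse(url)
    # With no ':' in the userinfo part, urlparse puts the whole token into
    # .username and leaves .password as None, so fall back to .username.
    password = parts.password if parts.password is not None else parts.username
    # Assumption: default to 6379 when the URL carries no port.
    return parts.hostname, parts.port or 6379, password
```

Calling `split_redis_url("redis://secret@redis01.tacc.utexas.edu:6379")` with a placeholder password shows quickly whether host, port, and password come out as expected before suspecting the server itself.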
I have created a ticket for this: https://github.com/saga-project/BigJob/issues/90
This is somewhat related to this ticket: https://github.com/saga-project/BigJob/issues/80
I think better Redis error handling will definitely be one of the "Top 5 Bugs" to address in the next release.
On Mar 29, 2013, at 04:25, Andre Luckow notifications@github.com wrote:
OK, now I see the issue: the Redis server is not reachable from the Stampede compute nodes:
(python)tg804093@c402-002.stampede ~$ telnet redis01.tacc.utexas.edu 6379
Trying 129.114.60.146...
@Yaakoub: Can you have a look at this?
Thanks, Andre
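The telnet check above can also be scripted, which is handy on compute nodes where telnet is not installed. This is a minimal sketch in Python 3 using only the standard library; `can_reach` is a hypothetical helper name, and the 5-second default timeout is an arbitrary assumption.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Running `can_reach("redis01.tacc.utexas.edu", 6379)` on a login node and again inside a job on a compute node should show whether the two differ. It only tests that the port accepts TCP connections, not that Redis authentication succeeds.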
This is working now. Thanks all!
It won't let me close this issue, so please feel free to :)
This is more of a reminder to myself...
BigJob is not working with the new redis server at TACC, but the redis server is ping-able through the redis-cli.
If anyone who works these tickets would like to help me diagnose, please ping me for the new password.