MSO4SC / cloudify-hpc-plugin

Plugin to allow Cloudify to deploy and orchestrate HPC resources
Apache License 2.0

Can't undeploy when deployment fails #48

Closed: emepetres closed this issue 6 years ago

emepetres commented 6 years ago

@gdolle commented on Wed May 23 2018

Steps to reproduce:

1) Create an instance without selecting the HPC, so that the deployment is bound to fail.
2) Deploy: the deployment fails.
3) Undeploy or force undeploy: the 'uninstall' workflow fails as follows.

2018-05-23T14:59:40.709Z Starting 'uninstall' workflow execution
2018-05-23T14:59:41.582Z Stopping node job_toolboxes_y2001r (job_toolboxes)
2018-05-23T14:59:41.932Z Sending task 'hpc_plugin.tasks.revert_job' job_toolboxes_y2001r (job_toolboxes)
2018-05-23T14:59:42.137Z Task started 'hpc_plugin.tasks.revert_job' job_toolboxes_y2001r (job_toolboxes)
2018-05-23T14:59:43.838Z Task failed 'hpc_plugin.tasks.revert_job' -> 'simulate' job_toolboxes_y2001r (job_toolboxes)
KeyError: 'simulate'
    Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 641, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 397, in handle
    result = self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/plugins/default_tenant/tutu-hpc/lib/python2.7/site-packages/hpc_plugin/tasks.py", line 266, in revert_job
    simulate = ctx.instance.runtime_properties['simulate']
KeyError: 'simulate'

2018-05-23T14:59:58.922Z Sending task 'hpc_plugin.tasks.revert_job' [retry 1/5] job_toolboxes_y2001r (job_toolboxes)
2018-05-23T14:59:59.012Z Task started 'hpc_plugin.tasks.revert_job' [retry 1/5] job_toolboxes_y2001r (job_toolboxes)
2018-05-23T15:00:00.720Z Task failed 'hpc_plugin.tasks.revert_job' -> 'simulate' [retry 1/5] job_toolboxes_y2001r (job_toolboxes)
KeyError: 'simulate'
    Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 641, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 397, in handle
    result = self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/plugins/default_tenant/tutu-hpc/lib/python2.7/site-packages/hpc_plugin/tasks.py", line 266, in revert_job
    simulate = ctx.instance.runtime_properties['simulate']
KeyError: 'simulate'

2018-05-23T15:00:15.782Z Sending task 'hpc_plugin.tasks.revert_job' [retry 2/5] job_toolboxes_y2001r (job_toolboxes)
2018-05-23T15:00:15.890Z Task started 'hpc_plugin.tasks.revert_job' [retry 2/5] job_toolboxes_y2001r (job_toolboxes)
2018-05-23T15:00:17.590Z Task failed 'hpc_plugin.tasks.revert_job' -> 'simulate' [retry 2/5] job_toolboxes_y2001r (job_toolboxes)
KeyError: 'simulate'
    Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 641, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 397, in handle
    result = self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/plugins/default_tenant/tutu-hpc/lib/python2.7/site-packages/hpc_plugin/tasks.py", line 266, in revert_job
    simulate = ctx.instance.runtime_properties['simulate']
KeyError: 'simulate'

2018-05-23T15:00:32.673Z Sending task 'hpc_plugin.tasks.revert_job' [retry 3/5] job_toolboxes_y2001r (job_toolboxes)
2018-05-23T15:00:32.765Z Task started 'hpc_plugin.tasks.revert_job' [retry 3/5] job_toolboxes_y2001r (job_toolboxes)
2018-05-23T15:00:34.466Z Task failed 'hpc_plugin.tasks.revert_job' -> 'simulate' [retry 3/5] job_toolboxes_y2001r (job_toolboxes)
KeyError: 'simulate'
    Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 641, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 397, in handle
    result = self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/plugins/default_tenant/tutu-hpc/lib/python2.7/site-packages/hpc_plugin/tasks.py", line 266, in revert_job
    simulate = ctx.instance.runtime_properties['simulate']
KeyError: 'simulate'

@emepetres commented on Fri May 25 2018

Apart from resolving the bug, I'll add an admin feature to clean up old broken instances and apps.
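
For reference, the traceback points at a direct index into `ctx.instance.runtime_properties['simulate']`, which raises `KeyError` when the install failed before that key was ever stored. The snippet below is only a minimal sketch of a defensive lookup, not the fix that was actually applied: `revert_job_sketch` and its `default_simulate` parameter are hypothetical names, and a plain dict stands in for the runtime properties on the assumption that they behave like one.

```python
def revert_job_sketch(runtime_properties, default_simulate=False):
    """Hypothetical stand-in for the lookup done in hpc_plugin.tasks.revert_job."""
    # .get() falls back to the default instead of raising KeyError when the
    # install workflow failed before 'simulate' was written.
    simulate = runtime_properties.get('simulate', default_simulate)
    return simulate


# Runtime properties after a failed install: 'simulate' was never set.
failed_install = {}
# Runtime properties after a successful install.
successful_install = {'simulate': True}

print(revert_job_sketch(failed_install))
print(revert_job_sketch(successful_install))
```

With a fallback like this, the uninstall of a half-deployed instance would no longer abort on the missing key. Whether `False` is the right default, or whether the revert should simply be skipped when nothing was deployed, is a design choice for the plugin.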