datawire / forge

Define and run multi-container apps in Kubernetes
http://forge.sh
Apache License 2.0

Transient failure on delete #210

Closed: rhs closed this issue 6 years ago

rhs commented 6 years ago
Traceback (most recent call last):
  File "/usr/local/bin/forge/.bootstrap/_pex/pex.py", line 367, in execute
  File "/usr/local/bin/forge/.bootstrap/_pex/pex.py", line 293, in _wrap_coverage
  File "/usr/local/bin/forge/.bootstrap/_pex/pex.py", line 325, in _wrap_profiling
  File "/usr/local/bin/forge/.bootstrap/_pex/pex.py", line 410, in _execute
  File "/usr/local/bin/forge/.bootstrap/_pex/pex.py", line 468, in execute_entry
  File "/usr/local/bin/forge/.bootstrap/_pex/pex.py", line 486, in execute_pkg_resources
  File "/Users/marcostrijker/.pex/install/Forge-0.4.14-py2-none-any.whl.63858fdb8d27988ceecb45e084ec96af43afe95a/Forge-0.4.14-py2-none-any.whl/forge/cli.py", line 370, in call_main
    exit(forge())
  File "/Users/marcostrijker/.pex/install/click-6.7-py2.py3-none-any.whl.6d9ff910081ac14222b6215822bc2664662de745/click-6.7-py2.py3-none-any.whl/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/marcostrijker/.pex/install/click-6.7-py2.py3-none-any.whl.6d9ff910081ac14222b6215822bc2664662de745/click-6.7-py2.py3-none-any.whl/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/marcostrijker/.pex/install/click-6.7-py2.py3-none-any.whl.6d9ff910081ac14222b6215822bc2664662de745/click-6.7-py2.py3-none-any.whl/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/marcostrijker/.pex/install/click-6.7-py2.py3-none-any.whl.6d9ff910081ac14222b6215822bc2664662de745/click-6.7-py2.py3-none-any.whl/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/marcostrijker/.pex/install/click-6.7-py2.py3-none-any.whl.6d9ff910081ac14222b6215822bc2664662de745/click-6.7-py2.py3-none-any.whl/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/marcostrijker/.pex/install/click-6.7-py2.py3-none-any.whl.6d9ff910081ac14222b6215822bc2664662de745/click-6.7-py2.py3-none-any.whl/click/decorators.py", line 27, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/Users/marcostrijker/.pex/install/Forge-0.4.14-py2-none-any.whl.63858fdb8d27988ceecb45e084ec96af43afe95a/Forge-0.4.14-py2-none-any.whl/forge/tasks.py", line 252, in __call__
    return result.get()
  File "/Users/marcostrijker/.pex/install/Forge-0.4.14-py2-none-any.whl.63858fdb8d27988ceecb45e084ec96af43afe95a/Forge-0.4.14-py2-none-any.whl/forge/executor.py", line 409, in do_run
    result.value = fun(*args, **kwargs)
  File "/Users/marcostrijker/.pex/install/Forge-0.4.14-py2-none-any.whl.63858fdb8d27988ceecb45e084ec96af43afe95a/Forge-0.4.14-py2-none-any.whl/forge/cli.py", line 347, in delete
    repos = kube.list()
  File "/Users/marcostrijker/.pex/install/Forge-0.4.14-py2-none-any.whl.63858fdb8d27988ceecb45e084ec96af43afe95a/Forge-0.4.14-py2-none-any.whl/forge/tasks.py", line 252, in __call__
    return result.get()
  File "/Users/marcostrijker/.pex/install/Forge-0.4.14-py2-none-any.whl.63858fdb8d27988ceecb45e084ec96af43afe95a/Forge-0.4.14-py2-none-any.whl/forge/executor.py", line 409, in do_run
    result.value = fun(*args, **kwargs)
  File "/Users/marcostrijker/.pex/install/Forge-0.4.14-py2-none-any.whl.63858fdb8d27988ceecb45e084ec96af43afe95a/Forge-0.4.14-py2-none-any.whl/forge/kubernetes.py", line 212, in list
    endpoints[(namespace, name)] = i["subsets"]
  File "/Users/marcostrijker/.pex/install/Forge-0.4.14-py2-none-any.whl.63858fdb8d27988ceecb45e084ec96af43afe95a/Forge-0.4.14-py2-none-any.whl/forge/yamlutil.py", line 144, in __getitem__
    raise KeyError(key)
KeyError: 'subsets'

This happens when running forge delete tps. Aaaaaand I don't get it :joy:

rhs [9:17 AM] What version?

Octopixell [9:19 AM] forge 0.4.14. It seemed to happen when a pod was pending because there weren't enough resources available (CPU and RAM). I wanted to revert the deploy by deleting it, but that was no longer possible because of this error.

rhs [9:21 AM] Ah, so it was a transient error?

Octopixell [9:21 AM] Not sure what you mean. After I ran forge deploy to deploy the project, I noticed the pod was stuck in Pending on the cluster because there weren't enough resources available to run it. I figured deleting the deploy would be best for now, so I ran forge delete projectname and received the above error instead. After I scaled back some other apps to free up resources, the delete call worked normally again. So yeah, it is indeed a transient error that occurs as I just described :slightly_smiling_face:

rhs [9:26 AM] Ok, I think I have an idea where the problem is. I'm on the train, but will see if I can turn around a fix soon.

Octopixell [9:28 AM] Sure thing thanks for looking into it @rhs :slightly_smiling_face:

rhs commented 6 years ago

This should be fixed in 0.4.15
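
For readers hitting the same traceback on an older version: the KeyError is raised at `endpoints[(namespace, name)] = i["subsets"]` in forge/kubernetes.py, and Kubernetes can return an Endpoints object with no `subsets` field when none of a service's pods are ready (for example while they are all Pending). The snippet below is only a minimal sketch of a tolerant lookup, not the actual change shipped in 0.4.15. It assumes plain dicts rather than forge's yamlutil wrapper, and `collect_endpoints` and the sample `items` are hypothetical names made up for illustration.

```python
def collect_endpoints(items):
    """Map (namespace, name) -> subsets for a list of Endpoints objects.

    Hypothetical helper: tolerates objects that have no "subsets" key
    instead of raising KeyError.
    """
    endpoints = {}
    for item in items:
        meta = item["metadata"]
        key = (meta.get("namespace", "default"), meta["name"])
        # Endpoints with no ready pods may omit "subsets" entirely,
        # so fall back to an empty list rather than indexing directly.
        endpoints[key] = item.get("subsets", [])
    return endpoints


# Illustrative data only: "tps" has no ready pods, "web" has one address.
items = [
    {"kind": "Endpoints",
     "metadata": {"namespace": "default", "name": "tps"}},
    {"kind": "Endpoints",
     "metadata": {"namespace": "default", "name": "web"},
     "subsets": [{"addresses": [{"ip": "10.0.0.5"}]}]},
]

result = collect_endpoints(items)
print(result[("default", "tps")])
```

With this approach a service whose pods are all pending simply maps to an empty list, so a later delete can still enumerate it.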