Open tmrts opened 8 years ago
So for rktnetes, I am not sure we really want to redesign the gc, as the kubelet will enforce its own gc policy, and all it needs is the RemoveContainer interface.
I think we can keep today's gc for cleaning up at the pod level?
As previously discussed in #2932, gc by entrypoints seems to be enough for our purposes.
After our discussions, AFAICT, we don't need any changes to garbage collection semantics in rkt.
@tmrts @yifan-gu Just trying to imagine how the workflow will be (considering that rkt is not based on a central daemon model):
I was under the impression that extending the current rkt pod lifecycle to single apps as well would help handle the same kinds of problems that the pod lifecycle already handles at the pod level.
Additionally, not doing this will tie per-app handling to an external model enforced by the k8s interface, and this won't work when doing per-app management with "just" rkt.
@sgotti Good question, especially when an app is in a crashloop but the pod is still running. I guess we will need k8s to handle this, as it is the one that creates the crashloop.
In other cases, we can let rkt gc remove them when it removes the whole pod.
@yifan-gu That case covers an application crashing, but what will happen when rkt has problems preparing/starting/stopping the app? My impression is that using a pod-like lifecycle (with the assumption that a failed app cannot be restarted but needs to be trashed and recreated) will make this easier and is probably required by the rkt model.
Perhaps I'm missing something, and whether this is really needed will be discovered when implementing/testing #2932.
Perhaps we should strongly encourage (and default to?) having a data directory specific to Kubernetes. This would allow the rktnetes code to nuke any pod it can't recognize, versus the current state of things where we have to carefully classify "owned" vs "not owned" pods and avoid nuking ones we don't know about.
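To make the difference between the two approaches concrete, here is a minimal sketch. It is not rkt or kubelet code: the `Pod` type, the `OWNER_KEY` annotation, and both function names are hypothetical and only illustrate the classification problem.

```python
from dataclasses import dataclass, field

# Hypothetical annotation key that the orchestrator would stamp on its pods.
OWNER_KEY = "example.com/managed-by"


@dataclass
class Pod:
    uuid: str
    # Annotations read from the pod manifest; empty if none could be read.
    annotations: dict = field(default_factory=dict)


def collectable_shared_dir(pods, wanted_uuids):
    """Shared data directory: remove only pods that carry our ownership
    marker and are no longer wanted. Unmarked pods must be left alone."""
    return [
        p.uuid
        for p in pods
        if p.annotations.get(OWNER_KEY) == "k8s" and p.uuid not in wanted_uuids
    ]


def collectable_dedicated_dir(pods, wanted_uuids):
    """Dedicated data directory: everything in it is assumed to be ours,
    so any pod we do not recognize can be removed."""
    return [p.uuid for p in pods if p.uuid not in wanted_uuids]


if __name__ == "__main__":
    pods = [
        Pod("a", {OWNER_KEY: "k8s"}),
        Pod("b", {OWNER_KEY: "k8s"}),
        Pod("c"),  # no readable marker, e.g. a missing manifest
    ]
    wanted = {"a"}
    print(collectable_shared_dir(pods, wanted))     # ['b']
    print(collectable_dedicated_dir(pods, wanted))  # ['b', 'c']
```

In the shared case, pod `c` is never collected because nothing proves it is ours; in the dedicated case it is collected without needing any marker at all.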
@euank I think that this issue should also cover how to handle garbage for pods not managed by k8s.
Instead, I see this as related to #3029 (my understanding was that a missing pod manifest can't help distinguish a k8s-managed pod from other pods). I'm not sure how to handle upgrades if the new version changes the datadir, since it'll leave old k8s pods unmanaged.
With our move towards exposing application-level operations and dynamic pods (see #2375 #2867 #2932), we should consider whether we need to modify our garbage collection model so that it also runs while a pod is executing.
Introducing app-level operations means that we allow users to run pets instead of cattle, which are basically long-running pods. When applications are ejected/removed from such a pod, the garbage they leave behind is not collected until the whole pod is, and over time that might become a problem.
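As a rough illustration of what collecting garbage during execution could mean, the sketch below selects apps that were removed from a still-running pod and whose grace period has expired. The `App` type, its `state` values, and `apps_to_collect` are all hypothetical; they do not correspond to an existing rkt interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class App:
    name: str
    # Hypothetical states: "running" or "removed" (ejected from the pod).
    state: str
    # Time the app was removed, in seconds; None while it is still running.
    removed_at: Optional[float] = None


def apps_to_collect(apps, now, grace_period):
    """Return the names of removed apps whose grace period has elapsed.

    Running apps are never selected, so the pod itself keeps running
    while leftovers of removed apps are cleaned up."""
    return [
        a.name
        for a in apps
        if a.state == "removed"
        and a.removed_at is not None
        and now - a.removed_at >= grace_period
    ]


if __name__ == "__main__":
    apps = [
        App("web", "running"),
        App("old-worker", "removed", removed_at=100.0),
        App("new-worker", "removed", removed_at=950.0),
    ]
    # With now=1000 and a 300 second grace period, only "old-worker" qualifies.
    print(apps_to_collect(apps, now=1000.0, grace_period=300.0))
```

A pod-level collector never reaches this point for a long-running pod, which is why the question of an app-level pass comes up at all.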
cc @euank @yifan-gu @coreos/rkt-maintainers