Open adrianreber opened 3 years ago
/sig node
Discussion Link: N/A (or... at multiple conferences over the last years, when presenting CRIU and container migration, there was always the question of when we will see container migration in Kubernetes)
Responsible SIGs: maybe node
We recommend actively socializing your KEP with the appropriate SIG to gain visibility and consensus, and also to help with scheduling. Since you are not sure which SIG will sponsor this, reaching out to the SIGs to get clarity on that will help move your KEP forward.
Hi @adrianreber
Any updates on whether this will be included in 1.20?
Enhancements Freeze is October 6th and by that time we require:
- The KEP must be merged in an implementable state
- The KEP must have test plans
- The KEP must have graduation criteria
- The KEP must have an issue in the milestone
Best, Kirsten
Hello @kikisdeliveryservice
Any updates on whether this will be included in 1.20?
Sorry, but how would I decide this? There has not been a lot of feedback on the corresponding KEP which makes it really difficult for me to answer that question. On the other hand, maybe the missing feedback is a good sign that it will take some more time. So probably this will not be included in 1.20.
Normally the SIG would give a clear signal that it will be included: by reviewing the KEP, agreeing to the milestone proposals in the KEP, etc. I'd encourage you to keep in touch with them and start the 1.21 conversation early if this does not end up getting reviewed/merged properly by October 6th.
Best, Kirsten
@kikisdeliveryservice Thanks for the guidance. Will do.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-contributor-experience at kubernetes/community.
/close
@fejta-bot: Closing this issue.
/reopen /remove-lifecycle rotten
@adrianreber: Reopened this issue.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Still working on it.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
/remove-lifecycle stale
Hello @adrianreber 👋, 1.24 Enhancements team here.
Just checking in as we approach enhancements freeze at 18:00 PT on Thursday, February 3rd, 2022. This enhancement is targeting stage alpha for 1.24; is this correct?
Here's where this enhancement currently stands:
- The KEP status must be marked implementable for this release.

Looks like for this one, we would just need to update the following:
- Update the kep.yaml file in the open PR to add status: implementable for the alpha stage.

At the moment, the status of this enhancement is tracked as at risk. Please keep the issue description up-to-date with appropriate stages. Thank you!
@Priyankasaggu11929 Thanks for the KEP feedback. I tried to update the KEP to address the open issues you listed.
@adrianreber, thanks so much for quickly updating the PR. 🚀
With #1990 merged, I've updated this enhancement to tracked for the 1.24 cycle. All set for enhancements freeze. Thanks!
Hi @adrianreber :wave: 1.24 Docs lead here.
This enhancement is marked as Needs Docs for the 1.24 release.
Please follow the steps detailed in the documentation to open a PR against the dev-1.24
branch in the k/website
repo. This PR can be just a placeholder at this time and must be created before Thursday, March 31st, 2022 @ 18:00 PDT.
Also, if needed take a look at Documenting for a release to familiarize yourself with the docs requirement for the release.
Thanks!
@nate-double-u documentation PR available at https://github.com/kubernetes/website/pull/31753
Hi @adrianreber :wave: 1.24 Release Comms team here.
We have an opt-in process for the feature blog delivery. If you would like to publish a feature blog for this issue in this cycle, then please opt in on this tracking sheet.
The deadline for submissions and the feature blog freeze is scheduled for 01:00 UTC Wednesday 23rd March 2022 / 18:00 PDT Tuesday 22nd March 2022. Other important dates for delivery and review are listed here: https://github.com/kubernetes/sig-release/tree/master/releases/release-1.24#timeline.
For reference, here is the blog for 1.23.
Please feel free to reach out any time to me or on the #release-comms channel with questions or comments.
Thanks!
Hi @adrianreber
I'm checking in as we approach 1.24 code freeze at 01:00 UTC Wednesday 30th March 2022.
Please ensure the following items are completed:
For this KEP, it looks like just k/k#104907 needs to be merged. Are there any other PRs that you think we should be tracking that would be subject to the 1.24 code freeze?
Let me know if you have any questions.
@rhockenbury There are no other PRs that need to be tracked.
Friendly reminder to try to merge k/k#104907 before code freeze at 01:00 UTC Wednesday 30th March 2022.
KEP update PR for 1.25 https://github.com/kubernetes/enhancements/pull/3264
Hello @adrianreber 👋, 1.25 Enhancements team here.
Just checking in as we approach enhancements freeze at 18:00 PST on Thursday, June 16, 2022.
For note, this enhancement is targeting stage alpha for 1.25 (correct me if otherwise).
Here's where this enhancement currently stands:
- The KEP status is marked implementable.

Looks like for this one, we would just need to update the following:
- The KEP still needs an updated test plan.

For note, the status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!
@parul5sahoo see #3406 for the updated test plan
Hello @adrianreber 👋, 1.25 Enhancements team here.
Just checking in as we approach enhancements freeze at 18:00 PST on Thursday, June 23, 2022.
For note, this enhancement is targeting stage alpha for 1.25 (correct me if otherwise).
Here's where this enhancement currently stands:
- The KEP status is marked implementable for this release.

With all the KEP requirements in place, this enhancement is all good for the upcoming enhancements freeze once that PR gets merged. 🚀
For note, the status of this enhancement is marked as at risk and will be marked as tracked as soon as the PR gets merged. Please keep the issue description up-to-date with appropriate stages as well. Thank you!
Hello @adrianreber, the KEP is marked tracked and is ready for the enhancements freeze :rocket:
Opened docs PR at https://github.com/kubernetes/website/pull/34940
👋 Hey @adrianreber,
Enhancements team checking in as we approach 1.25 code freeze at 01:00 UTC on Wednesday, 3rd August 2022.
Please ensure the following items are completed by code freeze:
- [ ] All PRs to the Kubernetes repo that are related to your enhancement are linked in the above issue description (for tracking purposes).
- [x] All PRs are fully merged by the code freeze deadline.
Looks like there is one merged PR in k/k. Let me know if I missed any other PRs that need to be tracked.
As always, we are here to help should questions come up. Thanks!!
@rhockenbury https://github.com/kubernetes/kubernetes/pull/104907 is the only PR and it is merged.
I think it'd be useful if the kubelet annotated Pods with the timestamp of their last checkpoint.
If we add native support for restores, we could additionally manage a Pod annotation with the timestamp of the last restore, and perhaps with some details of the checkpoint data that was restored.
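To make the suggestion concrete, such annotations might look like the following sketch. The annotation names and values here are invented purely for illustration; nothing like this is implemented in the KEP or in Kubernetes today.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: counters
  annotations:
    # Hypothetical annotation names, not part of any implemented Kubernetes API:
    checkpoint.kubernetes.io/last-checkpoint: "2023-03-10T12:00:00Z"
    checkpoint.kubernetes.io/last-restore: "2023-03-10T12:05:00Z"
```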
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
lifecycle/stale
is appliedlifecycle/stale
was applied, lifecycle/rotten
is appliedlifecycle/rotten
was applied, the issue is closedYou can:
/remove-lifecycle stale
/lifecycle rotten
/close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
I think it'd be useful if the kubelet annotated Pods with the timestamp of their last checkpoint.
If we add native support for restores, we could additionally manage a Pod annotation with the timestamp of the last restore, and perhaps with some details of the checkpoint data that was restored.
Whether a container was restored, and when it was checkpointed, is now tracked at the container level, at least by CRI-O: https://github.com/cri-o/cri-o/pull/6464
Hi @adrianreber, I've recently seen your great work on FOSDEM. I have two questions I would like to discuss:
1. Since for now we can only checkpoint a container after it has been scheduled to one exact node, and we must know where the pod's container is running, is there any design that lets us call the kube-apiserver to do the checkpoint? For this, I think one approach would be to add an annotation like `checkpoint.kubernetes.io/checkpoint-container=<container_name>` and let the kubelet handle the checkpoint automatically.
2. I've noticed that in CRI-O you've implemented the interface `RestoreContainer(context.Context, *Container, string, string) error`, while in containerd the CreateContainer interface was leveraged to handle the restore case. If my understanding is correct, is there any reason it was implemented this way? Or is there any plan to have a symmetric `Restore` interface in containerd? Appreciate your answer!
Hi @adrianreber, I've recently seen your great work on FOSDEM
Thanks.
I have two questions I would like to discuss:
1. Since for now we can only checkpoint a container after it has been scheduled to one exact node, and we must know where the pod's container is running, is there any design that lets us call the kube-apiserver to do the checkpoint?
One of the main reasons to make it a kubelet-only API endpoint in the beginning is that we wanted to be careful, as checkpointing is something new in Kubernetes. One of the problems is that there is now the possibility to have all memory pages, including potentially sensitive information, on disk. The checkpoint can only be accessed by root, but it can be moved to some other location and the sensitive information could leak. One thing we are currently exploring is encrypting the checkpoint to avoid this. We are still looking at the best way to expose this at the apiserver level.
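For reference, the kubelet-only endpoint mentioned here is the alpha checkpoint API documented at https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/. A minimal Python sketch of calling it follows; the node name and certificate paths are placeholders, and the kubelet must run with the ContainerCheckpoint feature gate enabled.

```python
import ssl
import urllib.request


def checkpoint_url(node: str, namespace: str, pod: str, container: str) -> str:
    # The documented endpoint shape:
    # POST /checkpoint/{namespace}/{pod}/{container} on the kubelet port (10250).
    return f"https://{node}:10250/checkpoint/{namespace}/{pod}/{container}"


def request_checkpoint(node, namespace, pod, container, client_cert, client_key):
    # The kubelet authenticates callers; here we assume client-certificate auth.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE  # lab setup only; verify the kubelet cert in production
    ctx.load_cert_chain(client_cert, client_key)
    req = urllib.request.Request(
        checkpoint_url(node, namespace, pod, container), method="POST")
    with urllib.request.urlopen(req, context=ctx) as resp:
        # On success the kubelet responds with JSON listing the path of the
        # checkpoint archive under /var/lib/kubelet/checkpoints/.
        return resp.read().decode()
```

The resulting tar archive (readable only by root, as noted above) is what could leak sensitive memory contents if moved off the node carelessly.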
For this, I think one approach would be to add an annotation like `checkpoint.kubernetes.io/checkpoint-container=<container_name>` and let the kubelet handle the checkpoint automatically.
I do not really understand what you are suggesting.
2. I've noticed that in CRI-O you've implemented the interface `RestoreContainer(context.Context, *Container, string, string) error`, while in containerd the CreateContainer interface was leveraged to handle the restore case. If my understanding is correct, is there any reason it was implemented this way? Or is there any plan to have a symmetric `Restore` interface in containerd?
Not sure at this point. The current PR to expose the CRI checkpoint changes in containerd (https://github.com/containerd/containerd/pull/6965) has been open for almost 10 months and there has not been much feedback. One of the problems is that the checkpoint archive format is not standardized, and although there is a proposal (https://github.com/opencontainers/image-spec/issues/962) not much is happening there.
If you want to checkpoint containers from Kubernetes, CRI-O is currently the best CRI implementation.
@adrianreber
Thanks for the details. We are very interested in this story and we will definitely help with feature testing in containerd and contribute if there's a chance. One quick question on the restore process: it seems that in 1.25 it's not directly implemented in Kubernetes. We hope the pod can be created from the restore, leveraging capabilities from the containerd layer. What are the known issues or bottlenecks from your perspective?
One quick question on the restore process: it seems that in 1.25 it's not directly implemented in Kubernetes.
If you look at https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/ you can see how it is possible to restore containers in Kubernetes by adding the checkpoint archive to an OCI image. This way you can tell Kubernetes to create a container from that checkpoint image and the resulting container will be a restore.
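Following the approach in that blog post, the restore is triggered simply by referencing the checkpoint image in an ordinary Pod spec; the container runtime (CRI-O) recognizes the checkpoint annotation baked into the image and performs a restore instead of a regular container create. A sketch, with image, container, and node names as placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: counter
    # Placeholder: an image built from the checkpoint archive and pushed
    # to a registry the destination node can pull from.
    image: registry.example.com/checkpoint-image:latest
  # Placeholder: pin the pod to the node where the restore should happen.
  nodeName: destination-node
```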
We hope the pod can be created from the restore, leveraging capabilities from the containerd layer.
Not sure what you mean here.
We hope the pod can be created from the restore, leveraging capabilities from the containerd layer. Not sure what you mean here.
I can add more details. The approach in https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/ is an implicit way: Kubernetes doesn't actually know about the magic and relies on the underlying container runtime to detect the image spec.
The other user journey could be an explicit way: the kubelet could detect the snapshot and eventually invoke some restore path through the CRI. That leaves the flexibility at the Kubernetes layer to do lots of things, for example, schedule to a node that already has the original image so that only the diff needs to be applied to get started. Have you evaluated the explicit way in your original design?
apiVersion: v1
kind: Pod
metadata:
  namePrefix: example-
  ...
  annotations:
    app.kubernetes.io/snapshot-image: xxxxxx
  ...
We hope the pod can be created from the restore, leveraging capabilities from the containerd layer. Not sure what you mean here.
I can add more details. The approach in https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/ is an implicit way: Kubernetes doesn't actually know about the magic and relies on the underlying container runtime to detect the image spec.
The other user journey could be an explicit way: the kubelet could detect the snapshot and eventually invoke some restore path through the CRI. That leaves the flexibility at the Kubernetes layer to do lots of things, for example, schedule to a node that already has the original image so that only the diff needs to be applied to get started.
I see no difference between the two ways you described. The checkpoint OCI image contains only the checkpoint data and nothing else. The base image, the image the container is based on, is not part of it. As implemented in CRI-O, the base image will be pulled from the registry if missing. So I see no difference based on what you are describing. The automatic early pulling of the base image would not be possible, that is correct.
The other reason to do it implicitly is that adding additional interfaces to the CRI takes a lot of time, and as it was possible to solve it without an additional CRI call, that seemed the easier solution.
If we were talking about checkpointing and restoring pods, I think it would be necessary to have an explicit interface in the CRI. For containers I do not think it is necessary.
Have you evaluated explicit way in your original design?
It feels like I have implemented almost everything to test it out initially :wink:
Hi @adrianreber! I am interested in your checkpoint/restore in Kubernetes project, so I recently attempted to use this feature, but encountered some issues. Could you please help me? It's very important to me.
Description: I followed your demo video https://fosdem.org/2023/schedule/event/container_kubernetes_criu/ and the official documentation https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/ to perform checkpoint and restore operations on your container "quay.io/adrianreber/counter". Initially, everything appeared to be working fine, but when I tried to restore the counter container on another node in my k8s cluster using a YAML file, the restored container would enter an error state within 1 second of entering the running state. Like this:
What I did: I attempted to debug the kubelet and discovered that after the Pod was restored, a cgroup directory from the old Pod (such as "kubepods-besteffort-pod969bc448_d138_4131_ad8d_344d1cb78b40.slice") was created in the "/sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice" directory. However, the Pod associated with this cgroup was not running on the destination node, so the kubelet deleted the directory, causing the restored container process to exit and resulting in a Pod Error.
My question is: why did this issue occur, and could it be a version compatibility issue?
My versions: Ubuntu 22.04 kubelet 1.26.0 cri-o 1.26.3 criu 3.17.1 (https://build.opensuse.org/project/show/devel:tools:criu)
@Qiubabimuniuniu are you using cgroup v1 or v2 on your systems? There might be still a bug in CRI-O when using checkpoint/restore on cgroup v2 systems.
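As an aside, a quick way to answer the v1-vs-v2 question is to look at what is mounted at /sys/fs/cgroup: on a cgroup v2 (unified hierarchy) node it is a single cgroup2 mount, while on v1 it is a tmpfs with per-controller cgroup mounts underneath. A small editorial sketch (not from the thread):

```python
def cgroup_version(proc_mounts: str) -> str:
    """Classify the cgroup setup from the text of /proc/mounts."""
    for line in proc_mounts.splitlines():
        fields = line.split()
        # A cgroup2 filesystem mounted directly at /sys/fs/cgroup means the
        # node runs the unified (v2) hierarchy.
        if len(fields) >= 3 and fields[1] == "/sys/fs/cgroup" and fields[2] == "cgroup2":
            return "v2"
    # Otherwise assume the legacy (v1) or hybrid layout.
    return "v1"

# On a real node one would read the file directly:
#   with open("/proc/mounts") as f:
#       print(cgroup_version(f.read()))
```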
@adrianreber I'm using cgroup v2. Thank you very much!!
I will try switching to cgroup v1 and see if the problem can be resolved. Recently, I have been attempting to modify the kubelet and containerd source code to support checkpoint/restore in Kubernetes, but I encountered the same issue after completing the development. Additionally, while trying your project with CRI-O, I also encountered the same problem when performing checkpoint and restore operations. This problem has been bothering me for a long time. Thank you very much for your solution.
Thank you very much for your solution.
It is not really a solution. I should fix CRI-O to correctly work with checkpoint and restore on cgroup v2 systems.
Enhancement Description
Documentation:
- https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/
- https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/
- https://kubernetes.io/blog/2023/03/10/forensic-container-analysis/

- [x] Alpha (1.25)
  - [x] KEP (k/enhancements) update PR(s):
  - [x] Code (k/k) update PR(s):
  - [x] Docs (k/website) update(s):
- [ ] Beta (1.30)
  - [x] KEP (k/enhancements) update PR(s):
    - [ ] https://github.com/kubernetes/enhancements/pull/4305
  - [x] Code (k/k) update PR(s):
    - [ ] https://github.com/kubernetes/kubernetes/pull/120898
  - [ ] Docs (k/website) update(s):
Abandoned PR: https://github.com/kubernetes/kubernetes/pull/115888