duglin opened this issue 4 years ago
You can specify a ConfigMap/Secret mount with a specific path to an item in the ConfigMap, and Pods will only start once that item becomes available. I don't quite recall if we support that kind of mount (yet).
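Something like this, for reference (a rough sketch; the ConfigMap/key names are made up):

```yaml
# Hypothetical example: mount only the "config.json" item of ConfigMap "mycm".
# Because the volume is not marked optional, the Pod won't start until the
# ConfigMap (and that key) exists.
apiVersion: v1
kind: Pod
metadata:
  name: echo
spec:
  containers:
    - name: app
      image: myco/echo
      volumeMounts:
        - name: config
          mountPath: /etc/app
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: mycm
        items:
          - key: config.json
            path: config.json
```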
That'd be a mechanism that lives purely in K8s, which I'd prefer over a custom mechanism on our side. There's also a concept called Initializers in K8s, see https://ahmet.im/blog/initializers/ for example. They read like the inverse of a Finalizer, which would be exactly what you want for arbitrary initialization logic that might not be surfaceable through ConfigMaps/Secrets.
re: configMap - yep, I've thought about that, and for quick demos it does seem to work. But at some point something gives up waiting. Not sure if it's Knative or Kube or both, but I just tried letting it sit for a while and the Ksvc ends up in a "Failed" state eventually, and creating the CM after that point doesn't resurrect it. So perhaps KnServing just needs to keep retrying forever? It might also need to stay in a "Pending" state rather than going into a "Failed" state.
re: Initializers - for some reason I thought people were moving away from those, but I suspect they might run into the same problem as configMaps w.r.t. something giving up after a while, since the probes will probably fail.
The exact error messages would be helpful, but I guess this will be the deadline we set on a deployment to become healthy. Today we only wait a fixed amount of time before we declare failure.
```
Status:
  Conditions:
    Last Transition Time:  2020-04-14T12:34:10Z
    Message:               Revision "echo-m759q" failed with message: configmap "mycm" not found.
    Reason:                RevisionFailed
    Status:                False
    Type:                  ConfigurationsReady
```
Which makes sense.
Oh yeah, my point above wasn't about the entire ConfigMap not being there, but about a key that's explicitly stated in the mount config missing from it.
ah, I don't think that would work, because the app would still come up and then it would be up to it to watch for the CM entry (and only via volume mount, not env vars, which some people want to use), and I'd like to make this transparent for the user.
/remove-lifecycle stale
/lifecycle frozen
@duglin I haven't read through this in detail, but is this something you want to discuss at an upcoming API WG meeting?
Yep, I think that would be a good idea - just let me know which week you think would be good for it.
@mattmoor gonna add this to this week's agenda... ran into another situation where this might be useful
We're running into a situation where we need to disable an account. This means that we don't want to delete the user's CRs, but we do want to delete the running pods associated with it. We also don't want to modify the user's CRs (e.g. set max-scale=0) because then we'd need to restore the old values once the account is re-enabled. Possible to do, but not ideal.
While brainstorming ways to achieve this, this issue/feature came up, and if this feature were viewed more as a "pause" state then it might work for multiple scenarios. For the "pre-start hook" scenario, it would prevent the pods from even starting until something moved the ksvc out of the "paused" state. For the "disable the account" scenario, the controller would view the "paused" state as a signal not only to not bring up new pods but also to shut down existing ones.
Just adding another aspect to the discussion.
Not sure yet what disabling the account would do for eventing resources. The obvious answer is to delete secondary resources associated with the CRs (e.g. unsubscribe from github), but more thinking is needed to ensure that won't cause unintended harm.
cc @n3wscott as this topic of suspending sources has come up before 🤔
Yes - a pause feature would let you build up a "hot" cluster, switch over to it, turn the sources on, and have it ready to go. A disaster-recovery type cluster.
That would be awesome - we have a similar need - the domain mapping stuff was part of that need too :-)
Pause is on the roadmap for sources in the "soon" category, like 6 months.
Got any kind of design doc yet?
> We're running into a situation where we need to disable an account. This means that we don't want to delete the user's CRs, but we do want to delete the running pods associated with it.
If you set a ResourceQuota to allow zero Pods, will it terminate running Pods?
@scothis looks like no according to the docs ("Neither contention nor changes to quota will affect already created resources."), but you could probably set one and then stop all pods in a namespace to achieve a ~similar effect.
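For reference, a rough sketch of that combination (the namespace and quota names are made up):

```yaml
# Quota that prevents any new Pods from being created in the namespace.
# Per the docs quoted above, existing Pods are unaffected, so they would
# have to be deleted separately, e.g.:
#   kubectl delete pods --all -n user-ns
apiVersion: v1
kind: ResourceQuota
metadata:
  name: disable-account
  namespace: user-ns
spec:
  hard:
    pods: "0"
```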
Another possible need for a pre-start hook... build. While I suspect that many people think of "build" and "create ksvc" as two separate steps done sequentially, that actually makes it kind of awkward in a UI where, from the user's POV, they've asked for a ksvc to be created based on a yet-to-be-built image.
Abstractly, if there's a UX that presents something like `service create echo --image myco/echo --source github.com/myco/echo`, which means create an `echo` service from the source code in `github.com/myco/echo`.
Now, it's very possible that the `service` command could orchestrate everything and only create the underlying ksvc after the build is done. However, since the ksvc won't be visible until late in the process, the user doesn't know that the `echo` ksvc is "pending" if they list the ksvcs in the system. It would be nice to create the ksvc but not let it actually deploy the revision until it was "unpaused" (meaning the image is ready).
Another interesting option here is to allow for kServices to not have any revisions - while it's not the same thing as "pausing a revision", it does solve the "pausing a ksvc" side of it.
Related to the previous comment... not creating the ksvc until the build is done means we can't "reserve" the ksvc name until that time either. So someone else could grab our ksvc name while our build is happening - even though it was available when the first user initiated the "service create" command.
A mechanism like this would allow for any kind of "pausing" regardless of the reason (building, service bindings, async checks, simply reserving the name for later use, ...). This issue isn't about supporting just "build" but rather about supporting a "pausing" mechanism.
https://github.com/k8s-service-bindings/spec seems to have an example of the service-binding type pattern that Doug was talking about.
It seems like this issue has changed from "run some code before creating a Revision" to "be able to pause resource reconciliation for an arbitrary amount of time for specific resources". (I agree with julz and scothis that using ResourceQuota in combination with killing Pods seems like the better way to restrict running containers for an account.)
This seems like a pretty big feature space -- a concrete proposal for specific use cases would probably help narrow it down if we want to take action (vs hold open a feature request).
/kind proposal
/triage needs-user-input
Not sure the issue has changed; it's about being able to do some work before the revision is reconciled. I'm not sure Quota can help here since it's not about pausing ALL Revisions, it's about pausing ONE Revision. In terms of use cases, the two main ones we have are:
- service bindings: don't deploy the Revision until the binding's secret is ready
- build: don't deploy the Revision until the image has been built
But obviously, the mechanism should support any logic/reason. As I mentioned, I view this as similar to a Finalizer, which pauses the delete until code is run - except this pauses the creation until some code is run.
See https://github.com/knative/serving/issues/7575#issuecomment-717902765 for why doing this above ksvc/revision is problematic.
@duglin in concrete terms, are you proposing something like a spec.Paused field that causes the reconciler to ignore Ksvc changes if it's true? (I think a Paused field would be consistent with regular k8s Deployments which can be paused, so that seems reasonable to me at least, but I'm really not clear from this issue if that's what you're proposing or if there's something more/different being asked for)
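(For reference, the Deployment version of this is the built-in `spec.paused` field - though note it only pauses rollouts, it doesn't touch running Pods:)

```yaml
# Built-in Deployment field: while paused, changes to the pod template
# don't trigger new rollouts; already-running Pods keep running.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
spec:
  paused: true
  replicas: 1
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
        - name: app
          image: myco/echo
```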
I think it might need to be more than a single field. In the first comment I talk about how this is similar to a Finalizer, but on create. So my prototype is along the lines of: have the Revision reconciler check for any label of the form `serving.knative.dev/pause-XXX: YYY`, and if so then "return". And that's pretty much it. Then some other component can add labels like `serving.knative.dev/pause-service-binding: myservice` during the creation of the ksvc yaml. And there could be more than one, since there may be multiple reasons for the pause. And some other component can look for these labels on the Revision and remove them once it has processed whatever logic is associated with that label.
And, I made it a label instead of an annotation so that the controller watching for these labels can easily find/watch for them.
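To make that concrete, here's a rough sketch of what a paused ksvc might look like under this prototype (the label value and Service body are illustrative):

```yaml
# While any serving.knative.dev/pause-* label is present, the reconciler
# would return early and not create the Deployment for the Revision.
# The service-binding controller would remove the label once the binding is ready.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: echo
  labels:
    serving.knative.dev/pause-service-binding: myservice
spec:
  template:
    spec:
      containers:
        - image: myco/echo
```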
Feature proposal doc: https://docs.google.com/document/d/1GkVceSS_v1TCkG_gccl4w3DwgmKt_to3VsbBXyqRe28/edit# for our chat during next week's call.
/remove-triage needs-user-input
/triage accepted
In what area(s)?
/area API
Other classifications: /kind spec
Describe the feature
The notion of an application needing to connect to an external service is a common one. In the past Kube had the ServiceBroker project for this, and today there are other projects working on a similar set of features. The basic idea is that the environment tries to help the app author by creating a new instance of a service and then provides the connection info (url, creds, ...) via a secret - which is then mounted as a volume, or exposed as env vars, into the app.
When the app is already running, the user (or the service-broker type of logic) can modify the config of the app (or ksvc) to include the pointer to the secret. However, the problem becomes harder when the user doesn't want the app to come up at all until the secret is attached, because otherwise it might generate failures while it waits for the secret.
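For illustration, a hedged sketch of that pattern (all names are made up, and this assumes the binding logic materializes a Secret like the one below):

```yaml
# Connection info produced by the binding/broker logic.
apiVersion: v1
kind: Secret
metadata:
  name: dugs-mysql-binding
stringData:
  url: mysql://db.example.com:3306/mydb
  username: app-user
  password: not-a-real-password
---
# The ksvc consumes it; the point of this issue is that the Revision
# shouldn't actually deploy until that Secret exists.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: echo
spec:
  template:
    spec:
      containers:
        - image: myco/echo
          envFrom:
            - secretRef:
                name: dugs-mysql-binding
```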
This feature request is NOT asking for ServiceBroker-type support in Knative. This feature request is for the ability for an admin to configure some kind of pre-launch hook for a Revision so that they can add whatever logic they want before the KnServing controller tries to deploy the Revision.
This extensibility point could then be used for other purposes, but our immediate pain point is around Service Bindings.
Some additional comments/thoughts:
For example, the user might request a binding at create time via something like `--annotation serving-binding=dugsMySQL`. And since we don't want to block the update/creation of the ksvc in etcd, we need some way to pause the serving controller so it waits until this work is done before it generates the Deployment for the Revision. In some ways this is sort of like the opposite of a Kube Finalizer.
Just looking for some brainstorming on this topic for now...