devdattakulkarni closed this issue 3 months ago
Seems like the applyPolicies() function (line 822) in mutating-webhook/webhook.go is currently processing such requests, reading the requested amounts of CPU and memory, storing them in patchOperation objects, and returning a list of those objects. Can we make these checks in that function and only return this list if the checks pass?

If not there, then trackCustomAPIs() also seems to be reading these values and storing them in customAPIQuotaMap, and handleCustomAPIs() reads this map for these values, but neither forces the Helm chart to define values nor do they seem to deny the request if the cluster is at capacity. Let me know where the correct place to write these checks is, and I will get started!
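For reference, wherever the check ends up living, the logic would roughly amount to comparing the chart's total requested CPU/memory against the cluster's free allocatable capacity. Here is a minimal sketch of that comparison (CPU only, driven by kubectl output) just to make the check concrete; the real check in webhook.go would be Go, and every name here (cpu_millicores, free_cpu_millicores, fits) is hypothetical, not existing KubePlus code:

```python
import json
import subprocess


def cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity ('2', '0.5', '250m') to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def kubectl_json(*args: str) -> dict:
    out = subprocess.check_output(["kubectl", *args, "-o", "json"])
    return json.loads(out)


def free_cpu_millicores() -> int:
    """Total allocatable CPU across nodes minus CPU already requested by Pods."""
    allocatable = sum(
        cpu_millicores(node["status"]["allocatable"]["cpu"])
        for node in kubectl_json("get", "nodes")["items"]
    )
    requested = 0
    for pod in kubectl_json("get", "pods", "--all-namespaces")["items"]:
        # Pods that have finished no longer consume requested capacity.
        if pod["status"].get("phase") in ("Succeeded", "Failed"):
            continue
        for container in pod["spec"]["containers"]:
            cpu = container.get("resources", {}).get("requests", {}).get("cpu")
            if cpu:
                requested += cpu_millicores(cpu)
    return allocatable - requested


def fits(total_requested_cpu: str) -> bool:
    """True if a new instance requesting this much CPU would still fit."""
    return cpu_millicores(total_requested_cpu) <= free_cpu_millicores()


if __name__ == "__main__":
    # e.g. would an app whose chart requests a total of 2 CPUs fit right now?
    print("fits:", fits("2000m"))
```

The memory check would be analogous (summing memory requests against allocatable memory).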
@omgoswami Ack. Those are the right starting points. Let me look at the code and I will add more comments after that.
@omgoswami So I thought about this issue more, and I think that rather than adding any checks in KubePlus for resource requests and limits, we should just indicate Pod statuses in the output of the kubectl appresources plugin. This plugin discovers all the resources that have been created as part of a particular app deployment. We can add a 'status' column to the output and include the status of each resource.

See, the main issue right now is that there is no simple way to know whether the application Pods have started running or not. Users have to run kubectl get pods. We do have the kubectl metrics plugin, in whose output we do indicate how many Pods are running, but it is not intuitive that one needs to use kubectl metrics to find the status of the Pods. kubectl appresources is something that we already tell users to use to find out about all the resources that have been created, so it will be a natural place to add the status for each resource.

Originally I was thinking that KubePlus should check the capacity and deny the requests. But from the user's point of view, it is not easy to decide what values to define for resource requests and limits. Moreover, if there is a Custom Resource as part of the application's Helm chart, and that Custom Resource is creating Pods, then users may not have any control over the requests/limits for those Pods. Therefore, I think we should not go down the route of tracking and enforcing capacity. Instead, providing status output will be the right thing to do.
@omgoswami As we discussed, let's add kubectl appstatus as a new plugin. Its inputs will be similar to those of kubectl appresources. The output will consist of the instance's status (kubectl get WordPressService wp1 -o json -> status from this output) and the Pod statuses (kubectl get pods -n wp1 -> Pod statuses from this output).

You can add a kubectl-appstatus bash file as the entrypoint of the plugin. For retrieving statuses, it might be easier to use Python. Check kubectl appresources for how to invoke a Python script from within a bash script. In fact, you can follow the structure of the kubectl appresources plugin, with similar error checks, etc.
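To make the retrieval step concrete, here is a rough sketch of the Python side of such a plugin. It assumes the bash entrypoint passes the Custom Resource kind, the instance name, and the namespace; the argument handling and output format are illustrative only, not the final plugin interface:

```python
import json
import subprocess
import sys


def kubectl_json(*args: str) -> dict:
    out = subprocess.check_output(["kubectl", *args, "-o", "json"])
    return json.loads(out)


def main(kind: str, instance: str, namespace: str) -> None:
    # Status of the Custom Resource instance itself,
    # e.g. kubectl get WordPressService wp1 -o json -> .status
    cr = kubectl_json("get", kind, instance)
    print(f"{kind}/{instance}: {json.dumps(cr.get('status', {}))}")

    # Statuses of the application Pods,
    # e.g. kubectl get pods -n wp1 -o json -> .status.phase of each Pod
    pods = kubectl_json("get", "pods", "-n", namespace)
    for pod in pods["items"]:
        name = pod["metadata"]["name"]
        phase = pod.get("status", {}).get("phase", "Unknown")
        print(f"Pod/{name}: {phase}")


if __name__ == "__main__":
    if len(sys.argv) != 4:
        sys.exit("usage: appstatus.py <kind> <instance> <namespace>")
    main(sys.argv[1], sys.argv[2], sys.argv[3])
```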
Consider the situation where KubePlus receives a request to create an application instance, but there is no available capacity on the cluster. In this case, KubePlus should deny the request. Currently, KubePlus will handle the request, but the application Pods will remain Pending if there is not enough available capacity on the cluster.
This feature will require two things: checking that the application's Helm chart defines resource requests and limits, and checking whether the cluster has enough available capacity for the new instance (denying the request if it does not).
Performing both checks in the mutating webhook can be tricky, since there is a strict 30-second timeout window for mutating webhook actions. One option could be to provide a kubectl plugin to perform the first check; however, the use of such a plugin cannot be enforced. So the best place to perform the checks will probably still be the mutating webhook.
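For context, denying a request from the webhook just means returning an AdmissionReview response with allowed set to false within the timeout window. A minimal sketch of that response shape (the uid and message here are placeholders; KubePlus's actual webhook is written in Go):

```python
import json


def deny_response(request_uid: str, reason: str) -> dict:
    """Build the AdmissionReview body that rejects the incoming request."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request_uid,  # must echo the uid from the incoming request
            "allowed": False,
            "status": {"code": 403, "message": reason},
        },
    }


if __name__ == "__main__":
    body = deny_response("<uid-from-request>", "not enough free CPU/memory on the cluster")
    print(json.dumps(body, indent=2))
```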