Open SHoen opened 1 year ago
this is a must fix ... otherwise we will not be able to deploy anything outside the scope of an ods component (e.g. operator) @michaelsauter @jafarre-bi @metmajer
otherwise we will not be able to deploy anything outside the scope of an ods component (e.g. operator)
Meaning you do not want to label the resources of the operator? Or even that the operator resources themselves should somehow be deployed by other means, outside ODS?
I'd think that if an operator is installed and provides a CRD, you'd want to deploy that CRD somehow via ODS. For that, the deploying serviceaccount needs permissions. If you have the permissions, you should also be able to list the resources, and, if it has an app=myproject label, apply all the other labels to the resource. @SHoen what is the exact situation where this error occurs? Is there some operator resource that you do not want to roll out with ODS?
Still, the all in the labelling command is tricky, and that feels related to @serverhorror's comment in the Helm PR on labelling, see https://github.com/opendevstack/ods-jenkins-shared-library/pull/916#issuecomment-1247279040.
@michaelsauter Thank you for your comment. Yes, an operator installed some CRDs, which are not related to an ODS project, and we don't want to deploy them via ODS. Unfortunately these CRDs blocked ODS on the cluster because Jenkins didn't have the permission to "list" (?) some of these components and failed on the labeling command outlined in the command section of my bug report.
Yes, an operator installed some CRDs, which are not related to an ODS project, and we don't want to deploy them via ODS.
I think that somehow goes against the implicit assumption in ODS that it manages everything that is deployed. How does this play together with the docs generated by the release manager? Do the resources deployed by the operator not show up at all? Is that intended?
If it is intended, I think the smallest change possible is listing explicitly which resources to apply labels to.
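As a rough sketch of what "listing explicitly" could look like: the command from the bug report with all replaced by a comma-separated list of kinds. The kinds below, the ODS_LABEL_KINDS variable and the label_cmd helper are assumptions for illustration only, not something ODS provides. The helper only prints the command so it can be reviewed before anyone runs it.

```shell
#!/bin/sh
# Hypothetical list of kinds to label; adjust to what the component really contains.
ODS_LABEL_KINDS="${ODS_LABEL_KINDS:-deployment,service,route,configmap}"

# Prints (does not run) an oc label command restricted to explicit kinds.
# $1 = namespace, $2 = label selector, $3 = comma-separated kinds, rest = label arguments
label_cmd() {
  ns=$1
  selector=$2
  kinds=$3
  shift 3
  echo "oc label --overwrite $kinds -l $selector -n $ns $*"
}

label_cmd myproject-dev app=myproject-my-app "$ODS_LABEL_KINDS" \
  app.kubernetes.io/name=my-app \
  app.kubernetes.io/managed-by=tailor \
  app.opendevstack.org/project=myproject
```

Because the kinds are named, oc should no longer need to list whatever else falls under all, which is likely where the operator's custom resource was picked up. The trade-off is that any kind missing from the list silently stays unlabelled.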
Indeed, this is a very good question. Think of some resources that might already be qualified in a different way. Including them could be a new feature, but for now the issue is that the pipeline was failing on them.
TL;DR: You're right. I think the issue is that we label things from the client side. I don't know of a good way to resolve this reliably.
Thoughts/Discussion Points below -- just as starting point for my perspective:
@michaelsauter
Still, the all in the labelling command is tricky, and that feels related to @serverhorror's comment in the Helm PR on labelling, see #916 (comment).
Absolutely agree. Labeling from the client side is always going to be a problem. By client side I mean: as long as we use oc commands in any pipeline and rely on that, there is no good and reliable way to even assume that things are managed by the framework.
It's akin to having some API and a JS-driven frontend where the only input validation happens on the JS side and nothing is verified on the server side.
goes against the implicit assumption in ODS that it manages everything that is deployed
While this assumption exists, and I am in favor of it being actually true, that's all it is: An assumption.
People are using oc directly and ODS provides no way to have these resources under control.
If it is intended, I think the smallest change possible is listing explicitly which resources to apply labels to.
Will that solve our problem though?
We will change fewer things
@SHoen
Unfortunately these CRDs blocked ODS on the cluster because Jenkins didn't have the permission to "list" (?) some of these
I think that's kind of expected. Currently the only point that "manages" what is in the cluster, from an ODS perspective, is Jenkins. It's all client side and we have no server side verification that something is compliant.
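To confirm which resources the Jenkins service account cannot list, oc auth can-i with impersonation is one option. The sketch below only prints the command; the can_i_cmd helper is hypothetical, the resource name is the placeholder from the bug report, and running the printed command assumes the caller is allowed to impersonate service accounts.

```shell
#!/bin/sh
# Prints (does not run) a permission check for a service account.
# $1 = service account namespace, $2 = service account name,
# $3 = target namespace, $4 = resource (optionally resource.group)
can_i_cmd() {
  echo "oc auth can-i list $4 --as=system:serviceaccount:$1:$2 -n $3"
}

can_i_cmd myproject-cd jenkins myproject-dev nopermission.some.resource.com
```

Running the printed command should answer yes or no, which makes it easier to see whether a failing pipeline is a permissions gap or a labelling problem.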
In General
As long as we overwrite well-known labels we will stay in the trouble zone. Those labels are well-known, meaning that most authors of Kubernetes resources know them and will -- nay are supposed to -- make use of them.
Us changing these labels requires that we also MUST change every other occurrence of that string to match up and fulfill any kind of contract that expects the label values to match up.
As soon as the labeling happens
oc label --overwrite all -l app=myproject-my-app \
-n myproject-dev \
app.kubernetes.io/name=my-app \
app.kubernetes.io/instance- \
app.kubernetes.io/part-of- \
app.kubernetes.io/managed-by=tailor \
app.openshift.io/runtime-version- \
helm.sh/chart- \
app.opendevstack.org/project=myproject
I expect these problems to happen in case of any kind of helm chart or externally maintained resources:
labels and labelSelectors will not match up any longer, leading to orphaned resources (pods mostly, in case of CRDs it might be other things)
immutable field problems
If it is intended, I think the smallest change possible is listing explicitly which resources to apply labels to.
Will that solve our problem though?
No it won't solve the general problem that you describe (and I agree with your view). But it may unblock for now, leaving the current system with its flaws in place. Doing the small change should buy time to work on the general topic on the side.
@SHoen what are you trying to achieve? @michaelsauter's observation with ODS' Release Manager managing the deployment of an entire system is spot-on. Unless you plan to install services into the -cd namespace to support. Otherwise, this requires a larger discussion.
@metmajer @michaelsauter - assuming an operator is involved which deploys custom resources, roles etc .. I doubt everything managed by ODS will solve this .. or am I lost somewhere
@clemensutschig Not sure I get your comment correctly. I tried to say that the release manager has the implicit assumption that it manages everything that is deployed. An operator managing its own resources conflicts with this idea in my view. A middle ground may be that the operator potentially provides an option to specify resource labels. Still, the documentation that would be produced by the RM would not reflect what was actually deployed ...
Describe the bug
The following command for labeling the components fails because non-ods related resources exist in the namespace.
Command
oc label --overwrite all -l app=myproject-my-app -n myproject-dev app.kubernetes.io/name=my-app app.kubernetes.io/instance- app.kubernetes.io/part-of- app.kubernetes.io/managed-by=tailor app.openshift.io/runtime-version- helm.sh/chart- app.opendevstack.org/project=myproject
Error
Error from server (Forbidden): nopermission.some.resource.com is forbidden: User "system:serviceaccount:myproject-cd:jenkins" cannot list resource "nopermission" in API group "some.resource.com" in the namespace "myproject-dev"
To Reproduce
Steps to reproduce the behavior:
Expected behavior
ODS would only try to label ODS related components and would not fail if there exist some resources where jenkins doesn't have permission.
Affected version (please complete the following information):