turbonomic / t8c-install


Enabling kubeturbo in the XL instance yaml does not seem to work & even prevents other probes from deploying on 8.3.5 #19

Closed — fckbo closed this 2 years ago

fckbo commented 2 years ago

Hello, as indicated in the title, I'm struggling when trying to enable kubeturbo on my Turbonomic 8.3.5 installation.

Actions performed with success

1) I deployed the Turbonomic Platform Operator from the Operator Hub on an OpenShift 4.6.44 cluster.
2) I then created an XL instance with OpenShift Ingress enabled.
3) The install succeeded; I then entered a license in Turbo.
4) I updated the XL yaml, adding:

  vcenter:
    enabled: true

5) The mediation-vcenter & mediation-vcenterbrowsing pods were installed and I could add a VCenter target in the Turbo UI.
6) I updated the XL yaml to add the azure & aws probes and, as in the VCenter case, new pods were deployed & I can see the new target categories in the Turbo UI (see the CR sketch below).
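
For reference, here is a minimal sketch of what the resulting XL custom resource might look like at this point. The kind, apiVersion, name, and namespace come from the operator log later in this thread; the exact probe keys (e.g. openshiftingress) are assumptions based on the fragments above and may differ by chart version.

apiVersion: charts.helm.k8s.io/v1alpha1
kind: Xl
metadata:
  name: xl-release
  namespace: turbonomic
spec:
  # OpenShift Ingress enabled at instance creation (step 2) -- key name assumed
  openshiftingress:
    enabled: true
  # probes added in steps 4 and 6
  vcenter:
    enabled: true
  aws:
    enabled: true
  azure:
    enabled: true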

Problem/Actions that did not work

When trying to install the Kubeturbo probe using the following statement in the XL yaml

  kubeturbo:
    enabled: true

If added at step 2) above, no pods would deploy and the overall Turbo install would not succeed; if added at step 4) or step 6) above, no additional probes would be deployed at all.

P.S. I'm wondering if I should install the kubeturbo operator first & separately to get this to work, but I see nothing in the documentation mandating this for the Turbo probe install to succeed?

Context:

oc version
Client Version: 4.9.0
Server Version: 4.6.44
Kubernetes Version: v1.19.0+4c3480d
oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.44    True        False         15d     Cluster version is 4.6.44
oc get nodes
NAME                            STATUS   ROLES    AGE   VERSION
master1.hmcmp02.nca.ihost.com   Ready    master   15d   v1.19.0+4c3480d
master2.hmcmp02.nca.ihost.com   Ready    master   15d   v1.19.0+4c3480d
master3.hmcmp02.nca.ihost.com   Ready    master   15d   v1.19.0+4c3480d
worker1.hmcmp02.nca.ihost.com   Ready    worker   15d   v1.19.0+4c3480d
worker2.hmcmp02.nca.ihost.com   Ready    worker   15d   v1.19.0+4c3480d
worker3.hmcmp02.nca.ihost.com   Ready    worker   15d   v1.19.0+4c3480d

I'm attaching the log of the Turbonomic operator to this issue: t8c-pb-kubeturbo.log

Thx

esara commented 2 years ago

@fckbo please use the kubeturbo certified operator to deploy kubeturbo in your openshift cluster https://catalog.redhat.com/software/operators/detail/5e9872763f398525a0ceafb2
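
For anyone landing here later, a minimal sketch of the custom resource the certified kubeturbo operator consumes might look like the following. The field names (serverMeta, restAPIConfig, targetConfig) follow the operator's published samples, but the server URL, credentials, and target name below are placeholders, and the exact schema may vary by version.

apiVersion: charts.helm.k8s.io/v1
kind: Kubeturbo
metadata:
  name: kubeturbo-release
  namespace: kubeturbo
spec:
  serverMeta:
    version: "8.3"                            # major.minor of the Turbonomic server
    turboServer: https://<turbonomic-server>  # placeholder: your central server URL
  restAPIConfig:
    opsManagerUserName: <username>            # placeholder credentials
    opsManagerPassword: <password>
  targetConfig:
    targetName: <cluster-name>                # how this cluster shows up as a target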

fckbo commented 2 years ago

@esara Hello, I saw that you closed the issue. I must say I'm a bit surprised, given that:

  1. Based on the description above, installing the Turbonomic Platform Operator with a documented option:

     kubeturbo:
       enabled: true

     clearly leads to a failure.

  2. There is nothing in the documentation (at least that I could find) saying that, if you want to install a central Turbonomic platform (which is what I'm doing in this case) and then configure all sorts of probes, including one targeting OpenShift clusters, you need to install kubeturbo on the central platform as well. BTW, that central platform will have VCenter, AWS, and Azure targets configured as well as OpenShift clusters, so kubeturbo will be installed on those managed clusters. Actually, I have already installed kubeturbo on the first managed cluster that I intend to target from this central platform...

So may I ask if you could confirm:

1) Do I need to install kubeturbo on the underlying OpenShift cluster on which I'm deploying the central Turbonomic platform in order for the kubeturbo enabled option to work, and then later be able to configure managed clusters as targets (as I had mentioned in my Post Scriptum above)?

2) If that's the case, could you point me to the part of the documentation that mentions this? In case there is none, I think it might help others to update the documentation and/or to show a clear error message indicating that the XL creation fails because of this.

Thx

fckbo commented 2 years ago

Hello, FYI I installed the Kubeturbo operator on the cluster on which I deployed the 'central' Turbonomic server; I deployed it in the 'kubeturbo' namespace, while the Turbonomic Platform Operator was deployed in the 'turbonomic' namespace. I got the following error in the Turbonomic operator log after adding this to the xl-release yaml:

kubeturbo:
  enabled: true

oc get pods
NAME                                         READY   STATUS    RESTARTS   AGE
action-orchestrator-6986ff5476-vjgrp         1/1     Running   0          3d15h
api-5bd95c4c57-dh9ht                         1/1     Running   0          3d15h
auth-758fb8f4-2zqtm                          1/1     Running   1          3d15h
clustermgr-68ffc9bb74-bxdz6                  1/1     Running   0          3d15h
consul-64df86769-kkfzc                       1/1     Running   0          3d15h
cost-5f45bdf948-txb7h                        1/1     Running   0          3d15h
db-5ff77c5496-t9v52                          1/1     Running   0          3d15h
group-68755cf6f8-dzmnm                       1/1     Running   0          3d15h
history-6bd7f4c9bd-5n7gb                     1/1     Running   0          3d15h
kafka-68c79b4d46-bcvm8                       1/1     Running   1          3d15h
market-676f9db4-qnz9d                        1/1     Running   0          3d15h
mediation-aws-b77dff87b-tx54c                1/1     Running   0          3d13h
mediation-awsbilling-5c985994cb-rwl25        1/1     Running   0          3d13h
mediation-awscost-776795bc7d-vg8mn           1/1     Running   0          3d13h
mediation-azure-7f7c8d74d7-j82zw             1/1     Running   0          3d13h
mediation-azurecost-7c96df4999-9ch7g         1/1     Running   0          3d13h
mediation-azureea-f98497ff7-w44kj            1/1     Running   0          3d13h
mediation-azuresp-6fbc5c5755-9pkvq           1/1     Running   0          3d13h
mediation-azurevolumes-84454f8dc9-82nfn      1/1     Running   0          3d13h
mediation-vcenter-df96c55f7-44672            1/1     Running   0          3d13h
mediation-vcenterbrowsing-697f9bbb45-gjmgl   1/1     Running   0          3d13h
nginx-5675cd7d97-mk5qq                       1/1     Running   0          3d15h
plan-orchestrator-f79dbf5bd-mg7cv            1/1     Running   0          3d15h
repository-669cdc7469-h8q2h                  1/1     Running   0          3d15h
rsyslog-6865fb958d-x6h8z                     1/1     Running   0          3d15h
t8c-operator-64f86cbffb-l5mq4                1/1     Running   0          3d15h
topology-processor-77c57d55d9-gfwmp          1/1     Running   0          3d15h
ui-795f49499c-9pqxg                          1/1     Running   0          3d15h
zookeeper-9f777c987-2g7qx                    1/1     Running   0          3d15h

oc logs t8c-operator-64f86cbffb-l5mq4

I1108 13:56:00.052998       1 request.go:665] Waited for 1.043060728s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/hive.openshift.io/v1?timeout=32s
{"level":"error","ts":1636379763.995823,"logger":"helm.controller","msg":"Failed to sync release","namespace":"turbonomic","name":"xl-release","apiVersion":"charts.helm.k8s.io/v1alpha1","kind":"Xl","release":"xl-release","error":"failed to get candidate release: rendered manifests contain a resource that already exists. Unable to continue with update: could not get information about the resource: clusterrolebindings.rbac.authorization.k8s.io \"turbo-all-binding\" is forbidden: User \"system:serviceaccount:turbonomic:t8c-operator\" cannot get resource \"clusterrolebindings\" in API group \"rbac.authorization.k8s.io\" at the cluster scope","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":1636379763.9989088,"logger":"helm.controller","msg":"Failed to update status after sync release failure","namespace":"turbonomic","name":"xl-release","apiVersion":"charts.helm.k8s.io/v1alpha1","kind":"Xl","release":"xl-release","error":"xls.charts.helm.k8s.io \"xl-release\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":1636379763.9990005,"logger":"controller.xl-controller","msg":"Reconciler error","name":"xl-release","namespace":"turbonomic","error":"failed to get candidate release: rendered manifests contain a resource that already exists. Unable to continue with update: could not get information about the resource: clusterrolebindings.rbac.authorization.k8s.io \"turbo-all-binding\" is forbidden: User \"system:serviceaccount:turbonomic:t8c-operator\" cannot get resource \"clusterrolebindings\" in API group \"rbac.authorization.k8s.io\" at the cluster scope","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227"}

FYI here are the rolebindings in the namespaces mentioned above:

oc get rolebinding -n turbonomic                                                  
NAME                                          ROLE                                               AGE
admin                                         ClusterRole/admin                                  3d15h
system:deployers                              ClusterRole/system:deployer                        3d15h
system:image-builders                         ClusterRole/system:image-builder                   3d15h
system:image-pullers                          ClusterRole/system:image-puller                    3d15h
t8c-operator                                  Role/t8c-operator                                  3d15h
t8c-operator.v42.3.0-t8c-operator-5557dc4cc   Role/t8c-operator.v42.3.0-t8c-operator-5557dc4cc   3d15h
oc get rolebinding -n kubeturbo 
NAME                                                      ROLE                                                           AGE
kubeturbo-operator.v8.3.5-kubeturbo-operator-5c97498c78   Role/kubeturbo-operator.v8.3.5-kubeturbo-operator-5c97498c78   70m
system:deployers                                          ClusterRole/system:deployer                                    86m
system:image-builders                                     ClusterRole/system:image-builder                               86m
system:image-pullers                                      ClusterRole/system:image-puller                                86m
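
The error above says the t8c-operator service account has only the namespace-scoped Roles listed here, while rendering the kubeturbo sub-chart requires reading the cluster-scoped clusterrolebinding "turbo-all-binding". Purely as an illustration of the missing grant (not the recommended fix; the supported path is the separate kubeturbo operator, as noted below), a hypothetical cluster-scope grant would look roughly like this; the exact verbs the chart needs are an assumption.

# Hypothetical sketch only; service account name and namespace taken from the error message above.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: t8c-operator-crb-access
rules:
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["clusterrolebindings"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: t8c-operator-crb-access
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: t8c-operator-crb-access
subjects:
- kind: ServiceAccount
  name: t8c-operator
  namespace: turbonomic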
esara commented 2 years ago

@fckbo please remove the kubeturbo option from the xls (platform) cr - you are now managing kubeturbo using the kubeturbo-operator

fckbo commented 2 years ago

@endre, Thx a lot for the short session today. As a summary:

  1. The following option is not really needed on the cluster on which the Turbonomic Platform Operator is installed, unless that cluster should also be monitored in Turbonomic:

     kubeturbo:
       enabled: true

  2. Installing the Kubeturbo operator on the cluster on which the Turbonomic Platform Operator is installed is likewise not needed, unless that cluster should also be monitored in Turbonomic.

  3. The Kubeturbo option does not show up like the VCenter or public cloud ones when clicking the "NEW TARGET" button in the Settings -> Target Configurations Turbonomic UI.

Bottom line: thanks to your hints, I removed the Kubeturbo operator from the central platform cluster & removed the kubeturbo: enabled: true option from the xl-release yaml, as I initially did not want to manage this cluster from Turbonomic. I could then connect another managed cluster after installing the Kubeturbo operator on it & creating an instance properly configured to connect to my central Turbonomic server.

fckbo commented 2 years ago

To finish on this, I checked that I could indeed also deploy KubeTurbo from the Operator Hub on the central cluster where the Turbonomic Platform Operator was installed, and configure it to point to the Turbonomic server, without using the option:

kubeturbo:
  enabled: true

So thx again for your support

mrulke commented 2 years ago

What is the use case for having kubeturbo in the t8c-operator?

esara commented 2 years ago

only for internal demos, not for production