Closed alberttwong closed 1 year ago
This may be the answer. https://access.redhat.com/solutions/6973378
oc edit roles/starrocks-leader-election-role
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"Role","metadata":{"annotations":{},"name":"starrocks-leader-election-role","namespace":"starrocks"},"rules":[{"apiGroups":[""],"resources":["configmaps"],"verbs":["get","list","watch","create","update","patch","delete"]},{"apiGroups":["coordination.k8s.io"],"resources":["leases"],"verbs":["get","list","watch","create","update","patch","delete"]},{"apiGroups":[""],"resources":["events"],"verbs":["create","patch"]}]}
  creationTimestamp: "2023-04-28T22:21:11Z"
  name: starrocks-leader-election-role
  namespace: starrocks
  resourceVersion: "41334"
  uid: 0cbad80c-20b7-4ca0-8351-cea2cb632c81
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
It has the correct verbs.
Since I created a new service account, starrocks-sa, I applied the following YAML to bind the role to it:
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ' starrocks-leader-election-role'
  namespace: starrocks
  uid: 460e759c-5f1b-4c16-b3ac-3a146e5d100e
  resourceVersion: '65355'
  creationTimestamp: '2023-04-28T23:39:43Z'
  managedFields:
    - manager: Mozilla
      operation: Update
      apiVersion: rbac.authorization.k8s.io/v1
      time: '2023-04-28T23:39:43Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:roleRef': {}
        'f:subjects': {}
subjects:
  - kind: ServiceAccount
    name: starrocks-sa
    namespace: starrocks
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: starrocks-leader-election-role
Now I'm getting a different error related to cluster roles.
I0428 23:42:03.698913 1 request.go:601] Waited for 1.04014614s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/rbac.authorization.k8s.io/v1?timeout=32s
1.6827253250015438e+09 INFO controller-runtime.metrics Metrics server is starting to listen {"addr": ":8080"}
1.682725325002093e+09 INFO setup starting manager
I0428 23:42:05.002386 1 leaderelection.go:248] attempting to acquire leader lease starrocks/c6c79638.starrocks.com...
1.6827253250023947e+09 INFO Starting server {"kind": "health probe", "addr": "[::]:8081"}
1.682725325002437e+09 INFO Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
I0428 23:42:22.353634 1 leaderelection.go:258] successfully acquired lease starrocks/c6c79638.starrocks.com
1.6827253423537867e+09 INFO Starting EventSource {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster", "source": "kind source: *v1.StarRocksCluster"}
1.682725342353757e+09 DEBUG events starrocks-controller-cf78b5cb-lz6hp_708fb4fa-55ee-4c23-852a-bff29f983aed became leader {"type": "Normal", "object": {"kind":"Lease","namespace":"starrocks","name":"c6c79638.starrocks.com","uid":"6ccb58cd-43c3-4e8e-a7ae-9a273c98b91c","apiVersion":"coordination.k8s.io/v1","resourceVersion":"66212"}, "reason": "LeaderElection"}
1.682725342353829e+09 INFO Starting EventSource {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster", "source": "kind source: *v1.StatefulSet"}
1.6827253423538373e+09 INFO Starting EventSource {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster", "source": "kind source: *v1.Service"}
1.6827253423538406e+09 INFO Starting Controller {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster"}
W0428 23:42:22.355509 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.0/tools/cache/reflector.go:169: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:starrocks:starrocks-sa" cannot list resource "services" in API group "" at the cluster scope
W0428 23:42:22.355513 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.0/tools/cache/reflector.go:169: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:starrocks:starrocks-sa" cannot list resource "statefulsets" in API group "apps" at the cluster scope
E0428 23:42:22.355554 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.0/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:starrocks:starrocks-sa" cannot list resource "services" in API group "" at the cluster scope
E0428 23:42:22.355556 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.0/tools/cache/reflector.go:169: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:starrocks:starrocks-sa" cannot list resource "statefulsets" in API group "apps" at the cluster scope
W0428 23:42:22.355944 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.0/tools/cache/reflector.go:169: failed to list *v1.StarRocksCluster: starrocksclusters.starrocks.com is forbidden: User "system:serviceaccount:starrocks:starrocks-sa" cannot list resource "starrocksclusters" in API group "starrocks.com" at the cluster scope
E0428 23:42:22.355965 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.0/tools/cache/reflector.go:169: Failed to watch *v1.StarRocksCluster: failed to list *v1.StarRocksCluster: starrocksclusters.starrocks.com is forbidden: User "system:serviceaccount:starrocks:starrocks-sa" cannot list resource "starrocksclusters" in API group "starrocks.com" at the cluster scope
W0428 23:42:23.273141 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.0/tools/cache/reflector.go:169: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:starrocks:starrocks-sa" cannot list resource "services" in API group "" at the cluster scope
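The forbidden errors above all say "at the cluster scope", so a namespaced Role and RoleBinding cannot satisfy them; the service account needs a ClusterRole bound through a ClusterRoleBinding. As a rough sketch of the minimum read access those three errors imply (the name below is hypothetical, and the operator's own shipped ClusterRole will need more verbs than this, since it also creates and updates these objects):

```yaml
# Sketch only: covers just the list/watch calls that fail in the log above.
# The metadata.name is a made-up placeholder, not a resource from the operator's manifests.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: starrocks-operator-read-sketch
rules:
  - apiGroups: [""]               # core API group, for Services
    resources: ["services"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["statefulsets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["starrocks.com"]
    resources: ["starrocksclusters"]
    verbs: ["get", "list", "watch"]
```

In practice it is probably simpler to bind the ClusterRole that the operator already installs than to maintain a separate one like this.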
Since I created a new service account, starrocks-sa, I applied the following YAML to bind the cluster role to it:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: starrocks-manager
  managedFields:
    - manager: Mozilla
      operation: Update
      apiVersion: rbac.authorization.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:roleRef': {}
        'f:subjects': {}
subjects:
  - kind: ServiceAccount
    name: starrocks-sa
    namespace: starrocks
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: starrocks-manager
Now I'm getting the following error.
I0428 23:46:56.066560 1 request.go:601] Waited for 1.028188158s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/k8s.cni.cncf.io/v1?timeout=32s
1.6827256173696938e+09 INFO controller-runtime.metrics Metrics server is starting to listen {"addr": ":8080"}
1.6827256173699167e+09 INFO setup starting manager
I0428 23:46:57.370185 1 leaderelection.go:248] attempting to acquire leader lease starrocks/c6c79638.starrocks.com...
1.6827256173701913e+09 INFO Starting server {"kind": "health probe", "addr": "[::]:8081"}
1.6827256173702197e+09 INFO Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
I0428 23:47:12.663146 1 leaderelection.go:258] successfully acquired lease starrocks/c6c79638.starrocks.com
1.6827256326632936e+09 INFO Starting EventSource {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster", "source": "kind source: *v1.StarRocksCluster"}
1.6827256326633294e+09 INFO Starting EventSource {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster", "source": "kind source: *v1.StatefulSet"}
1.6827256326633346e+09 INFO Starting EventSource {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster", "source": "kind source: *v1.Service"}
1.6827256326633377e+09 INFO Starting Controller {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster"}
1.6827256326632776e+09 DEBUG events starrocks-controller-cf78b5cb-4zxmp_12b9357e-b47c-4624-a44e-e81b2d2ae64d became leader {"type": "Normal", "object": {"kind":"Lease","namespace":"starrocks","name":"c6c79638.starrocks.com","uid":"6ccb58cd-43c3-4e8e-a7ae-9a273c98b91c","apiVersion":"coordination.k8s.io/v1","resourceVersion":"67848"}, "reason": "LeaderElection"}
1.6827256327643864e+09 INFO Starting workers {"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster", "worker count": 1}
I0428 23:47:12.764483 1 starrockscluster_controller.go:85] StarRocksClusterReconciler reconcile the update crd name starrockscluster-sample namespace starrocks
I0428 23:47:12.765009 1 statefulset.go:135] the statefulset name starrockscluster-sample-fe new hash value 3203758280 old have value 3203758280
I0428 23:47:12.765041 1 k8sutils.go:77] ApplyStatefulSEt Sync exist statefulset name=starrockscluster-sample-fe, namespace=starrocks, equals to new statefuslet.
I0428 23:47:12.765066 1 k8sutils.go:52] CreateOrUpdateService service Name, Ports, Selector, ServiceType, Labels have not change namespace starrocks name starrockscluster-sample-fe-search
I0428 23:47:12.765085 1 k8sutils.go:52] CreateOrUpdateService service Name, Ports, Selector, ServiceType, Labels have not change namespace starrocks name starrockscluster-sample-fe-service
I0428 23:47:12.966937 1 be_controller.go:156] BeController UpdateStatus the statefulset name=starrockscluster-sample-be is not found.
So what is the problem with the operator CRD YAMLs now? I don't understand your point. You can talk to Kevin.cai.
The StatefulSet has permission issues.
starrocks 23s Warning FailedCreate statefulset/starrockscluster-sample-fe create Pod starrockscluster-sample-fe-0 in StatefulSet starrockscluster-sample-fe failed error: pods "starrockscluster-sample-fe-0" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{1000}: 1000 is not an allowed group, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
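Reading the event above: every SCC except restricted is rejected as "not usable by user or serviceaccount", and restricted fails because the pod asks for fsGroup 1000, which is outside the range the namespace allows. SCCs are checked against the service account the pod runs as, and the FE pod template in this thread does not set serviceAccountName, so the pods would run as the namespace's default service account rather than starrocks-sa. One possible workaround, sketched below under that assumption, is to let that account use an SCC that does not pin fsGroup. The binding name is hypothetical, and I am assuming the cluster has the system:openshift:scc:anyuid ClusterRole, which recent OpenShift 4.x releases create.

```yaml
# Sketch only: grants "use" of the anyuid SCC inside the starrocks namespace.
# Assumption: the FE pods run as the "default" service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: starrocks-fe-anyuid-sketch      # hypothetical name
  namespace: starrocks
subjects:
  - kind: ServiceAccount
    name: default                       # assumption, see note above
    namespace: starrocks
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:anyuid     # assumed to exist on this cluster
```

Granting a broader SCC weakens the namespace's isolation, so this is something to check with whoever administers the cluster.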
Trying to change the service account:
atwong@Alberts-MBP ~ % oc set sa deploy starrockscluster-sample-fe starrocks-sa
Error from server (NotFound): deployments.apps "starrockscluster-sample-fe" not found
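The NotFound here is expected: `deploy` asks for a Deployment, but the operator creates the FE as a StatefulSet. The field that `oc set sa` edits is the pod template's serviceAccountName, so the intended change would look roughly like the fragment below. This is only a sketch: the operator owns this StatefulSet and may overwrite manual edits when it reconciles, and nothing in this thread shows whether the StarRocksCluster resource exposes a setting for it.

```yaml
# Fragment of the StatefulSet, not a complete manifest.
# Shows only the field that selects which service account the FE pods run as.
spec:
  template:
    spec:
      serviceAccountName: starrocks-sa   # assumption: the account created earlier in this thread
```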
I don't see anything odd in the StatefulSet YAML:
kind: StatefulSet
apiVersion: apps/v1
metadata:
annotations:
app.starrocks.components/hash: '3203758280'
resourceVersion: '100795'
name: starrockscluster-sample-fe
uid: 8994ad02-97ed-45df-ac74-a2d541dfcdd8
creationTimestamp: '2023-05-05T23:50:02Z'
generation: 1
managedFields:
- manager: sroperator
operation: Update
apiVersion: apps/v1
time: '2023-05-05T23:50:02Z'
fieldsType: FieldsV1
fieldsV1:
'f:metadata':
'f:annotations':
.: {}
'f:app.starrocks.components/hash': {}
'f:finalizers':
.: {}
'v:"starrocks.com.statefulset/protection"': {}
'f:labels':
.: {}
'f:app.kubernetes.io/component': {}
'f:app.starrocks.ownerreference/name': {}
'f:ownerReferences':
.: {}
'k:{"uid":"7b92fb2d-cd76-4336-ad1e-9c6afc4d6ba1"}': {}
'f:spec':
'f:podManagementPolicy': {}
'f:replicas': {}
'f:revisionHistoryLimit': {}
'f:selector': {}
'f:serviceName': {}
'f:template':
'f:metadata':
'f:labels':
.: {}
'f:app.kubernetes.io/component': {}
'f:app.starrocks.ownerreference/name': {}
'f:name': {}
'f:namespace': {}
'f:spec':
'f:containers':
'k:{"name":"fe"}':
'f:image': {}
'f:startupProbe':
.: {}
'f:failureThreshold': {}
'f:periodSeconds': {}
'f:successThreshold': {}
'f:tcpSocket':
.: {}
'f:port': {}
'f:timeoutSeconds': {}
'f:volumeMounts':
.: {}
'k:{"mountPath":"/opt/starrocks/fe/log"}':
.: {}
'f:mountPath': {}
'f:name': {}
'k:{"mountPath":"/opt/starrocks/fe/meta"}':
.: {}
'f:mountPath': {}
'f:name': {}
'f:terminationMessagePolicy': {}
.: {}
'f:resources':
.: {}
'f:requests':
.: {}
'f:cpu': {}
'f:memory': {}
'f:args': {}
'f:lifecycle':
.: {}
'f:preStop':
.: {}
'f:exec':
.: {}
'f:command': {}
'f:command': {}
'f:livenessProbe':
.: {}
'f:failureThreshold': {}
'f:periodSeconds': {}
'f:successThreshold': {}
'f:tcpSocket':
.: {}
'f:port': {}
'f:timeoutSeconds': {}
'f:env':
'k:{"name":"HOST_TYPE"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"POD_IP"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef': {}
'k:{"name":"FE_SERVICE_NAME"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"POD_NAME"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef': {}
.: {}
'k:{"name":"USER"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"HOST_IP"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef': {}
'k:{"name":"COMPONENT_NAME"}':
.: {}
'f:name': {}
'f:value': {}
'k:{"name":"POD_NAMESPACE"}':
.: {}
'f:name': {}
'f:valueFrom':
.: {}
'f:fieldRef': {}
'f:readinessProbe':
.: {}
'f:failureThreshold': {}
'f:periodSeconds': {}
'f:successThreshold': {}
'f:tcpSocket':
.: {}
'f:port': {}
'f:timeoutSeconds': {}
'f:terminationMessagePath': {}
'f:imagePullPolicy': {}
'f:ports':
.: {}
'k:{"containerPort":8030,"protocol":"TCP"}':
.: {}
'f:containerPort': {}
'f:name': {}
'f:protocol': {}
'k:{"containerPort":9020,"protocol":"TCP"}':
.: {}
'f:containerPort': {}
'f:name': {}
'f:protocol': {}
'k:{"containerPort":9030,"protocol":"TCP"}':
.: {}
'f:containerPort': {}
'f:name': {}
'f:protocol': {}
'f:name': {}
'f:dnsPolicy': {}
'f:restartPolicy': {}
'f:schedulerName': {}
'f:securityContext':
.: {}
'f:fsGroup': {}
'f:fsGroupChangePolicy': {}
'f:terminationGracePeriodSeconds': {}
'f:volumes':
.: {}
'k:{"name":"fe-log"}':
.: {}
'f:emptyDir': {}
'f:name': {}
'k:{"name":"fe-meta"}':
.: {}
'f:emptyDir': {}
'f:name': {}
'f:updateStrategy':
'f:rollingUpdate':
.: {}
'f:partition': {}
'f:type': {}
namespace: starrocks
ownerReferences:
- apiVersion: starrocks.com/v1
kind: StarRocksCluster
name: starrockscluster-sample
uid: 7b92fb2d-cd76-4336-ad1e-9c6afc4d6ba1
controller: true
blockOwnerDeletion: true
finalizers:
- starrocks.com.statefulset/protection
labels:
app.kubernetes.io/component: fe
app.starrocks.ownerreference/name: starrockscluster-sample
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/component: fe
      app.starrocks.ownerreference/name: starrockscluster-sample-fe
  template:
    metadata:
      name: starrockscluster-sample-fe
      namespace: starrocks
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: fe
        app.starrocks.ownerreference/name: starrockscluster-sample-fe
    spec:
      volumes:
        - name: fe-meta
          emptyDir: {}
        - name: fe-log
          emptyDir: {}
      containers:
        - resources:
            requests:
              cpu: '4'
              memory: 16Gi
          readinessProbe:
            tcpSocket:
              port: 9030
            timeoutSeconds: 1
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          lifecycle:
            preStop:
              exec:
                command:
                  - /opt/starrocks/fe_prestop.sh
          name: fe
          command:
            - /opt/starrocks/fe_entrypoint.sh
          livenessProbe:
            tcpSocket:
              port: 9020
            timeoutSeconds: 1
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: COMPONENT_NAME
              value: fe
            - name: FE_SERVICE_NAME
              value: starrockscluster-sample-fe-service.starrocks
            - name: POD_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.podIP
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.hostIP
            - name: HOST_TYPE
              value: FQDN
            - name: USER
              value: root
          ports:
            - name: http-port
              containerPort: 8030
              protocol: TCP
            - name: rpc-port
              containerPort: 9020
              protocol: TCP
            - name: query-port
              containerPort: 9030
              protocol: TCP
          imagePullPolicy: IfNotPresent
          startupProbe:
            tcpSocket:
              port: 8030
            timeoutSeconds: 1
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 60
          volumeMounts:
            - name: fe-meta
              mountPath: /opt/starrocks/fe/meta
            - name: fe-log
              mountPath: /opt/starrocks/fe/log
          terminationMessagePolicy: File
          image: 'starrocks/fe-ubuntu:2.5.4'
          args:
            - $(FE_SERVICE_NAME)
      restartPolicy: Always
      terminationGracePeriodSeconds: 120
      dnsPolicy: ClusterFirst
      securityContext:
        fsGroup: 1000
        fsGroupChangePolicy: OnRootMismatch
      schedulerName: default-scheduler
  serviceName: starrockscluster-sample-fe-search
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0
  revisionHistoryLimit: 10
status:
  replicas: 0
atwong@Alberts-MBP ~ % oc adm policy add-scc-to-user privileged -z starrocks-sa
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "starrocks-sa"
@alberttwong did you get through this now?
@dengliu was able to bring up the StarRocks cluster on OpenShift with a few workarounds. Check the following issue for details: https://github.com/StarRocks/starrocks-kubernetes-operator/issues/120
@kevincai I haven't been able to get it running with any of the instructions. @dengliu actually isn't using the operator but is doing a Helm chart install.
Please wait for our next release. We will refine the finalizer design so you won't run into so much trouble.
I tried a bunch of different methods. The only way I could get it to work was with the Helm chart. I couldn't get the operator to work on OpenShift.
How did we get here? https://github.com/StarRocks/starrocks/discussions/22767
executed
Now I get this error in the starrocks-controller pod: