angelbarrera92 closed this issue 6 years ago.
@angelbarrera92 could you share the yml file?
Sure!
apiVersion: v1
kind: Template
labels:
  template: zookeeper
metadata:
  annotations:
    description: Zookeeper Deployment and Runtime Components
    iconClass: icon-java
    tags: java,zookeeper
  name: zookeeper
objects:
- apiVersion: v1
  kind: Service
  metadata:
    labels:
      application: ${APPLICATION_NAME}
    name: ${APPLICATION_NAME}-headless
  spec:
    clusterIP: None
    portalIP: None
    ports:
    - name: server
      port: 2888
      protocol: TCP
      targetPort: 2888
    - name: leader-election
      port: 3888
      protocol: TCP
      targetPort: 3888
    selector:
      application: ${APPLICATION_NAME}
    sessionAffinity: None
    type: ClusterIP
- apiVersion: v1
  kind: Service
  metadata:
    labels:
      application: ${APPLICATION_NAME}
    name: ${APPLICATION_NAME}
  spec:
    ports:
    - name: client
      port: 2181
      protocol: TCP
      targetPort: 2181
    selector:
      application: ${APPLICATION_NAME}
    sessionAffinity: None
    type: ClusterIP
- apiVersion: v1
  data:
    init.sh: |-
      #!/bin/bash
      set -x
      [ -z "$ID_OFFSET" ] && ID_OFFSET=1
      export ZOOKEEPER_SERVER_ID=$((${HOSTNAME##*-} + $ID_OFFSET))
      export PROJECT=$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace)
      export ZOOKEEPER_SERVERS="${APPLICATION_NAME}-0.${APPLICATION_NAME}-headless.${PROJECT}.svc:2888:3888;${APPLICATION_NAME}-1.${APPLICATION_NAME}-headless.${PROJECT}.svc:2888:3888;${APPLICATION_NAME}-2.${APPLICATION_NAME}-headless.${PROJECT}.svc:2888:3888"
      /etc/confluent/docker/run
    client_port: '2181'
    tick_time: '2000'
    init_limit: '5'
    sync_limit: '2'
  kind: ConfigMap
  metadata:
    labels:
      application: ${APPLICATION_NAME}
    name: ${APPLICATION_NAME}-config
- apiVersion: apps/v1beta1
  kind: StatefulSet
  metadata:
    labels:
      application: ${APPLICATION_NAME}
    name: ${APPLICATION_NAME}
  spec:
    replicas: 3
    selector:
      matchLabels:
        application: ${APPLICATION_NAME}
    serviceName: ${APPLICATION_NAME}-headless
    template:
      metadata:
        creationTimestamp: null
        labels:
          application: ${APPLICATION_NAME}
      spec:
        containers:
        - command: ['/bin/bash', '/etc/scripts/init.sh']
          env:
          - name: APPLICATION_NAME
            value: ${APPLICATION_NAME}
          - name: ZOOKEEPER_CLIENT_PORT
            valueFrom:
              configMapKeyRef:
                key: client_port
                name: ${APPLICATION_NAME}-config
          - name: ZOOKEEPER_TICK_TIME
            valueFrom:
              configMapKeyRef:
                key: tick_time
                name: ${APPLICATION_NAME}-config
          - name: ZOOKEEPER_INIT_LIMIT
            valueFrom:
              configMapKeyRef:
                key: init_limit
                name: ${APPLICATION_NAME}-config
          - name: ZOOKEEPER_SYNC_LIMIT
            valueFrom:
              configMapKeyRef:
                key: sync_limit
                name: ${APPLICATION_NAME}-config
          image: docker-registry.default.svc:5000/confluent/cp-zookeeper:${ZOOKEEPER_VERSION}
          imagePullPolicy: Always
          name: ${APPLICATION_NAME}
          ports:
          - containerPort: 2181
            name: client
            protocol: TCP
          - containerPort: 2888
            name: server
            protocol: TCP
          - containerPort: 3888
            name: leader-election
            protocol: TCP
          resources:
            requests:
              cpu: 256m
              memory: 512Mi
          terminationMessagePath: /dev/termination-log
          volumeMounts:
          - mountPath: /var/lib/zookeeper/data
            name: datadir
          - mountPath: /etc/scripts
            name: config
        volumes:
        - name: config
          configMap:
            name: ${APPLICATION_NAME}-config
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        terminationGracePeriodSeconds: 30
    volumeClaimTemplates:
    - metadata:
        labels:
          application: ${APPLICATION_NAME}
        name: datadir
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
parameters:
- description: The name for the application.
  name: APPLICATION_NAME
  required: true
  value: zookeeper
- description: Zookeeper ImageStream Tag
  name: ZOOKEEPER_VERSION
  required: true
  value: 4.0.0
oc apply -f template.yml
oc new-app --template=zookeeper --param=APPLICATION_NAME=ox-zookeeper --param=ZOOKEEPER_VERSION=4.1.0
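For context, the init.sh in the ConfigMap derives each node's server ID from the StatefulSet pod hostname ordinal. A minimal sketch of that derivation (the hostname zookeeper-2 below is only an example value, not something produced by the template):

#!/bin/bash
# Sketch: how init.sh turns the pod hostname into ZOOKEEPER_SERVER_ID.
HOSTNAME=zookeeper-2                      # example StatefulSet pod name
[ -z "$ID_OFFSET" ] && ID_OFFSET=1
# ${HOSTNAME##*-} strips everything up to the last "-", leaving the ordinal "2",
# so the resulting server ID is 2 + 1 = 3.
export ZOOKEEPER_SERVER_ID=$((${HOSTNAME##*-} + $ID_OFFSET))
echo "$ZOOKEEPER_SERVER_ID"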
Thanks in advance @gAmUssA
I am also facing the same issue with version 4.1.0 (latest). The 4.0.0 version works perfectly.
Has the error been identified? Thanks!
The error is that the myid file is owned by the root user. If I run my DeploymentConfig as root it works fine, but that is not a good thing to do.
So... I imagine that the myid file should be generated with different permissions. Am I right? Will there be a fix branch?
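In case it helps to reproduce, the ownership can be checked from a running pod with something like this (the pod name ox-zookeeper-0 is only an example, taken from the oc new-app invocation earlier):

# Show the arbitrary UID assigned to the pod and the ownership of the data dirs.
oc rsh ox-zookeeper-0 id
oc rsh ox-zookeeper-0 ls -ln /var/lib/zookeeper /var/lib/zookeeper/data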
I am also facing the same issue. I am not a core member of the Confluent development team. Let's see if the core development team resolves this issue.
Ouch, ok @sagarising. We hope to have news soon, we will see...
+1
We've done a little more digging into this problem.
It seems that since version 4.1.0 a couple of new users (cp-kafka and cp-kafka-connect) are created. Also, the directory /var/lib/{COMPONENT} is created with the user cp-kafka, which belongs to the confluent group, and has permissions 750.
In previous versions (<4.1.0) this directory belonged to the root user with 777 permissions.
So when you start these containers in OpenShift, they are started by a random user belonging to the root group. This scenario causes the container to fail in OpenShift.
To reproduce the problem you only have to start the containers with an arbitrary non-root user.
Old images:
docker run -d \
--net=host \
--name=zk-1 \
-e ZOOKEEPER_SERVER_ID=1 \
-e ZOOKEEPER_CLIENT_PORT=2181 \
-e ZOOKEEPER_INIT_LIMIT=5 \
-u 10003 \
confluentinc/cp-zookeeper:4.0.2-1
New images:
docker run -d \
--net=host \
--name=zk-1 \
-e ZOOKEEPER_SERVER_ID=1 \
-e ZOOKEEPER_CLIENT_PORT=2181 \
-e ZOOKEEPER_INIT_LIMIT=5 \
-u 10003 \
confluentinc/cp-zookeeper:4.1.0
The old images work, while the new ones don't.
Evidence: New images:
$ docker run -it --rm confluentinc/cp-zookeeper:4.1.0 /bin/bash
root@a4026c13fa79:/# ls -lrta /var/lib/zookeeper/
total 16
drwxr-xr-x 1 root root 4096 Apr 16 22:59 ..
drwxrwxrwx 2 root root 4096 Apr 16 22:59 log
drwxrwxrwx 2 root root 4096 Apr 16 22:59 data
drwxr-x--- 4 cp-kafka confluent 4096 Apr 16 22:59 .
root@a4026c13fa79:/# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
systemd-timesync:x:100:103:systemd Time Synchronization,,,:/run/systemd:/bin/false
systemd-network:x:101:104:systemd Network Management,,,:/run/systemd/netif:/bin/false
systemd-resolve:x:102:105:systemd Resolver,,,:/run/systemd/resolve:/bin/false
systemd-bus-proxy:x:103:106:systemd Bus Proxy,,,:/run/systemd:/bin/false
cp-kafka:x:104:108::/var/empty:/bin/false
cp-kafka-connect:x:105:108::/var/empty:/bin/false
root@a4026c13fa79:/#
Old image:
$ docker run -it --rm confluentinc/cp-zookeeper:4.0.2-1 /bin/bash
root@63672c5e1f90:/# ls -lrta /var/lib/zookeeper/
total 16
drwxrwxrwx 2 root root 4096 Jul 17 23:21 log
drwxrwxrwx 2 root root 4096 Jul 17 23:21 data
drwxr-xr-x 1 root root 4096 Jul 17 23:21 ..
drwxr-xr-x 4 root root 4096 Jul 17 23:21 .
root@63672c5e1f90:/# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
systemd-timesync:x:100:103:systemd Time Synchronization,,,:/run/systemd:/bin/false
systemd-network:x:101:104:systemd Network Management,,,:/run/systemd/netif:/bin/false
systemd-resolve:x:102:105:systemd Resolver,,,:/run/systemd/resolve:/bin/false
systemd-bus-proxy:x:103:106:systemd Bus Proxy,,,:/run/systemd:/bin/false
So... now... will it change?
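In the meantime, a minimal way to see the difference without starting ZooKeeper itself, plus a possible interim workaround, is sketched below. This is only an assumption on our side, not an official fix, and UID 10003 is arbitrary, mirroring the reproduction above:

# Minimal check: a non-root UID outside the confluent group cannot even traverse
# /var/lib/zookeeper on 4.1.0, because that directory is mode 750.
docker run --rm -u 10003 confluentinc/cp-zookeeper:4.1.0 \
  bash -c 'touch /var/lib/zookeeper/data/myid && echo writable || echo not writable'

# Possible interim workaround (assumption, not an official fix): in a derived image,
# as root, hand the tree to GID 0 and make it group-writable, since OpenShift's
# arbitrary UIDs always belong to the root group:
#   chgrp -R 0 /var/lib/zookeeper && chmod -R g=u /var/lib/zookeeper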
@angelbarrera92 @sagarising @kmxillo We have updated the ownership of those directories to be the same as before -- by root -- in 4.1.2. Can you try that and let me know how it goes? Thx.
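A quick way to verify, assuming the 4.1.2 tag is published on Docker Hub (UID 10003 is arbitrary, matching the earlier reproduction), would be something like:

docker run --rm -u 10003 confluentinc/cp-zookeeper:4.1.2 \
  bash -c 'ls -ld /var/lib/zookeeper && touch /var/lib/zookeeper/data/myid && echo writable'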
Hi
We're trying to deploy ZooKeeper on OpenShift. Version 4.0.0 works perfectly for us, but after upgrading to version 4.1.0 it fails to start.
Paste the log output:
With version 4.0.0 we have no problems. Is there anything different between these versions that might affect the boot of the containers?
Best regards!