sitewhere / sitewhere-k8s

SiteWhere / Kubernetes integration including Helm Charts

k8s install sitewhere permission denied #36

Open hdpingshao opened 5 years ago

hdpingshao commented 5 years ago

Describe the bug My kafka, zookeeper and mongodb pods cannot start and go into CrashLoopBackOff. The error is: mkdir: cannot create directory '/var/lib/zookeeper/data': Permission denied. Could you tell me what I can do? Thank you!

Provide Information (screenshots)

hdpingshao commented 5 years ago

The branch is sitewhere-k8s-0.1.4, using dev-values.yaml.

jorgevillaverde-sitewhere commented 5 years ago

Hi @hdpingshao,

Can you provide some extra information so that we have some context of what is going on? Thanks

Describe the bug A clear and concise description of what the bug is.

Provide Information Output of helm version:

Output of kubectl version:

Output of helm ls sitewhere

Cloud Provider/Platform (AKS, GKE, Minikube etc.):

SiteWhere App Version:

SiteWhere Chart Version:

To Reproduce Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior A clear and concise description of what you expected to happen.

Additional context Add any other context about the problem here.

hdpingshao commented 5 years ago

Additional context

[root@localhost ~]# helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
[root@localhost ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:54:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:43:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
[root@localhost ~]#

the branch is sitewhere-k8s-0.1.4

And I installed with helm:

helm install --name sitewhere -f dev-values.yaml --set infra.mosquitto.service.type=NodePort --set services.web_rest.service.http.type=NodePort sitewhere/sitewhere

My k8s cluster is deployed from binaries, and the PVCs were created manually.

My install steps are:

  1. git clone -b sitewhere-k8s-0.1.4 https://github.com/sitewhere/sitewhere-k8s.git
  2. cd sitewhere-k8s/charts/sitewhere
  3. helm install --name sitewhere -f dev-values.yaml --set infra.mosquitto.service.type=NodePort --set services.web_rest.service.http.type=NodePort sitewhere/sitewhere

thank you!

jorgevillaverde-sitewhere commented 5 years ago

Thanks for the information. I notice that you are using the sitewhere-k8s-0.1.4 branch but you are installing from the sitewhere repository. Are you using this branch for anything in particular? If you need to install a specific version of sitewhere (e.g. 0.1.4), you need to issue the following command:

helm install --name sitewhere \
-f dev-values.yaml \
--set infra.mosquitto.service.type=NodePort \
--set services.web_rest.service.http.type=NodePort \
--version 0.1.4 \
sitewhere/sitewhere

Also, can you provide the output of:

kubectl get pvc -l release=sitewhere

Thanks

hdpingshao commented 5 years ago

Additional context First, I want to use the 0.1.4 branch for the test; the other branches/environments take up too many resources and have more errors. Also, my k8s cluster does not support dynamic PVC provisioning, so I created the PVCs manually.

[root@localhost ~]# kubectl get pvc -o wide
NAME                                    STATUS   VOLUME                                  CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-sitewhere-zookeeper-0              Bound    data-sitewhere-zookeeper-0              10Gi       RWX                           47h
data-sitewhere-zookeeper-1              Bound    data-sitewhere-zookeeper-1              10Gi       RWX                           42h
data-sitewhere-zookeeper-2              Bound    data-sitewhere-zookeeper-2              10Gi       RWX                           42h
datadir-sitewhere-consul-0              Bound    datadir-sitewhere-kafka-0               10Gi       RWX                           47h
datadir-sitewhere-consul-1              Bound    datadir-sitewhere-consul-1              10Gi       RWX                           46h
datadir-sitewhere-consul-2              Bound    datadir-sitewhere-consul-2              10Gi       RWX                           46h
datadir-sitewhere-kafka-0               Bound    datadir-sitewhere-mongodb-primary-0     10Gi       RWX                           47h
datadir-sitewhere-kafka-1               Bound    datadir-sitewhere-kafka-2               10Gi       RWX                           42h
datadir-sitewhere-kafka-2               Bound    datadir-sitewhere-kafka-1               10Gi       RWX                           42h
datadir-sitewhere-mongodb-primary-0     Bound    datadir-sitewhere-consul-0              10Gi       RWX                           47h
datadir-sitewhere-mongodb-secondary-0   Bound    datadir-sitewhere-mongodb-secondary-0   10Gi       RWX                           47h
sitewhere-mongodb                       Bound    nfs-pv                                  5Gi        RWX                           40h
[root@localhost ~]#

hdpingshao commented 5 years ago

Additional context

helm install --name sitewhere -f dev-values.yaml --set infra.mosquitto.service.type=NodePort --set services.web_rest.service.http.type=NodePort --version 0.1.4 sitewhere/sitewhere

I used that command to install sitewhere and it returned the same error:

[root@localhost sitewhere]# kubectl logs pods sitewhere-zookeeper-0
Error from server (NotFound): pods "pods" not found
[root@localhost sitewhere]# kubectl logs sitewhere-zookeeper-0
+ zkGenConfig.sh
Validating environment
ZK_REPLICAS=1
MY_ID=1
ZK_LOG_LEVEL=INFO
ZK_DATA_DIR=/var/lib/zookeeper/data
ZK_DATA_LOG_DIR=/var/lib/zookeeper/log
ZK_LOG_DIR=/var/log/zookeeper
ZK_CLIENT_PORT=2181
ZK_SERVER_PORT=2888
ZK_ELECTION_PORT=3888
ZK_TICK_TIME=2000
ZK_INIT_LIMIT=5
ZK_SYNC_LIMIT=10
ZK_MAX_CLIENT_CNXNS=60
ZK_MIN_SESSION_TIMEOUT=4000
ZK_MAX_SESSION_TIMEOUT=40000
ZK_HEAP_SIZE=1G
ZK_SNAP_RETAIN_COUNT=3
ZK_PURGE_INTERVAL=0
ENSEMBLE
server.1=sitewhere-zookeeper-0.sitewhere-zookeeper-headless.default.svc.cluster.local.:2888:3888
Environment validation successful
Creating ZooKeeper configuration
Wrote ZooKeeper configuration file to /opt/zookeeper/conf/zoo.cfg
Creating ZooKeeper log4j configuration
Wrote log4j configuration to /opt/zookeeper/conf/log4j.properties
Creating ZooKeeper data directories and setting permissions
mkdir: cannot create directory '/var/lib/zookeeper/data': Permission denied
chown: cannot access '/var/lib/zookeeper/data': No such file or directory
mkdir: cannot create directory '/var/lib/zookeeper/log': Permission denied
chown: cannot access '/var/lib/zookeeper/log': No such file or directory
Created ZooKeeper data directories and set permissions in /var/lib/zookeeper/data
/usr/bin/zkGenConfig.sh: line 130: /var/lib/zookeeper/data/myid: No such file or directory
Creating JVM configuration file
Wrote JVM configuration to /opt/zookeeper/conf/java.env
+ exec zkServer.sh start-foreground
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 1099
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to true
Using config: /usr/bin/../etc/zookeeper/zoo.cfg
mkdir: cannot create directory '/var/lib/zookeeper/data': Permission denied
2019-03-29 01:41:08,691 [myid:] - INFO  [main:QuorumPeerConfig@134] - Reading configuration from: /usr/bin/../etc/zookeeper/zoo.cfg
2019-03-29 01:41:08,694 [myid:] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2019-03-29 01:41:08,695 [myid:] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2019-03-29 01:41:08,695 [myid:] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2019-03-29 01:41:08,695 [myid:] - WARN  [main:QuorumPeerMain@113] - Either no config or no quorum defined in config, running  in standalone mode
2019-03-29 01:41:08,696 [myid:] - INFO  [main:QuorumPeerConfig@134] - Reading configuration from: /usr/bin/../etc/zookeeper/zoo.cfg
2019-03-29 01:41:08,696 [myid:] - INFO  [main:ZooKeeperServerMain@96] - Starting server
2019-03-29 01:41:08,701 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2019-03-29 01:41:08,702 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=sitewhere-zookeeper-0.sitewhere-zookeeper-headless.default.svc.cluster.local.
2019-03-29 01:41:08,702 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.8.0_131
2019-03-29 01:41:08,702 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Oracle Corporation
2019-03-29 01:41:08,702 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2019-03-29 01:41:08,702 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper:
2019-03-29 01:41:08,702 [myid:] - INFO  [main:Environment@100] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2019-03-29 01:41:08,702 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/tmp
2019-03-29 01:41:08,702 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
2019-03-29 01:41:08,703 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Linux
2019-03-29 01:41:08,703 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=amd64
2019-03-29 01:41:08,703 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=3.10.0-862.el7.x86_64
2019-03-29 01:41:08,703 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=zookeeper
2019-03-29 01:41:08,703 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=/home/zookeeper
2019-03-29 01:41:08,703 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/
2019-03-29 01:41:08,715 [myid:] - ERROR [main:ZooKeeperServerMain@64] - Unexpected exception, exiting abnormally
java.io.IOException: Unable to create data directory /var/lib/zookeeper/log/version-2
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:85)
        at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:110)
        at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:87)
        at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:53)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)

jorgevillaverde-sitewhere commented 5 years ago

Hi @hdpingshao, what's your Cloud Provider/Platform?

Is it minikube? If so, I found this issue and this other one related to Kafka and minikube.

Please let me know, I'll keep digging into this issue.

hdpingshao commented 5 years ago

Additional context I manually deployed the kubernetes cluster in binary mode on a local physical server. I used the following documentation to deploy the kubernetes cluster: https://github.com/hdpingshao/ops/blob/master/kubernetes/docs/%E4%BA%8C%E3%80%81k8s%E7%94%9F%E4%BA%A7%E7%BA%A7%E9%9B%86%E7%BE%A4%E9%83%A8%E7%BD%B2.md

[root@localhost ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
etcd-0               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"}   
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
[root@localhost ~]# kubectl get nodes
NAME              STATUS   ROLES    AGE    VERSION
192.168.200.117   Ready    <none>   128d   v1.12.2
192.168.200.118   Ready    <none>   128d   v1.12.2
192.168.200.128   Ready    <none>   85d    v1.12.2
192.168.200.129   Ready    <none>   85d    v1.12.2
192.168.200.133   Ready    <none>   85d    v1.12.2
192.168.200.136   Ready    <none>   85d    v1.12.2
[root@localhost ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:54:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:43:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

jorgevillaverde-sitewhere commented 5 years ago

There seems to be a problem with the PVCs that ZooKeeper and Mongo are using. For some reason ZK and Mongo do not have permissions over their Persistent Volume Claims. What Storage Class are you using? Can you check that the pods mounting these PVCs have RW permissions?
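A common workaround when a pre-provisioned volume is owned by root but the container runs as a non-root user (as the "Permission denied" from the ZooKeeper entrypoint suggests) is to fix ownership before the main container starts. This is a hypothetical sketch, not part of the SiteWhere chart: the UID/GID of 1000 and the volume/mount names are assumptions that should be checked against the actual ZooKeeper image and StatefulSet.

```yaml
# Hypothetical patch fragment for the ZooKeeper StatefulSet pod template:
# chown the mounted volume in an init container so the non-root ZooKeeper
# process can create /var/lib/zookeeper/data and /var/lib/zookeeper/log.
spec:
  template:
    spec:
      securityContext:
        fsGroup: 1000              # group ownership, where the volume plugin supports it
      initContainers:
        - name: fix-permissions
          image: busybox
          command: ["sh", "-c", "chown -R 1000:1000 /var/lib/zookeeper"]
          volumeMounts:
            - name: data           # assumed volume name from volumeClaimTemplates
              mountPath: /var/lib/zookeeper
```

The initContainer approach works regardless of volume plugin; fsGroup alone may be enough on plugins that honor it.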

hdpingshao commented 5 years ago

Additional context

I created the PVCs manually through yaml files, for example:

[root@localhost pvc]# cat pvc-one.yaml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-sitewhere-zookeeper-0
  namespace: default
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
[root@localhost pvc]# kubectl describe pvc data-sitewhere-zookeeper-0
Name:          data-sitewhere-zookeeper-0
Namespace:     default
StorageClass:  
Status:        Bound
Volume:        data-sitewhere-zookeeper-0
Labels:        <none>
Annotations:   kubectl.kubernetes.io/last-applied-configuration:
                 {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"data-sitewhere-zookeeper-0","namespace":"default"},...
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWX
Events:        <none>
Mounted By:    sitewhere-zookeeper-0
[root@localhost pvc]# kubectl describe pv data-sitewhere-zookeeper-0
Name:            data-sitewhere-zookeeper-0
Labels:          <none>
Annotations:     kubectl.kubernetes.io/last-applied-configuration:
                   {"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"data-sitewhere-zookeeper-0"},"spec":{"accessModes":["Rea...
                 pv.kubernetes.io/bound-by-controller: yes
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    
Status:          Bound
Claim:           default/data-sitewhere-zookeeper-0
Reclaim Policy:  Retain
Access Modes:    RWX
Capacity:        10Gi
Node Affinity:   <none>
Message:         
Source:
    Type:           Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
    EndpointsName:  glusterfs-cluster
    Path:           data-sitewhere-zookeeper-0
    ReadOnly:       false
Events:             <none>
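One more detail worth noting: the pvc-one.yaml above sets neither storageClassName nor volumeName, so Kubernetes is free to bind each claim to any available pre-created PV of sufficient size, which matches the crossed claim/volume bindings visible in the earlier kubectl get pvc output (e.g. datadir-sitewhere-consul-0 bound to volume datadir-sitewhere-kafka-0). A sketch of pinning a claim to its intended volume; the names come from the output above, and the explicit volumeName/storageClassName lines are the additions:

```yaml
# Hypothetical revision of pvc-one.yaml: pin the claim to one specific
# pre-created PV so manually provisioned volumes cannot cross-bind.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-sitewhere-zookeeper-0
  namespace: default
spec:
  volumeName: data-sitewhere-zookeeper-0   # bind only to this PV
  storageClassName: ""                     # empty string: no dynamic provisioning
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```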