stardog-union / helm-charts

Stardog Helm Charts
Apache License 2.0

Not a cluster #55

Closed MrMunki closed 2 years ago

MrMunki commented 2 years ago

Hi, I've been trying to configure this using Helm, but the clustering does not work. I've reinstalled several times now, and it never creates a cluster. The ZooKeeper instances set themselves up, but running the stardog-admin cluster info command responds with "Not Found!"

This is the generated config file:

logging.audit.type=text
pack.enabled=true
pack.zookeeper.address=stardog-zookeeper-0.stardog-zookeeper-headless.stardog-kube:2181,stardog-zookeeper-1.stardog-zookeeper-headless.stardog-kube:2181,stardog-zookeeper-2.stardog-zookeeper-headless.stardog-kube:2181
pack.node.join.retry.count=15
pack.node.join.retry.delay=1m

If I check the logs, one of the nodes always says sh: bad number for the zk start node, and the log output is a single line:

+ /opt/stardog/bin/stardog-admin server start --foreground --port 5820 --home /var/opt/stardog/

Is that normal?

Any help appreciated

pdmars commented 2 years ago

I suspect ZooKeeper isn't starting correctly. Stardog Cluster won't be able to start until ZooKeeper is ready and working.

Can you post the output for kubectl -n <your-namespace> get pods as well as kubectl -n <your-namespace> describe pod <zk-pod> for each of the failing ZK pods?

MrMunki commented 2 years ago

I can see ZooKeeper started and ready. I've even deployed ZooKeeper first and then created the StatefulSet for the Stardog instances, but it doesn't work.

pdmars commented 2 years ago

Was ZooKeeper deployed as part of the chart or separately? Which version of ZooKeeper are you running? Can you post the charts you're using, any custom values files (except any secrets), and the commands you're running?

MrMunki commented 2 years ago

The chart I'm using is the one in this repo (develop branch, 2.0.2). My understanding is that it is supposed to deploy ZooKeeper, but I was unable to get it to run without adding the chart to the top-level config. I've added it as a dependency in my top-level Chart.yaml and applied the config below:

  - name: zookeeper
    version: 5.5.1
    repository: https://charts.bitnami.com/bitnami
    condition: zookeeper.enabled

The values config is below:

# Default values for stardog-deployment.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

stardog:
  service:
    type: LoadBalancer
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-internal: "true"
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"  

  persistence:
    storageClass: gp2
    size: 5Gi
  securityContext:
    enabled: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000

  cluster:
  # Start Stardog as a cluster
    enabled: true

  image:
    pullPolicy: IfNotPresent

  replicaCount: 3

zookeeper:
  enabled: true
  replicaCount: 3
  podDisruptionBudget:
    maxUnavailable: 1
  persistence:
    enabled: true
    storageClass: gp2
    size: 5Gi
  resources:
    requests:
      memory: 2Gi
      cpu: 1
  image:
    repository: bitnami/zookeeper
    tag: 3.5.7
    pullPolicy: IfNotPresent
  logLevel: INFO

If I exec into the stardog pods I'm able to nc to the zookeeper addresses and the ruok command of the zookeeper cluster says it's null (as in it's ok)

pdmars commented 2 years ago

You should not have to add it to the Chart.yaml. What's the error you get when you deploy the Stardog chart as-is without any modifications?

MrMunki commented 2 years ago

ZooKeeper just doesn't download when running helm dependency update, and nothing gets deployed for it (it's as if it doesn't exist).

MrMunki commented 2 years ago

I've run this as a top level chart now and it does download zookeeper, but it still does not run as a cluster

pdmars commented 2 years ago

Could you please provide the commands and errors you are getting?

MrMunki commented 2 years ago

I log onto one of the stardog pods

kubectl exec -it -n stardog-kube stardog-stardog-1 -- bash

and I run

bash-4.2$ /opt/stardog/bin/stardog-admin cluster info

Which returns

Not Found!

I don't get any errors, they just don't seem to talk to each other. The zookeeper pods are available via nc on the specified ports from the stardog pods.

What I do see is there is a variable STARDOG_PROPERTIES=/etc/stardog-conf/stardog.properties

but this does not seem to get copied to the home folder, which I think it should?

pdmars commented 2 years ago

Stardog looks for it in the location specified by that environment variable, so it doesn't need to be in home.

What is the output of kubectl logs and kubectl describe for a failing Stardog pod?
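
A minimal sketch for gathering that output in one pass. The function name `collect_pod_diagnostics` and the `KUBECTL` override are illustrative and not part of the chart; the namespace and pod names in the usage comment are the ones that appear in this thread.

```shell
# Hypothetical helper: collect the diagnostics requested above.
# KUBECTL can be overridden (for example with a wrapper); it defaults to kubectl.
KUBECTL=${KUBECTL:-kubectl}

collect_pod_diagnostics() {
  ns=$1
  shift
  # Overall pod status for the namespace
  "$KUBECTL" -n "$ns" get pods
  for pod in "$@"; do
    # Events, init container state and mounts for each pod
    "$KUBECTL" -n "$ns" describe pod "$pod"
    # Log of the pod's container
    "$KUBECTL" -n "$ns" logs "$pod"
  done
}

# Example (names taken from this thread):
# collect_pod_diagnostics stardog-kube stardog-stardog-0 stardog-stardog-1 > diagnostics.txt
```

Redirecting to a file makes it easier to redact node names and IPs before posting.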

MrMunki commented 2 years ago

The pods run, and I get the output below. They just don't seem to join a cluster, or even attempt to do so. I did try copying the file to the home folder anyway, but that didn't help.

+ /opt/stardog/bin/stardog-admin server start --foreground --port 5820 --home /var/opt/stardog/
{"instant":{"epochSecond":1631799509,"nanoOfSecond":640000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.virtual.DefaultVirtualGraphRegistry","message":"Initializing virtual graph registry","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}
{"instant":{"epochSecond":1631799509,"nanoOfSecond":734000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.virtual.DefaultVirtualGraphRegistry","message":"Loaded virtual graph registry with 0 entries","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}
{"instant":{"epochSecond":1631799510,"nanoOfSecond":974000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.StardogKernel","message":"Initializing Stardog","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}
************************************************************
This copy of Stardog is licensed to redacted
This is a Enterprise license
This license will expire in 355 days on Tue Sep 06 15:57:28 UTC 2022
************************************************************
                                                             :;   
                                      ;;                   `;`:   
  `'+',    ::                        `++                    `;:`  
 +###++,  ,#+                        `++                    .     
 ##+.,',  '#+                         ++                     +    
,##      ####++  ####+:   ##,++` .###+++   .####+    ####++++#    
`##+     ####+'  ##+#++   ###++``###'+++  `###'+++  ###`,++,:     
 ####+    ##+        ++.  ##:   ###  `++  ###  `++` ##`  ++:      
  ###++,  ##+        ++,  ##`   ##;  `++  ##:   ++; ##,  ++:      
    ;+++  ##+    ####++,  ##`   ##:  `++  ##:   ++' ;##'#++       
     ;++  ##+   ###  ++,  ##`   ##'  `++  ##;   ++:  ####+        
,.   +++  ##+   ##:  ++,  ##`   ###  `++  ###  .++  '#;           
,####++'  +##++ ###+#+++` ##`   :####+++  `####++'  ;####++`      
`####+;    ##++  ###+,++` ##`    ;###:++   `###+;   `###++++      
                                                    ##   `++      
                                                   .##   ;++      
                                                    #####++`      
                                                     `;;;.        
************************************************************
Stardog server 7.7.2 started on Thu Sep 16 13:38:31 UTC 2021.
Stardog server is listening on all network interfaces.
HTTP server available at http://localhost:5820.
STARDOG=/opt/stardog/bin/..
STARDOG_HOME=/var/opt/stardog/
{"instant":{"epochSecond":1631799511,"nanoOfSecond":479000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.cli.impl.ServerStart","message":"Memory options","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}
{"instant":{"epochSecond":1631799511,"nanoOfSecond":479000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.cli.impl.ServerStart","message":"Memory mode: DEFAULT{Starrocks.dict_block_cache=10, Starrocks.block_cache=20, Native.starrocks=70, Heap.dict_value=50, Starrocks.txn_block_cache=5, Heap.dict_index=50, Starrocks.untracked_memory=20, Starrocks.memtable=40, Starrocks.buffer_pool=5, Native.query=30}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}
{"instant":{"epochSecond":1631799511,"nanoOfSecond":480000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.cli.impl.ServerStart","message":"Min Heap Size: 2.0G","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}
{"instant":{"epochSecond":1631799511,"nanoOfSecond":480000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.cli.impl.ServerStart","message":"Max Heap Size: 1.9G","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}
{"instant":{"epochSecond":1631799511,"nanoOfSecond":481000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.cli.impl.ServerStart","message":"Max Direct Mem: 1.0G","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}
{"instant":{"epochSecond":1631799511,"nanoOfSecond":482000000},"thread":"main","level":"INFO","loggerName":"com.complexible.stardog.cli.impl.ServerStart","message":"System Memory: 59G","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}

Output of kubectl describe for the pod:

Namespace:    stardog-kube
Priority:     0
Node:         redacted
Start Time:   Thu, 16 Sep 2021 14:38:08 +0100
Labels:       app=stardog-stardog
              app.kubernetes.io/component=server
              app.kubernetes.io/instance=stardog
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=stardog
              app.kubernetes.io/version=latest
              controller-revision-hash=stardog-stardog-5c4b44bd7c
              helm.sh/chart=stardog-2.0.2
              statefulset.kubernetes.io/pod-name=stardog-stardog-0
Annotations:  kubernetes.io/psp: eks.privileged
Status:       Running
IP:           redacted
IPs:
  IP:           redacted
Controlled By:  StatefulSet/stardog-stardog
Init Containers:
  wait-for-zk:
    Container ID:  docker://9a33d714b2751d1efb8402b8f6e26bb39d6a6a4f37122899e63c61d9304cd2e1
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:52f73a0a43a16cf37cd0720c90887ce972fe60ee06a687ee71fb93a7ca601df7
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c

      while :
       do
         echo "Checking for ZK followers"
         ZK_MNTR=$(echo mntr | nc stardog-zookeeper-headless.stardog-kube 2181)
         ZK_FOLLOWERS=$(echo "${ZK_MNTR}" | grep zk_synced_followers | awk '{print $2}')
         echo "Currently ${ZK_FOLLOWERS} ZK followers"
         if [[ "${ZK_FOLLOWERS}" -gt "1" ]]; then
           echo "ZK has two sync'd followers (with the leader that makes 3)"
           exit 0
         fi
         sleep 1
       done

    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 16 Sep 2021 14:38:20 +0100
      Finished:     Thu, 16 Sep 2021 14:38:26 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zrp68 (ro)
Containers:
  stardog-stardog:
    Container ID:  docker://1da71638a8df786814b19bcd19ff6309368a32dff2cc55ed9a99672fdff1af3f
    Image:         stardog/stardog:latest
    Image ID:      docker-pullable://stardog/stardog@sha256:d4b7f776e21902b741e4360c373aca04e03600bb7da483c1937189ba66d5f1e1
    Port:          5820/TCP
    Host Port:     0/TCP
    Command:
      /bin/sh
      -c
      set -ex

      function wait_for_start {
          (
          HOST=${1}
          PORT=${2}
          DELAY=${3}
          # Wait for stardog to be running
          RC=1
          COUNT=0
          set +e
          while [[ ${RC} -ne 0 ]];
          do
            if [[ ${COUNT} -gt ${DELAY} ]]; then
                return 1;
            fi
            COUNT=$(expr 1 + ${COUNT} )
            sleep 1
            curl -v  http://${HOST}:${PORT}/admin/healthcheck
            RC=$?
          done

          return 0
          )
      }

      function change_pw {
          (
          set +ex
          HOST=${1}
          PORT=${2}

          echo "/opt/stardog/bin/stardog-admin --server http://${HOST}:${PORT} user passwd -N xxxxxxxxxxxxxx"
          NEW_PW=$(cat /etc/stardog-password/adminpw)
          /opt/stardog/bin/stardog-admin --server http://${HOST}:${PORT} user passwd -N ${NEW_PW}
          if [[ $? -eq 0 ]];
          then
            echo "Password successfully changed"
            return 0
          else
            curl --fail -u admin:${NEW_PW} http://${HOST}:${PORT}/admin/status
            RC=$?
            if [[ $RC -eq 0 ]];
            then
              echo "Default password was already changed"
              return 0
            elif [[ $RC -eq 22 ]]
            then
              echo "HTTP 4xx error"
              return $RC
            else
              echo "Something else went wrong"
              return $RC
            fi
          fi
          )
      }
      cp -f ${STARDOG_PROPERTIES} ${STARDOG_HOME}
      /opt/stardog/bin/stardog-admin server start --foreground --port ${PORT} --home ${STARDOG_HOME}

    State:          Running
      Started:      Thu, 16 Sep 2021 14:38:27 +0100
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      2
      memory:   4Gi
    Liveness:   http-get http://:server/admin/alive delay=30s timeout=15s period=30s #success=1 #failure=3
    Readiness:  http-get http://:server/admin/healthcheck delay=15s timeout=3s period=5s #success=1 #failure=3
    Environment:
      PORT:                      5820
      STARDOG_HOME:              /var/opt/stardog/
      STARDOG_LICENSE_PATH:      /etc/stardog-license/stardog-license-key.bin
      STARDOG_PROPERTIES:        /etc/stardog-conf/stardog.properties
      STARDOG_SERVER_JAVA_ARGS:  -XX:ActiveProcessorCount=2  -Djava.io.tmpdir=/tmp -Xmx2g -Xms2g -XX:MaxDirectMemorySize=1g
    Mounts:
      /etc/stardog-conf/stardog.properties from stardog-stardog-properties-vol (rw,path="stardog.properties")
      /etc/stardog-license from stardog-license (ro)
      /etc/stardog-password from stardog-stardog-password (ro)
      /var/opt/stardog/ from data (rw)
      /var/opt/stardog/log4j2.xml from stardog-stardog-log4j-vol (rw,path="log4j2.xml")
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zrp68 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-stardog-stardog-0
    ReadOnly:   false
  stardog-license:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  stardog-license
    Optional:    false
  stardog-stardog-properties-vol:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      stardog-stardog-properties
    Optional:  false
  stardog-stardog-log4j-vol:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      stardog-stardog-log4j
    Optional:  false
  stardog-stardog-password:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  stardog-stardog-password
    Optional:    false
  default-token-zrp68:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zrp68
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason                  Age    From                     Message
  ----    ------                  ----   ----                     -------
  Normal  Scheduled               5m8s   default-scheduler        Successfully assigned stardog-kube/stardog-stardog-0 to redacted
  Normal  SuccessfulAttachVolume  5m2s   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-11b3c3a8-723c-44fd-b26c-2f57ed340f28"
  Normal  Pulling                 4m58s  kubelet                  Pulling image "busybox"
  Normal  Pulled                  4m57s  kubelet                  Successfully pulled image "busybox"
  Normal  Created                 4m56s  kubelet                  Created container wait-for-zk
  Normal  Started                 4m56s  kubelet                  Started container wait-for-zk
  Normal  Pulled                  4m49s  kubelet                  Container image "stardog/stardog:latest" already present on machine
  Normal  Created                 4m49s  kubelet                  Created container stardog-stardog
  Normal  Started                 4m49s  kubelet                  Started container stardog-stardog
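
On the `sh: bad number` message reported at the start of this thread: a likely source, assuming the `wait-for-zk` script in the describe output above is what runs, is the numeric test on `ZK_FOLLOWERS`. ZooKeeper's `mntr` command reports `zk_synced_followers` only on the leader, so when the headless service name resolves to a follower the variable is empty, and busybox `sh` rejects an empty operand to `-gt`. The loop simply retries, and the init container above did exit 0, so this looks like noise rather than the reason no cluster forms. A sketch of a parse that defaults to 0 (the function names are hypothetical, not the chart's):

```shell
# Print the zk_synced_followers value from `mntr` output on stdin,
# or 0 if the key is missing (as it is on a follower).
synced_followers() {
  awk '$1 == "zk_synced_followers" { n = $2 } END { print n + 0 }'
}

# Succeed when more than one follower is synced, mirroring the "-gt 1" check above.
zk_has_synced_followers() {
  [ "$(synced_followers)" -gt 1 ]
}

# Example, using the service name from this thread:
# echo mntr | nc stardog-zookeeper-headless.stardog-kube 2181 | zk_has_synced_followers
```
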

pdmars commented 2 years ago

What is in /var/opt/stardog/stardog.log?

Assuming that doesn't provide any clues as to why it's not creating a cluster, can you try with a completely clean slate (no modifications), installing the chart directly as a top-level chart by running just these commands: https://github.com/stardog-union/helm-charts#installing

I just ran those against an EKS cluster and it worked fine with the following values file:

tmpDir: /var/opt/stardog/
resources:
  requests:
    cpu: 2 
    memory: 4Gi
persistence:
  storageClass: gp2
  size: 32Gi

javaArgs: "-Xmx2g -Xms2g -XX:MaxDirectMemorySize=1g"

image:
  pullPolicy: Always

securityContext:
  enabled: true
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000

zookeeper:
  resources:
    requests:
      cpu: 1
      memory: 2Gi
  persistence:
    storageClass: gp2
    size: 10Gi

MrMunki commented 2 years ago

I've destroyed and recreated this more than 5 times, including deleting the volumes. It just seems that it doesn't even attempt to create a cluster.

The helm chart runs and everything seems to work, but it does not create a cluster. It just creates 3 individual Stardog servers.

The only differences between my config and the one you've sent me are the load balancer annotations, the tmp directory and the storage size. Everything else is the same.

Can you show me what the output of the info command is, please?

/opt/stardog/bin/stardog-admin cluster info

cat /var/opt/stardog/stardog.log

INFO  2021-09-16 13:13:49,071 [main] com.complexible.stardog.plan.cache.SimplePlanCache:clear(137): Clearing 0 cached plans
INFO  2021-09-16 13:13:49,172 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(495): Initializing virtual graph registry
INFO  2021-09-16 13:13:49,268 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(519): Loaded virtual graph registry with 0 entries
INFO  2021-09-16 13:13:50,577 [main] com.complexible.stardog.StardogKernel:start(2537): Initializing Stardog
INFO  2021-09-16 13:13:51,091 [main] com.complexible.stardog.cli.impl.ServerStart:call(263): Memory options
INFO  2021-09-16 13:13:51,091 [main] com.complexible.stardog.cli.impl.ServerStart:call(264): Memory mode: DEFAULT{Starrocks.dict_block_cache=10, Starrocks.block_cache=20, Native.starrocks=70, Heap.dict_value=50, Starrocks.txn_block_cache=5, Heap.dict_index=50, Starrocks.untracked_memory=20, Starrocks.memtable=40, Starrocks.buffer_pool=5, Native.query=30}
INFO  2021-09-16 13:13:51,092 [main] com.complexible.stardog.cli.impl.ServerStart:call(265): Min Heap Size: 2.0G
INFO  2021-09-16 13:13:51,092 [main] com.complexible.stardog.cli.impl.ServerStart:call(266): Max Heap Size: 1.9G
INFO  2021-09-16 13:13:51,093 [main] com.complexible.stardog.cli.impl.ServerStart:call(267): Max Direct Mem: 1.0G
INFO  2021-09-16 13:13:51,094 [main] com.complexible.stardog.cli.impl.ServerStart:call(268): System Memory: 59G
INFO  2021-09-16 13:37:39,200 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(495): Initializing virtual graph registry
INFO  2021-09-16 13:37:39,300 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(519): Loaded virtual graph registry with 0 entries
INFO  2021-09-16 13:37:40,588 [main] com.complexible.stardog.StardogKernel:start(2537): Initializing Stardog
INFO  2021-09-16 13:37:41,180 [main] com.complexible.stardog.cli.impl.ServerStart:call(263): Memory options
INFO  2021-09-16 13:37:41,180 [main] com.complexible.stardog.cli.impl.ServerStart:call(264): Memory mode: DEFAULT{Starrocks.dict_block_cache=10, Starrocks.block_cache=20, Native.starrocks=70, Heap.dict_value=50, Starrocks.txn_block_cache=5, Heap.dict_index=50, Starrocks.untracked_memory=20, Starrocks.memtable=40, Starrocks.buffer_pool=5, Native.query=30}
INFO  2021-09-16 13:37:41,181 [main] com.complexible.stardog.cli.impl.ServerStart:call(265): Min Heap Size: 2.0G
INFO  2021-09-16 13:37:41,181 [main] com.complexible.stardog.cli.impl.ServerStart:call(266): Max Heap Size: 1.9G
INFO  2021-09-16 13:37:41,182 [main] com.complexible.stardog.cli.impl.ServerStart:call(267): Max Direct Mem: 1.0G
INFO  2021-09-16 13:37:41,183 [main] com.complexible.stardog.cli.impl.ServerStart:call(268): System Memory: 59G
INFO  2021-09-16 13:55:17,633 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(495): Initializing virtual graph registry
INFO  2021-09-16 13:55:17,732 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(519): Loaded virtual graph registry with 0 entries
INFO  2021-09-16 13:55:19,038 [main] com.complexible.stardog.StardogKernel:start(2537): Initializing Stardog
INFO  2021-09-16 13:55:19,574 [main] com.complexible.stardog.cli.impl.ServerStart:call(263): Memory options
INFO  2021-09-16 13:55:19,575 [main] com.complexible.stardog.cli.impl.ServerStart:call(264): Memory mode: DEFAULT{Starrocks.dict_block_cache=10, Starrocks.block_cache=20, Native.starrocks=70, Heap.dict_value=50, Starrocks.txn_block_cache=5, Heap.dict_index=50, Starrocks.untracked_memory=20, Starrocks.memtable=40, Starrocks.buffer_pool=5, Native.query=30}
INFO  2021-09-16 13:55:19,575 [main] com.complexible.stardog.cli.impl.ServerStart:call(265): Min Heap Size: 2.0G
INFO  2021-09-16 13:55:19,576 [main] com.complexible.stardog.cli.impl.ServerStart:call(266): Max Heap Size: 1.9G
INFO  2021-09-16 13:55:19,576 [main] com.complexible.stardog.cli.impl.ServerStart:call(267): Max Direct Mem: 1.0G
INFO  2021-09-16 13:55:19,577 [main] com.complexible.stardog.cli.impl.ServerStart:call(268): System Memory: 59G
INFO  2021-09-16 14:08:08,257 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(495): Initializing virtual graph registry
INFO  2021-09-16 14:08:08,373 [main] com.complexible.stardog.virtual.DefaultVirtualGraphRegistry:syncCache(519): Loaded virtual graph registry with 0 entries
INFO  2021-09-16 14:08:09,772 [main] com.complexible.stardog.StardogKernel:start(2537): Initializing Stardog
INFO  2021-09-16 14:08:10,333 [main] com.complexible.stardog.cli.impl.ServerStart:call(263): Memory options
INFO  2021-09-16 14:08:10,333 [main] com.complexible.stardog.cli.impl.ServerStart:call(264): Memory mode: DEFAULT{Starrocks.dict_block_cache=10, Starrocks.block_cache=20, Native.starrocks=70, Heap.dict_value=50, Starrocks.txn_block_cache=5, Heap.dict_index=50, Starrocks.untracked_memory=20, Starrocks.memtable=40, Starrocks.buffer_pool=5, Native.query=30}
INFO  2021-09-16 14:08:10,334 [main] com.complexible.stardog.cli.impl.ServerStart:call(265): Min Heap Size: 2.0G
INFO  2021-09-16 14:08:10,335 [main] com.complexible.stardog.cli.impl.ServerStart:call(266): Max Heap Size: 1.9G
INFO  2021-09-16 14:08:10,335 [main] com.complexible.stardog.cli.impl.ServerStart:call(267): Max Direct Mem: 1.0G
INFO  2021-09-16 14:08:10,336 [main] com.complexible.stardog.cli.impl.ServerStart:call(268): System Memory: 59G

pdmars commented 2 years ago

Yes, when I run the install commands I linked, with the values file I provided, it creates a cluster:

bash-4.2$ /opt/stardog/bin/stardog-admin cluster info
Coordinator:
   10.51.73.95:5820
Nodes:
   10.51.95.196:5820
   10.51.89.74:5820
bash-4.2$ 

Do you have a license that supports clustering? You can check the quantity by running stardog-admin license info /path/to/your/license/file. You'll need a license that can support a 3 node cluster for your values.yaml file with replicaCount: 3.
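
A rough way to script that check, assuming the `license info` output contains a line of the form `Quantity:   1`. The helper names are made up for illustration, and the license path in the usage comment is the `STARDOG_LICENSE_PATH` value from the pod description earlier in this thread.

```shell
# Hypothetical helpers; both read `stardog-admin license info` output on stdin.

# Print the number on the "Quantity:" line, if there is one.
license_quantity() {
  awk '$1 == "Quantity:" { print $2 }'
}

# Succeed only if the licensed quantity is at least the replica count given in $1.
license_covers() {
  qty=$(license_quantity)
  [ -n "$qty" ] && [ "$qty" -ge "$1" ]
}

# Example:
# /opt/stardog/bin/stardog-admin license info /etc/stardog-license/stardog-license-key.bin \
#   | license_covers 3 || echo "license does not cover 3 nodes"
```

Running this before `helm install` would surface a quantity mismatch up front, since the pods themselves start without reporting it.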

MrMunki commented 2 years ago

It doesn't mention anything about clustering, but it does say quantity: 1

Issued:     Sun Sep 05 15:57:28 UTC 2021
Expiration: 355 days
Support:    The license does not include maintenance.
Quantity:   1

pdmars commented 2 years ago

Ah, ok, that explains the issue. Apologies that we don't throw a clearer warning when a user tries to deploy a cluster with more nodes than their license allows. We have an internal issue, PLAT-2697, to fix that. You'll see it in the release notes for Stardog when that is fixed.

For testing you can set replicaCount to 1 and then it should work as a 1 node "cluster" and still use ZooKeeper (vs a single node Stardog deployment, which wouldn't integrate with ZooKeeper at all). Otherwise you can disable Stardog Cluster and ZooKeeper for now and deploy a single node Stardog in k8s with the license you have. I realize that doesn't work for production environments or testing/benchmarking an actual HA deployment.
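
The two test setups above could be expressed as Helm `--set` overrides along these lines. This is only a sketch: the key names `replicaCount`, `cluster.enabled` and `zookeeper.enabled` are taken from the values file posted earlier in this thread (where they sit under `stardog:` because the chart was wrapped as a dependency) and are assumed to be un-prefixed when the chart is installed at top level. The helper name, release name and chart path are placeholders.

```shell
# Hypothetical helper: print the --set overrides for the two test setups described above.
stardog_test_overrides() {
  case "$1" in
    single-node-cluster)
      # Keep Stardog Cluster and ZooKeeper, but run a single Stardog node
      printf '%s\n' "--set replicaCount=1"
      ;;
    standalone)
      # Plain single-node Stardog: no Stardog Cluster, no ZooKeeper
      printf '%s\n' "--set cluster.enabled=false --set zookeeper.enabled=false"
      ;;
    *)
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}

# Example (release name, chart path and namespace are placeholders):
# helm upgrade --install stardog ./charts/stardog -n stardog-kube -f values.yaml \
#   $(stardog_test_overrides standalone)
```
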

To obtain a license that supports more nodes you'll need to request one here: https://www.stardog.com/company/contact/

MrMunki commented 2 years ago

Thanks for the help. I have just tried using a single node, since the documentation says to start a single node first and then add the others to the cluster, but it doesn't create a cluster then either :(. So I guess clustering is out of the question. I didn't read the license limitations closely; they say "high availability", and it didn't twig until I re-read it just now that this meant clustering.

Thanks for the speedy replies though, much appreciated