Unclean shutdown of mariadb-galera clusters leaves pod in CrashLoopBackOff state

ganchandrasekaran commented 4 years ago

Which chart:

chart name: bitnami/mariadb-galera
version: 1.0.3

Describe the bug

Unclean shutdown of mariadb-galera clusters leaves pod in CrashLoopBackOff state. This can be achieved by doing a docker restart. The pod that was shut down last will be marked as “Safe-to-Bootstrap”. All the other pods will be marked as unsafe to bootstrap from. After docker restart when k8s tries to start the cluster, Galera refuses to start the first node that was marked as unsafe to bootstrap from. So I see the following Error message in the logs:

2020-04-08  0:04:03 0 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
2020-04-08  0:04:03 0 [ERROR] WSREP: wsrep::connect(gcomm://) failed: 7
2020-04-08  0:04:03 0 [ERROR] Aborting

In case of unclean shutdown or hard crash, all nodes will have “safe_to_bootstrap: 0”,

Expected behavior

All mariaDB pods should start normally. Provide some easy way or automated way to edit the file /var/lib/mysql/grastate.dat, and change safe_to_bootstrap to 1.

Version of Helm and Kubernetes:

Output of helm version:

Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}

Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:36:53Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:27:17Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}

juan131 commented 4 years ago

Hi @ganchandrasekaran

Thanks so much for reporting this issue!!

I will open an internal task to investigate what we can do to reduce the risk of unclean shutdown of MariaDB Galera, and what mechanisms to implement in case all the replicas are marked as unsafe to bootstrap from.

If you're using a multi-node K8s cluster, play with Affinity/Antiaffinity (see affinity parameter) to ensure your pods are scheduled on different nodes to reduce the risk of every MariaDB Galera replica to be shutdown if a node is drained or there's hardware issue.

Note:

When using the Bitnami MariaDB Galera container image. The grastate.dat file is not at /var/lib/mysql/grastate.dat but /bitnami/mariadb/data/grastate.dat instead. See:

$ cat /bitnami/mariadb/data/grastate.dat
# GALERA saved state
version: 2.1
uuid:    b96914f0-7a38-11ea-ae0a-43fbea0090cc
seqno:   -1
safe_to_bootstrap: 0

stale[bot] commented 4 years ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

andrew-landsverk-win commented 4 years ago

I worked around this by setting the replicaSize down to 1 ( my cluster wouldn't start at all anyway ). I then added an extraInitContainer which just did a sed into that file to change the value to 1. Once that node was back up, remove the extraInitContainer and set replicaSize back to what you had it. I'd like to also see a better fix for this though :)

juan131 commented 4 years ago

Hi @andrew-landsverk-win

Thanks for sharing your solution! We didn't have time to work on the associated task to this issue but I guess the are two options to address this:

a) Use extra init container or sidecar containers (the option you mentioned)
b) Using preStop hooks to ensure one of the pods is marked as “Safe-to-Bootstrap” when it's getting stopped and it detects there are no other nodes

nevets963 commented 4 years ago

Hey @andrew-landsverk-win @juan131

We're also experiencing this exact same issue which we trigger by an unexpected shutdown of the node running a given pod (pod-0 in the example). My workaround is the basically the same as described by @andrew-landsverk-win.

export POD_SPEC=$'
{
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "galera-debug"
    },
    "spec": {
        "containers": [{
            "command": [
                "sed",
                "-i",
                "s/safe_to_bootstrap: 0/safe_to_bootstrap: 1/",
                "/mnt/data/grastate.dat"
            ],
            "image": "bitnami/minideb",
            "name": "mycontainer",
            "volumeMounts": [{
                "mountPath": "/mnt",
                "name": "galeradata"
            }]
        }],
        "restartPolicy": "Never",
        "volumes": [{
            "name": "galeradata",
            "persistentVolumeClaim": {
                "claimName": "data-mariadb-galera-cst29-0"
            }
        }]
    }
}'

kubectl run -it --rm --tty galera-debug --overrides=$POD_SPEC --image="bitnami/minideb" --namespace=mariadb-galera-gvb22

This then brought back the pod, and after it's 'Running' again, I did the inverse of the 'sed' command in the Pod spec defined above to return 'safe_to_bootstrap' back to 0. I don't fully understand the implications of doing this. Maybe someone can explain this better here?

If the above is a credible solution, then I'm happy to work on a patch and submit a PR.

juan131 commented 4 years ago

@rafariossaa I think you have more context about this. Could you please take a look?

rafariossaa commented 4 years ago

Hi, Sorry for the delay. You can force safe_to_bootstrap by using galera.bootstrap.forceSafeToBootstra, please take a look to this section in the readme. Could you give it a try and see if this is what you need ?

nevets963 commented 4 years ago

Hi @rafariossaa

We saw this section but is for bootstrapping nodes other than 0. What happens if it is 0 that is stuck in the crashloopbackoff?

rafariossaa commented 4 years ago

Hi, You can indicate the node number in galera.bootstrap.bootstrapFromNode. Which is the error that is keeping the pod in crash loop ?

jfillatre commented 4 years ago

Hi, I'm also experiencing the same issue on a 3 replicas stafulset galera cluster, and using 10.5.5 Chart with docker.io/bitnami/mariadb-galera:10.4.14-debian-10-r8 image. After a cluster unclean stop , all nodes have a safe_to_bootstrap: 0. As indicate I use this procedure in order to try to bootstrap from the second statefulset node (mariadb-galera-1 in my case).

It appears that the first node still want to bootstrap but it fail due to safe to bootstrap check:

$ kubectl logs mariadb-galera-0 mariadb-galera 
mariadb 12:06:50.87 
mariadb 12:06:50.87 Welcome to the Bitnami mariadb-galera container
mariadb 12:06:50.87 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-mariadb-galera
mariadb 12:06:50.88 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-mariadb-galera/issues
mariadb 12:06:50.88 
mariadb 12:06:50.88 INFO  ==> ** Starting MariaDB setup **
mariadb 12:06:50.90 INFO  ==> Validating settings in MYSQL_*/MARIADB_* env vars
mariadb 12:06:50.94 INFO  ==> Initializing mariadb database
mariadb 12:06:50.96 WARN  ==> The mariadb configuration file '/opt/bitnami/mariadb/conf/my.cnf' is not writable or does not exist. Configurations based on environment variables will not be applied for this file.
mariadb 12:06:50.96 INFO  ==> Persisted data detected. Restoring
mariadb 12:06:50.98 ERROR ==> It is not safe to bootstrap form this node ('safe_to_bootstrap=0' is set in 'grastate.dat'). If you want to force bootstrap, set the environment variable MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes

To confirm I've also set as extraEnvVars the MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP to yes and that lead to a split brain. Indeed both mariadb-galera-0 and mariadb-galera-1 have bootrapped:

mariadb-galera-0


MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+
1 row in set (0.001 sec)

MariaDB [(none)]>


- mariadb-galera-1

MariaDB [(none)]>


- mariadb-galera-2

MariaDB [(none)]>



I don't no what exactly goes wrong and lead to always bootstrap on node 0, but I've a first question: 
- How is persisted the  DB_GALERA_BOOTSTRAP_FILE after pod respawning?

Regards

rafariossaa commented 4 years ago

Hi, To force the bootstrap, you need to set galera.bootstrap.forceSafeToBootstrap=true. Could you give this a try ?

jfillatre commented 4 years ago

Yes, Already done as you can see in resulting statefuleset pod template below:

    spec:
      containers:
      - command:
        - bash
        - -ec
        - |
          # Bootstrap from the indicated node
          NODE_ID="${MY_POD_NAME#"mariadb-galera-"}"
          if [[ "$NODE_ID" -eq "1" ]]; then
              export MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes
              export MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes
          fi
          exec /opt/bitnami/scripts/mariadb-galera/entrypoint.sh /opt/bitnami/scripts/mariadb-galera/run.sh

jfillatre commented 4 years ago

@rafariossaa Do you need more details about scenario or my testing environment?

rafariossaa commented 4 years ago

Hi, Yes, thanks. I will like to setup a similar environment and follow the same steps than you to reproduce this issue. I will need the commands and chart settings you are using to start the cluster and how you are producing the unclean stop. Are you killing the pods, or are you stopping the chart ?.

jfillatre commented 4 years ago

Of course. I install chart using helm install mariadb-galera bitnami/mariadb-galera --version 4.3.3 -f statefulServices/mariadb-galera/mariadb-galera-config-values.yaml command. Configuration is in attachment. mariadb-galera-config-values.zip

A statefulset scale down/scale up work. To produce the outage, I simply run following:

kubectl delete po -l app.kubernetes.io/name=mariadb-galera
pod "mariadb-galera-0" deleted
pod "mariadb-galera-1" deleted
pod "mariadb-galera-2" deleted

Regards

marcosbc commented 4 years ago

Hi @jfillatre, I was able to reproduce the outage by scaling down. However, my understanding is that this is expected when you stop all nodes at once and start again, please @rafariossaa confirm.

In any case I was able to fix it like instructed in this thread via forceSafeToBootstrap and bootstrapFromNode:

$ helm upgrade mygalera \
    --set galera.mariabackup.password=mymariabackuppassword \
    --set rootUser.password=myrootpassword \
    --set db.password=mydbpassword \
    --set galera.bootstrap.forceSafeToBootstrap=true \
    --set galera.bootstrap.bootstrapFromNode=0 .

Could you try again and let us know if there's anything we're missing?

jfillatre commented 4 years ago

Indeed I can force bootstrap from node 0 following instructions. But what about when the highest seqno identified is on the node 1 and want to bootstrap from (my initial use case)?

On my side it doesn't work and statefulset respawn stop on node 0 with ERROR ==> It is not safe to bootstrap form this node ('safe_to_bootstrap=0' is set in 'grastate.dat')

miguelaeh commented 4 years ago

Hi @jfillatre , If all your nodes have the 'safe_to_bootstrap=0' you should check the following section: https://github.com/bitnami/charts/tree/master/bitnami/mariadb-galera#all-the-nodes-with-safe_to_bootstrap-0 As Marcos said you will need to set galera.bootstrap.forceSafeToBootstrap=true. Did you try setting that in the upgrade command? In any case, @rafariossaa will be able to provide a better answer to this. Regards.

jfillatre commented 4 years ago

I, @miguelaeh . Indeed I've already review and play scenario from https://github.com/bitnami/charts/tree/master/bitnami/mariadb-galera#all-the-nodes-with-safe_to_bootstrap-0, but I'm not able to bootstrap from second node of statefulset that own the latest seqno. I don't know where is the mistake...

rafariossaa commented 4 years ago

Hi, @jfillatre Thanks for noticing and reporting this. I was able to reproduce the issue, it is not working when using --set galera.bootstrap.bootstrapFromNode=N and N != 0. I am opening an internal task to look into this and fix it. We will be back as soon as we have news.

jfillatre commented 4 years ago

@rafariossaa: Great! Aside, have you already heard about a project to automate this recovery, may be using operator pattern to elect the node with the highest seqno?

nevets963 commented 4 years ago

@jfillatre @rafariossaa We're also looking for the same thing. I checked operatorhub.io but didn't find a galera operator. An operator would be the best way to go here for this. However, in the meantime, there might be something more simple we can develop? Maybe just an init container, that can bootstrap from a node with the highest seqno if the pod goes bad... ?

rafariossaa commented 4 years ago

Hi, I agree, using an operator should be the tool to deal with this kind of issues.
Unfortunately, right now we are not developing operators.

jfillatre commented 4 years ago

Hi @rafariossaa ,

Have you any updates about internal tasks state?
Is it a point to engage a tasks to add a feature allowing:
- Automated restart of cluster when the node 0 is node the "safe to bootstrap" one?
- Automated bootstrap node election when all node are not safe to bootstrap (with optional activation)?

I've not analyze if it require a controler or if it can be done only with an additionnal sidecar...

Regard

marcosbc commented 4 years ago

Hi @jfillatre,

Have you any updates about internal tasks state?

Unfortunately we have not yet started to look into this. Currently we have a very limited capacity, so I'm unable to give an estimation of when we think this could be fixed.

Is it a point to engage a tasks to add a feature allowing:

Automated restart of cluster when the node 0 is node the "safe to bootstrap" one?

Automated bootstrap node election when all node are not safe to bootstrap (with optional activation)?

Unfortunately I don't think it would be feasible. Note that those items would require an additional service to perform those checks, meaning another container if we want to follow the 1 container-1 process rule.

If you have a better suggestion on how to implement such feature and want to contribute, we'd encourage you to do so by sending a PR. We're glad to accept external contributions, and we try to review them as quickly as we can. That way you could also get this feature implemented quicker than by conventional means.

ghost commented 4 years ago

This is such a severe problem, IMO, its irresponsible to even have this chart published without it being fixed. Its actually impossible to bootstrap the cluster with this chart if anything but node zero is the latest node, because Kubernetes won't start nodes until the prior one starts, and the prior ones won't start. People who have experience with Galera may not have as much experience with Kubernetes and may not recognize that the node management in this chart is fundamentally broken, until they end up with an unrecoverable cluster as a result.

The process of tagging the deployment with a setting coming from Helm to indicate the node to bootstrap from is also fundamentally flawed, because if that node ever restarts, it'll re-bootstrap the cluster from that node a second time. Best case you split-brain the cluster, worst case it causes the other nodes to fail and restart, and data gets lost as they flush their state and so a full state transfer from the node.

So a fix for this really need to address both -- you need a mechanism to bootstrap a failed cluster from an arbitrary node and that mechanism needs to be single-shot, so it can't possibly happen twice. Additionally, the chart must be fixed where it properly shuts down the cluster -- contrary to the Readme, setting replicas to zero will not gracefully stop the cluster, putting you into this state. Scaling to one will at least ensure node zero is the most advanced node, but you still have to do an unclean bootstrap, and still have the issue that any restart of that node in the future will corrupt the database.

IMO, the best way to handle this is to move the bootstrap configuration into a configmap. Put the bootstrap sequence number, rather than node number, in that configmap. At startup, check for that setting, and if its there, compare it to the seqno in grastate.dat. If they match, bootstrap the cluster from that node. If not, just start normally. That guarantees the cluster can never bootstrap to the wrong node because the next delta will increment the seqno, so on a restart no nodes would find it.

Simply changing the podManagementPolicy to Parallel would then start all the nodes simultaneously on a recreated cluster. The "bad" ones will end up restarting, but the "good" one will come up and the looped restarts will then sync from it. This prevents the need to make a fundamental (and persistent) change to the chart configuration to recover a cluster.

It'd also make it pretty easy to create a script or utility pod that can bootstrap an existing cluster by mounting all the pvc, getting the highest seqno from all the grastate.dat files, and setting the value into the configmap. Basically set replicas to zero, run the pod to find and save the most advanced seqno, then set the replica count back.

Without those changes, IMO, this chart is just unsafe for production use.

rafariossaa commented 4 years ago

Hi, I would like to add some clarification to this issue. This misbehavior was fixed, but I found that in the README that the detail about starting in Parallel is missing. I will fix that. I think the current approach is quite similar to what you propose. The indicate node is the one that is going to start first and the rest will sync with this one, and only the indicated one is going to start first. On the "stopping cleanly" side, when you do helm delete kubernetes will start deleting the resources (pods, sts, etc) in the order kubernetes estimate it is best, so the only way to guarantee there is a node safe to bootstrap is making it the only node in the cluster by scaling it to 1 node.

This is what I tried to verify the starting from other node, could you give it a try ?

To produce the situation:

Launch:

$ helm repo update
$ helm install mygalera bitnami/mariadb-galera --set rootUser.password=mypwd --set galera.mariabackup.password=otherpwd

Create data:

$ kubectl run mygalera-mariadb-galera-client --rm --tty -i --restart='Never' --namespace rafael --image docker.io/bitnami/mariadb-galera:10.5.6-debian-10-r7 --command \
  -- mysql -h mygalera-mariadb-galera -P 3306 -uroot -p$(kubectl get secret --namespace rafael mygalera-mariadb-galera -o jsonpath="{.data.mariadb-root-password}" | base64 --decode) my_database
MariaDB [my_database]> create table t1 (c1 varchar(10));
MariaDB [my_database]> insert into t1 values ('qwer');
MariaDB [my_database]> insert into t1 values ('qwer');
MariaDB [my_database]> insert into t1 values ('qwer');
MariaDB [my_database]> select count(*) from t1;
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.001 sec)

Stop the cluster making node 0 not to be the last one to stop:
```
$ watch -n 1 kubectl delete pod mygalera-mariadb-galera-0
```
When node 0 is stopped, run in other terminal:
```
$ helm delete mygalera
```
Checking safe to bootstrap and cluster state:

(Change XXXX by the node number)

kubectl run --generator=run-pod/v1 -i --rm --tty volpod --overrides='
{
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "volpod"
    },
    "spec": {
        "containers": [{
            "command": [
                "cat",
                "/mnt/data/grastate.dat"
            ],
            "image": "bitnami/minideb",
            "name": "mycontainer",
            "volumeMounts": [{
                "mountPath": "/mnt",
                "name": "galeradata"
            }]
        }],
        "restartPolicy": "Never",
        "volumes": [{
            "name": "galeradata",
            "persistentVolumeClaim": {
                "claimName": "data-mygalera-mariadb-galera-XXXX"
            }
        }]
    }
}' --image="bitnami/minideb"

From node 0:

# GALERA saved state
version: 2.1
uuid:    9ae8a441-1798-11eb-a271-87899951190c
seqno:   19
safe_to_bootstrap: 0
pod "volpod" deleted

From node 1:

# GALERA saved state
version: 2.1
uuid:    9ae8a441-1798-11eb-a271-87899951190c
seqno:   20
safe_to_bootstrap: 0
pod "volpod" deleted

From node 2:

# GALERA saved state
version: 2.1
uuid:    9ae8a441-1798-11eb-a271-87899951190c
seqno:   20
safe_to_bootstrap: 0
pod "volpod" deleted

In this case node 0 is older, it has a lower seqno, and there is not a node safe to bootstrap from.

Restart: As there is not a safe to bootstrap node, let's pick node 2 to bootstrap from.

$ helm install mygalera bitnami/mariadb-galera \
    --set rootUser.password=mypwd \
    --set galera.mariabackup.password=otherpwd \
    --set galera.bootstrap.bootstrapFromNode=2 \
    --set galera.bootstrap.forceSafeToBootstrap=true \
    --set podManagementPolicy=Parallel

When starting, if you monitor the pods status:

$ watch -n 1 kubectl get pods

You will notice that node '0' and '1', will be in crashloop until node '2' is running.

Jacq commented 4 years ago

@jfillatre @rafariossaa We're also looking for the same thing. I checked operatorhub.io but didn't find a galera operator. An operator would be the best way to go here for this. However, in the meantime, there might be something more simple we can develop? Maybe just an init container, that can bootstrap from a node with the highest seqno if the pod goes bad... ?

Hi, have you tried this operator https://github.com/Orange-OpenSource/galera-operator?

ghost commented 4 years ago

We ended up just dumping the bitnami chart and switching to Percona's XtraDB operator. Its not MariaDB/Galera but rather MySQL/Galera, but it does a lot of things the bitnami one doesn't do -- it has init containers to properly start up the cluster, proper finalizers to clean it up properly, reconfigurations just work, it has integrated backup support, and it has integrated management for haproxy or proxysql, which gets around another issue we ran into with this chart -- it uses the Kubernetes service proxy, which ends up having writes distributed across the nodes, which is fairly universally recommended against.

The only "downside" with it is XtraDB is generally pretty strict about preventing scenarios that can trip up Galera. That's good for reliability, but can cause things that were not quite right to stop working.

rafariossaa commented 4 years ago

Hi, Thanks for the suggestions, we will take a look and see if we can bring improvements to the galera chart.

dangelm commented 3 years ago

Hi, Same issue here. each 1-3 days 2 out of 3 nodes of Maria stop working. usually node 0 is ok and node 1 and 2 are in crashloopback forever. this is very serious because it makes Mariadb not production ready on k8s.

Logs from node 0:

>   4bad400e-9ab6,0
    4c8671cc-8326,0
    4f734ff3-a4c6,0
    5209c753-a5ce,0
    5259e4fc-8e5e
2021-01-05  8:08:53 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2021-01-05  8:08:53 0 [Note] WSREP: view(view_id(NON_PRIM,7d2d53b3-b992,709) memb {
    7d2d53b3-b992,0
} joined {
} left {
} partitioned {
    00c720ba-a8fa,0

Logs from node 1:

> =================================================
2021-01-05  7:59:46 1 [Note] WSREP: Non-primary view
2021-01-05  7:59:46 1 [Note] WSREP: Server status change connected -> connected
2021-01-05  7:59:46 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-01-05  7:59:46 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-01-05  7:59:47 0 [Note] WSREP: (5259e4fc-8f04, 'tcp://0.0.0.0:4567') reconnecting to 971f4f1f-b1d3 (tcp://100.101.220.70:4567), attempt 90
2021-01-05  7:59:51 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
2021-01-05  7:59:51 0 [Note] WSREP: view(view_id(NON_PRIM,5259e4fc-8f04,47) memb {
    5259e4fc-8f04,0
} joined {
} left {
} partitioned {
})
2021-01-05  7:59:51 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2021-01-05  7:59:51 0 [Note] WSREP: Flow-control interval: [16, 16]
2021-01-05  7:59:51 0 [Note] WSREP: Received NON-PRIMARY.
2021-01-05  7:59:51 1 [Note] WSREP: ================================================
View:
  id: 00000000-0000-0000-0000-000000000000:-1
  status: non-primary
  protocol_version: -1
  capabilities: 
  final: no
  own_index: 0
  members(1):
    0: 5259e4fc-4e78-11eb-8f04-d6c940cb9df8, service-mariadb-galera-1
=================================================

Logs from node 2:

> mariadb 07:51:16.14 INFO  ==> ** Starting MariaDB **
mariadb 07:51:16.14 INFO  ==> Setting previous boot
2021-01-05  7:51:16 0 [Note] /opt/bitnami/mariadb/sbin/mysqld (mysqld 10.5.5-MariaDB-log) starting as process 1 ...
2021-01-05  7:51:16 0 [Note] WSREP: Loading provider /opt/bitnami/mariadb/lib/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2021-01-05  7:51:16 0 [Note] WSREP: wsrep_load(): loading provider library '/opt/bitnami/mariadb/lib/libgalera_smm.so'
2021-01-05  7:51:16 0 [Note] WSREP: wsrep_load(): Galera 4.5(r0) by Codership Oy <info@codership.com> loaded successfully.
2021-01-05  7:51:16 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2021-01-05  7:51:16 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
2021-01-05  7:51:16 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: b909a5fd-4db1-11eb-a7d9-d21455ebcdcc
Seqno: -1 - -1
Offset: -1
Synced: 1
2021-01-05  7:51:16 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: b909a5fd-4db1-11eb-a7d9-d21455ebcdcc, offset: -1
2021-01-05  7:51:16 0 [Note] WSREP: GCache::RingBuffer initial scan...  0.0% (        0/134217752 bytes) complete.
2021-01-05  7:51:16 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217752/134217752 bytes) complete.
2021-01-05  7:51:16 0 [Note] WSREP: Recovering GCache ring buffer: didn't recover any events.
2021-01-05  7:51:16 0 [Note] WSREP: Passing config to GCS: base_dir = /bitnami/mariadb/data/; base_host = x.x.x.70; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /bitnami/mariadb/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S
2021-01-05  7:51:16 0 [Note] WSREP: Start replication
2021-01-05  7:51:16 0 [Note] WSREP: Connecting with bootstrap option: 0
2021-01-05  7:51:16 0 [Note] WSREP: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2021-01-05  7:51:16 0 [Note] WSREP: protonet asio version 0
2021-01-05  7:51:16 0 [Note] WSREP: Using CRC-32C for message checksums.
2021-01-05  7:51:16 0 [Note] WSREP: backend: asio
2021-01-05  7:51:16 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2021-01-05  7:51:16 0 [Warning] WSREP: access file(/bitnami/mariadb/data//gvwstate.dat) failed(No such file or directory)
2021-01-05  7:51:16 0 [Note] WSREP: restore pc from disk failed
2021-01-05  7:51:16 0 [Note] WSREP: GMCast version 0
2021-01-05  7:51:16 0 [Note] WSREP: (ca0b9786-b1c8, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2021-01-05  7:51:16 0 [Note] WSREP: (ca0b9786-b1c8, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2021-01-05  7:51:16 0 [Note] WSREP: EVS version 1
2021-01-05  7:51:16 0 [Note] WSREP: gcomm: connecting to group 'galera', peer 'service-mariadb-galera-headless.collection.svc.cluster.local:'
2021-01-05  7:51:16 0 [Note] WSREP: (ca0b9786-b1c8, 'tcp://0.0.0.0:4567') connection established to 7d2d53b3-b992 tcp://x.x.x.2:4567
2021-01-05  7:51:16 0 [Note] WSREP: (ca0b9786-b1c8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://x.x.3.3:4567 
2021-01-05  7:51:16 0 [Note] WSREP: (ca0b9786-b1c8, 'tcp://0.0.0.0:4567') connection established to 5259e4fc-8f03 tcp://x.x.x.3:4567
2021-01-05  7:51:16 0 [Note] WSREP: evs::proto(ca0b9786-b1c8, GATHER, view_id(TRANS,ca0b9786-b1c8,0)) temporarily discarding known 5259e4fc-8f03 due to received install message
2021-01-05  7:51:16 0 [Note] WSREP: EVS version upgrade 0 -> 1
2021-01-05  7:51:16 0 [Note] WSREP: declaring 7d2d53b3-b992 at tcp://x.x.x.12:4567 stable
2021-01-05  7:51:16 0 [Note] WSREP: PC protocol upgrade 0 -> 1
.
.
    4f734ff3-a4c6,0
    5209c753-a5ce,0
    5259e4fc-8e5e
2021-01-05  7:51:47 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
     at gcomm/src/pc.cpp:connect():160
2021-01-05  7:51:47 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():220: Failed to open backend connection: -110 (Connection timed out)
2021-01-05  7:51:47 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1632: Failed to open channel 'galera' at 'gcomm://service-mariadb-galera-headless.collection.svc.cluster.local': -110 (Connection timed out)
2021-01-05  7:51:47 0 [ERROR] WSREP: gcs connect failed: Connection timed out
2021-01-05  7:51:47 0 [ERROR] WSREP: wsrep::connect(gcomm://service-mariadb-galera-headless.collection.svc.cluster.local) failed: 7
2021-01-05  7:51:47 0 [ERROR] Aborting
Warning: Memory not freed: 48

we currently use this config:

  galera:
    bootstrap:
      forceSafeToBootstrap: true
      bootstrapFromNode: null

Do we need to set podManagementPolicy to Parallel? or it wont help? if yes, need to set it with bootstrapFromNode param or leave it null?

rafariossaa commented 3 years ago

Hi @dangelm , We will continue with your issue in #4929 .

rafariossaa commented 3 years ago

Hi @jfillatre , @nevets963, @ganchandrasekaran I would like to know if you where able to solve your issue with the indications provided in this thread. @dangelm's issue is going to be handled in #4929.

ganchandrasekaran commented 3 years ago

@rafariossaa Thank you very much for your time. I can verify that your earlier post is detailed and works as you suggested. However, if only there is a way to make it work in a more automated fashion it would be very useful. For example when all pods are in crashloop:

1, Try to resolve it without needing to delete the helm release and install again with bootstrapFromNode flag. 2, Avoid manually identifying the bootstrapFromNode number. Automate identification of large seqno during install time.

Thanks again.

rafariossaa commented 3 years ago

Hi, Thanks for your suggestions. We are constantly evolving our charts and I will add these to see what we can do future releases. However, this kind of handling I think it won't be possible without the use of operators.

I am closing this issue. If someone find other related issue, please don't hesitate to reopen this one or create a new one and reference this one.

voarsh2 commented 3 years ago

I used the Galera helm app on my Kubernetes, imported a database, no restarts and randomly one of them presents (I think galera 0) gives me: (1047, 'WSREP has not yet prepared node for application use')

I've just turned it off and get:

 It is not safe to bootstrap form this node ('safe_to_bootstrap=0' is set in 'grastate.dat'). If you want to force bootstrap, set the environment variable MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes

I can't seem to extract from these comments how I get it working again.

voarsh2 commented 3 years ago

You can force safe_to_bootstrap by using galera.bootstrap.forceSafeToBootstra, please take a look to this section in the readme.

Where does this go? Taint? environment variables?

rafariossaa commented 3 years ago

Hi, You can set them or in the values.yaml and run helm install mygalera -f values.yaml . or by using helm install mygalera ... --set galera.bootstrap.forceSafeToBootstrap=yes. You can see more examples here.

voarsh2 commented 3 years ago

Hi, You can set them or in the values.yaml and run helm install mygalera -f values.yaml . or by using helm install mygalera ... --set galera.bootstrap.forceSafeToBootstrap=yes. You can see more examples here.

Sorry - I've already installed, I can't install again changing helm stuff. My current cluster got itself confused, MANY times where it will no longer boot up because of the error:

It is not safe to bootstrap form this node ('safe_to_bootstrap=0' is set in 'grastate.dat'). If you want to force bootstrap, set the environment variable MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes

I can't nano on the container or change the grastate.dat... I'm at a loss. It keeps happening as well.

Within Rancher, I tried to upgrade the helm chart and modify the values.yaml:

helm upgrade --history-max=5 --install=true --namespace=mariadb-galera --timeout=10m0s --values=/home/shell/helm/values-mariadb-galera-5.6.7.yaml --version=5.6.7 --wait=true mariadb-galera /home/shell/helm/mariadb-galera-5.6.7.tgz
--
Tue, Apr 20 2021 2:11:19 am | checking 5 resources for changes
Tue, Apr 20 2021 2:11:19 am | error updating the resource "mariadb-galera":
Tue, Apr 20 2021 2:11:19 am | cannot patch "mariadb-galera" with kind StatefulSet: StatefulSet.apps "mariadb-galera" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden
Tue, Apr 20 2021 2:11:19 am | Error: UPGRADE FAILED: cannot patch "mariadb-galera" with kind StatefulSet: StatefulSet.apps "mariadb-galera" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden

Bit of a problem. I can't upgrade the chart, I can't start my cluster.

- Next time I plan to delete the app/helm in Rancher, keep the volume and re-add the helm chart using the same volumes. Then set forcedbootstrap true, and bootstrap from node 0.

In the meantime, I just removed everything, set forcedbootstrap and node 0 to make my life easier. Hoping I can do above if there's problems.

How to I reinstall the Helm chart using credentials already made by my installation?

rafariossaa commented 3 years ago

Hi, The procedure to bootstrap from other node is here. You could also, stop the chart, start a new pod that mount one of the PVCs, edit the file from that pod and then start again the cluster. Mounting the PVC to a POD should be very similar to what is done here

rafariossaa commented 3 years ago

To indicate the chart the credentials you need to use rootUser.password setting. The instruction to get the password and indicate when running helm install or you can retrieve the instructions by runnint helm status <name_of_your_deployment>.

voarsh2 commented 3 years ago

I have a bug where it's ignoring my bootstrap. Here's the Statefulset YAML export.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    meta.helm.sh/release-name: maria-ceph
    meta.helm.sh/release-namespace: cornerstepapp
  creationTimestamp: "2021-06-04T19:24:28Z"
  generation: 120
  labels:
    app.kubernetes.io/instance: maria-ceph
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: mariadb-galera
    helm.sh/chart: mariadb-galera-5.8.2
  managedFields:
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:meta.helm.sh/release-name: {}
          f:meta.helm.sh/release-namespace: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/instance: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:helm.sh/chart: {}
      f:spec:
        f:podManagementPolicy: {}
        f:revisionHistoryLimit: {}
        f:selector: {}
        f:serviceName: {}
        f:template:
          f:metadata:
            f:labels:
              .: {}
              f:app.kubernetes.io/instance: {}
              f:app.kubernetes.io/name: {}
          f:spec:
            f:affinity:
              .: {}
              f:podAntiAffinity:
                .: {}
                f:requiredDuringSchedulingIgnoredDuringExecution: {}
            f:containers:
              k:{"name":"mariadb-galera"}:
                .: {}
                f:env:
                  .: {}
                  k:{"name":"BITNAMI_DEBUG"}:
                    .: {}
                    f:name: {}
                  k:{"name":"MARIADB_DATABASE"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"MARIADB_ENABLE_LDAP"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"MARIADB_ENABLE_TLS"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"MARIADB_GALERA_CLUSTER_ADDRESS"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"MARIADB_GALERA_CLUSTER_NAME"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"MARIADB_GALERA_MARIABACKUP_PASSWORD"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef:
                        .: {}
                        f:key: {}
                        f:name: {}
                  k:{"name":"MARIADB_GALERA_MARIABACKUP_USER"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"MARIADB_ROOT_PASSWORD"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef:
                        .: {}
                        f:key: {}
                        f:name: {}
                  k:{"name":"MARIADB_ROOT_USER"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"MY_POD_NAME"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:fieldRef:
                        .: {}
                        f:apiVersion: {}
                        f:fieldPath: {}
                f:image: {}
                f:imagePullPolicy: {}
                f:livenessProbe:
                  .: {}
                  f:exec:
                    .: {}
                    f:command: {}
                  f:failureThreshold: {}
                  f:initialDelaySeconds: {}
                  f:periodSeconds: {}
                  f:successThreshold: {}
                  f:timeoutSeconds: {}
                f:name: {}
                f:ports:
                  .: {}
                  k:{"containerPort":3306,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:name: {}
                    f:protocol: {}
                  k:{"containerPort":4444,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:name: {}
                    f:protocol: {}
                  k:{"containerPort":4567,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:name: {}
                    f:protocol: {}
                  k:{"containerPort":4568,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:name: {}
                    f:protocol: {}
                f:readinessProbe:
                  .: {}
                  f:exec:
                    .: {}
                    f:command: {}
                  f:failureThreshold: {}
                  f:initialDelaySeconds: {}
                  f:periodSeconds: {}
                  f:successThreshold: {}
                  f:timeoutSeconds: {}
                f:resources: {}
                f:terminationMessagePath: {}
                f:terminationMessagePolicy: {}
                f:volumeMounts:
                  .: {}
                  k:{"mountPath":"/bitnami/mariadb"}:
                    .: {}
                    f:mountPath: {}
                    f:name: {}
                  k:{"mountPath":"/opt/bitnami/mariadb/.bootstrap"}:
                    .: {}
                    f:mountPath: {}
                    f:name: {}
                  k:{"mountPath":"/opt/bitnami/mariadb/conf/my.cnf"}:
                    .: {}
                    f:mountPath: {}
                    f:name: {}
                    f:subPath: {}
            f:dnsPolicy: {}
            f:restartPolicy: {}
            f:schedulerName: {}
            f:securityContext:
              .: {}
              f:fsGroup: {}
              f:runAsUser: {}
            f:serviceAccount: {}
            f:serviceAccountName: {}
            f:terminationGracePeriodSeconds: {}
            f:tolerations: {}
            f:volumes:
              .: {}
              k:{"name":"mariadb-galera-config"}:
                .: {}
                f:configMap:
                  .: {}
                  f:defaultMode: {}
                  f:name: {}
                f:name: {}
              k:{"name":"previous-boot"}:
                .: {}
                f:emptyDir: {}
                f:name: {}
        f:updateStrategy:
          f:type: {}
        f:volumeClaimTemplates: {}
    manager: Go-http-client
    operation: Update
    time: "2021-06-07T20:59:40Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:template:
          f:spec:
            f:containers:
              k:{"name":"mariadb-galera"}:
                f:command: {}
                f:env:
                  k:{"name":"BITNAMI_DEBUG"}:
                    f:value: {}
    manager: Mozilla
    operation: Update
    time: "2021-06-07T21:00:39Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:replicas: {}
        f:template:
          f:metadata:
            f:annotations:
              .: {}
              f:cattle.io/timestamp: {}
              f:field.cattle.io/ports: {}
    manager: rancher
    operation: Update
    time: "2021-06-07T21:02:02Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:collisionCount: {}
        f:currentRevision: {}
        f:observedGeneration: {}
        f:replicas: {}
        f:updateRevision: {}
    manager: kube-controller-manager
    operation: Update
    time: "2021-06-07T21:02:25Z"
  name: maria-ceph-mariadb-galera
  namespace: cornerstepapp
  resourceVersion: "16762982"
  uid: ff378cdb-bf99-4026-87d7-fbf9855644a9
spec:
  podManagementPolicy: OrderedReady
  replicas: 0
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: maria-ceph
      app.kubernetes.io/name: mariadb-galera
  serviceName: maria-ceph-mariadb-galera
  template:
    metadata:
      annotations:
        cattle.io/timestamp: "2021-06-05T17:16:59Z"
        field.cattle.io/ports: '[[{"containerPort":3306,"dnsName":"maria-ceph-mariadb-galera","kind":"ClusterIP","name":"mysql","protocol":"TCP"},{"containerPort":4567,"dnsName":"maria-ceph-mariadb-galera","kind":"ClusterIP","name":"galera","protocol":"TCP"},{"containerPort":4568,"dnsName":"maria-ceph-mariadb-galera","kind":"ClusterIP","name":"ist","protocol":"TCP"},{"containerPort":4444,"dnsName":"maria-ceph-mariadb-galera","kind":"ClusterIP","name":"sst","protocol":"TCP"}]]'
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: maria-ceph
        app.kubernetes.io/name: mariadb-galera
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/instance: maria-ceph
                app.kubernetes.io/name: mariadb-galera
            namespaces:
            - cornerstepapp
            topologyKey: kubernetes.io/hostname
      containers:
      - command:
        - bash
        - -ec
        - |
          # Bootstrap from the indicated node
          NODE_ID="${MY_POD_NAME#"maria-ceph-mariadb-galera-"}"
          if [[ "$NODE_ID" -eq "1" ]]; then
              export MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes
              export MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes
          fi
          exec /opt/bitnami/scripts/mariadb-galera/entrypoint.sh /opt/bitnami/scripts/mariadb-galera/run.sh
        env:
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: BITNAMI_DEBUG
          value: "true"
        - name: MARIADB_GALERA_CLUSTER_NAME
          value: galera
        - name: MARIADB_GALERA_CLUSTER_ADDRESS
          value: gcomm://maria-ceph-mariadb-galera-headless.cornerstepapp.svc.cluster.local
        - name: MARIADB_ROOT_USER
          value: root
        - name: MARIADB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              key: mariadb-root-password
              name: maria-ceph-mariadb-galera
        - name: MARIADB_DATABASE
          value: my_database
        - name: MARIADB_GALERA_MARIABACKUP_USER
          value: mariabackup
        - name: MARIADB_GALERA_MARIABACKUP_PASSWORD
          valueFrom:
            secretKeyRef:
              key: mariadb-galera-mariabackup-password
              name: maria-ceph-mariadb-galera
        - name: MARIADB_ENABLE_LDAP
          value: "no"
        - name: MARIADB_ENABLE_TLS
          value: "no"
        image: docker.io/bitnami/mariadb-galera:10.5.10-debian-10-r13
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - bash
            - -ec
            - |
              password_aux="${MARIADB_ROOT_PASSWORD:-}"
              if [[ -f "${MARIADB_ROOT_PASSWORD_FILE:-}" ]]; then
                  password_aux=$(cat "$MARIADB_ROOT_PASSWORD_FILE")
              fi
              exec mysqladmin status -u"${MARIADB_ROOT_USER}" -p"${password_aux}"
          failureThreshold: 3
          initialDelaySeconds: 120
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: mariadb-galera
        ports:
        - containerPort: 3306
          name: mysql
          protocol: TCP
        - containerPort: 4567
          name: galera
          protocol: TCP
        - containerPort: 4568
          name: ist
          protocol: TCP
        - containerPort: 4444
          name: sst
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - bash
            - -ec
            - |
              password_aux="${MARIADB_ROOT_PASSWORD:-}"
              if [[ -f "${MARIADB_ROOT_PASSWORD_FILE:-}" ]]; then
                  password_aux=$(cat "$MARIADB_ROOT_PASSWORD_FILE")
              fi
              exec mysqladmin status -u"${MARIADB_ROOT_USER}" -p"${password_aux}"
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /opt/bitnami/mariadb/.bootstrap
          name: previous-boot
        - mountPath: /bitnami/mariadb
          name: data
        - mountPath: /opt/bitnami/mariadb/conf/my.cnf
          name: mariadb-galera-config
          subPath: my.cnf
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
        runAsUser: 1001
      serviceAccount: default
      serviceAccountName: default
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 2
      volumes:
      - emptyDir: {}
        name: previous-boot
      - configMap:
          defaultMode: 420
          name: maria-ceph-mariadb-galera-configuration
        name: mariadb-galera-config
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: maria-ceph
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: mariadb-galera
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 8Gi
      volumeMode: Filesystem
    status:
      phase: Pending
status:
  collisionCount: 0
  currentRevision: maria-ceph-mariadb-galera-5b84fd7b97
  observedGeneration: 120
  replicas: 0
  updateRevision: maria-ceph-mariadb-galera-5b84fd7b97

See 0 completely ignore it:

mariadb 21:07:20.34 
mariadb 21:07:20.35 Welcome to the Bitnami mariadb-galera container
mariadb 21:07:20.35 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-mariadb-galera
mariadb 21:07:20.35 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-mariadb-galera/issues
mariadb 21:07:20.36 
mariadb 21:07:20.36 INFO  ==> ** Starting MariaDB setup **
mariadb 21:07:20.41 INFO  ==> Validating settings in MYSQL_*/MARIADB_* env vars
mariadb 21:07:20.46 DEBUG ==> Set Galera cluster address to gcomm://
mariadb 21:07:20.47 INFO  ==> Initializing mariadb database
mariadb 21:07:20.48 DEBUG ==> Ensuring expected directories/files exist
mariadb 21:07:20.52 WARN  ==> The mariadb configuration file '/opt/bitnami/mariadb/conf/my.cnf' is not writable or does not exist. Configurations based on environment variables will not be applied for this file.
mariadb 21:07:20.53 INFO  ==> ** MariaDB setup finished! **
mariadb 21:07:20.65 DEBUG ==> Set Galera cluster address to gcomm://
mariadb 21:07:20.67 INFO  ==> ** Starting MariaDB **
mariadb 21:07:20.67 INFO  ==> Setting previous boot
2021-06-07 21:07:20 0 [Note] /opt/bitnami/mariadb/sbin/mysqld (mysqld 10.5.10-MariaDB-log) starting as process 1 ...
2021-06-07 21:07:20 0 [Note] WSREP: Loading provider /opt/bitnami/mariadb/lib/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2021-06-07 21:07:20 0 [Note] WSREP: wsrep_load(): loading provider library '/opt/bitnami/mariadb/lib/libgalera_smm.so'
2021-06-07 21:07:20 0 [Note] WSREP: wsrep_load(): Galera 4.8(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
2021-06-07 21:07:20 0 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
2021-06-07 21:07:20 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
2021-06-07 21:07:20 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: 3d574c0c-c7ba-11eb-84c6-eaa9b9140442
Seqno: -1 - -1
Offset: -1
Synced: 1
2021-06-07 21:07:20 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 3d574c0c-c7ba-11eb-84c6-eaa9b9140442, offset: -1
2021-06-07 21:07:20 0 [Note] WSREP: GCache::RingBuffer initial scan...  0.0% (        0/134217752 bytes) complete.
2021-06-07 21:07:20 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217752/134217752 bytes) complete.
2021-06-07 21:07:20 0 [Note] WSREP: Recovering GCache ring buffer: didn't recover any events.
2021-06-07 21:07:20 0 [Note] WSREP: Passing config to GCS: base_dir = /bitnami/mariadb/data/; base_host = 10.42.4.161; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /bitnami/mariadb/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; p
2021-06-07 21:07:20 0 [Note] WSREP: Start replication
2021-06-07 21:07:20 0 [Note] WSREP: Connecting with bootstrap option: 1
2021-06-07 21:07:20 0 [Note] WSREP: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2021-06-07 21:07:20 0 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
2021-06-07 21:07:20 0 [ERROR] WSREP: wsrep::connect(gcomm://) failed: 7
2021-06-07 21:07:20 0 [ERROR] Aborting

The only thing that seems to have worked is clone the disk from stateful/node 1 (known to have worked before this mess), put it as statefulset node 0, set forcetobootstrapfor node 0.... and it magically works. But it this is weird, it "normally" works for whatever bootstrap number I choose.

rafariossaa commented 3 years ago

Hi, You can see in the statefull export:

      containers:
      - command:
        - bash
        - -ec
        - |
          # Bootstrap from the indicated node
          NODE_ID="${MY_POD_NAME#"maria-ceph-mariadb-galera-"}"
          if [[ "$NODE_ID" -eq "1" ]]; then
              export MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes
              export MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes
          fi
          exec /opt/bitnami/scripts/mariadb-galera/entrypoint.sh /opt/bitnami/scripts/mariadb-galera/run.sh

The if only would set the environment vars used for bootstraping if it is node 1. This code is rendered from this yaml file So, you will need to indicate the node you want to bootstrap from.

iomarcovalente commented 2 years ago

@rafariossaa based on your input:

Hi, I would like to add some clarification to this issue. This misbehavior was fixed, but I found that in the README that the detail about starting in Parallel is missing. I will fix that. I think the current approach is quite similar to what you propose. The indicate node is the one that is going to start first and the rest will sync with this one, and only the indicated one is going to start first. On the "stopping cleanly" side, when you do helm delete kubernetes will start deleting the resources (pods, sts, etc) in the order kubernetes estimate it is best, so the only way to guarantee there is a node safe to bootstrap is making it the only node in the cluster by scaling it to 1 node.

This is what I tried to verify the starting from other node, could you give it a try ?

To produce the situation:
* Launch:
$ helm repo update
$ helm install mygalera bitnami/mariadb-galera --set rootUser.password=mypwd --set galera.mariabackup.password=otherpwd
...

In this case node 0 is older, it has a lower seqno, and there is not a node safe to bootstrap from.

Restart: As there is not a safe to bootstrap node, let's pick node 2 to bootstrap from.
$ helm install mygalera bitnami/mariadb-galera \
    --set rootUser.password=mypwd \
    --set galera.mariabackup.password=otherpwd \
    --set galera.bootstrap.bootstrapFromNode=2 \
    --set galera.bootstrap.forceSafeToBootstrap=true \
    --set podManagementPolicy=Parallel
When starting, if you monitor the pods status:
$ watch -n 1 kubectl get pods
You will notice that node '0' and '1', will be in crashloop until node '2' is running.

I have tried to do that, delete node-0 first and then delete the chart after a couple of minutes with helm delete mariadb What happened effectively was that node-0 stopped, it went crashlooping and node-1 and -2 followed and went into a crashlooping state. When I then run the script to retrieve the grastate file I got:

pod/grastate-finder-0 created
# GALERA saved state
version: 2.1
uuid:    5a078885-4dde-11ec-8e3f-c76f745a04ea
seqno:   -1
safe_to_bootstrap: 0
 pod "grastate-finder-0" deleted
pod/grastate-finder-1 created
# GALERA saved state
version: 2.1
uuid:    5a078885-4dde-11ec-8e3f-c76f745a04ea
seqno:   -1
safe_to_bootstrap: 0
 pod "grastate-finder-1" deleted
pod/grastate-finder-2 created
# GALERA saved state
version: 2.1
uuid:    5a078885-4dde-11ec-8e3f-c76f745a04ea
seqno:   -1
safe_to_bootstrap: 0
 pod "grastate-finder-2" deleted

all states have got seqno set to -1 so none is safe to bootstrap and cannot tell which one is the most recent. I am aware about running mysqld --user=root --wsrep-recover | grep -i transaction and verifying the transaction id value. The highest, the most up to date the node is. But not sure whether that approach can be embedded in the chart.

Ultimately I agree with the general sentiment that this chart is not safe to use for any type of workflow really as it is not reliable enough.

rafariossaa commented 2 years ago

Hi, Which chart and image version are you using ? Some changes where introduced recently and I would like to try to reproduce it.

When running the script to retrieve grstate, did you do it with the chart stopped ?.

I am sorry you have that sentiment, we try to fix and cover most of the scenarios and this is a complex chart because the initialization. I am not trying to justify the issues and we are thankful for the feedback provided that allow us to improve it.

iomarcovalente commented 2 years ago

Hi, nothing to be sorry about - it is just a note to my future self or anyone who needs to use it for anything that is for production use. It is an open source chart after all and it is already in a decent state as it is.

I do appreciate the challenges as I have been deploying galera from vanilla over the last two years and only recently decided to move to this chart as I spent way too much time to get it to work as I faced the exact challenges you are facing.

When running the script to retrieve grstate, did you do it with the chart stopped ?.

yes, the pods were deleted otherwise one wouldn't be able to mount the volumes anyway I am currently running 10.5 and the reason I am running that version is because I need to go through an upgrade process and cannot jump too many versions, but what I could do is to work with 10.5 and incrementally upgrade up to latest if latest works better as you said. Which changes are you referring to specifically?

rafariossaa commented 2 years ago

Thanks for you kind words.

Regarding the status, if the cluster shutdown the 3 nodes at the same time, it is not able to store the seqno and safe_to_bootstrap. When galera is running and you get into any of pods, you can check that galera keeps seqno=-1 . The change I was referring to is this one: https://github.com/bitnami/bitnami-docker-mariadb-galera/pull/53

timsamart commented 2 years ago

hmmm i am still struggling with this. I want to restart my cluster and unfortunately the container gets terminated before I can even change the configuration in grastate to reconfigure once. grastate is just not mounted in the pod because the container dies early. Any ideas? I really do not want to use a sidecar.

timsamart commented 2 years ago

Or is there a possibility to use the StatefulSet to configure mariadb galera? I am using the BKPR suite and it seems it is not strait forward to configure the set. Any Ideas? It would be also okay to reset the mariadb galera config.

Mauraza commented 2 years ago

Hi @timsamart,

Could you share what version of MariaDB are you using and the logs?

bitnami / charts

Unclean shutdown of mariadb-galera clusters leaves pod in CrashLoopBackOff state #2260