Closed sanjeev3d closed 4 months ago
I have also set these environment variables, which relate to system.backup_list:
apiVersion: v1
kind: ConfigMap
metadata:
  name: clickhouse-backup-config
data:
  LOG_LEVEL: "debug"
  ALLOW_EMPTY_BACKUPS: "true"
  API_LISTEN: "0.0.0.0:7171"
  API_CREATE_INTEGRATION_TABLES: "true"
  BACKUPS_TO_KEEP_REMOTE: "3"
  REMOTE_STORAGE: "s3"
  S3_ACL: "private"
  S3_ENDPOINT: "http://xxxxxxx:xx"
  S3_BUCKET: "clickhouse"
  S3_PATH: "backup/shard-{shard}"
  S3_ACCESS_KEY: "minioadmin"
  S3_SECRET_KEY: "minioadmin"
  S3_FORCE_PATH_STYLE: "true"
  S3_DISABLE_SSL: "true"
  S3_DEBUG: "true"
Could you share the result of the following command?
kubectl get chi --all-namespaces
@Slach I'm using the click namespace.
kubectl get chi --all-namespaces
NAMESPACE   NAME             CLUSTERS   HOSTS   STATUS
click-zoo   clickhouse-poc   1          1       Completed
click-zoo   cliff            1          9       Completed
click       cliff            1          9       Completed
zoo cliff zoo zoo
Could you share the output of
kubectl get chi -n click cliff -o yaml
without sensitive credentials?
Also, could you share the result of the following command:
kubectl get pods --all-namespaces -l app=clickhouse-operator -o jsonpath="{.items[*].spec.containers[*].image}"
Here is the output of kubectl get chi -n click cliff -o yaml:
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  annotations:
  finalizers:
  - finalizer.clickhouseinstallation.altinity.com
  generation: 3
  manager: clickhouse-operator
  operation: Update
  time: "2024-07-22T08:06:16Z"
  name: cliff
  namespace: click
spec:
  configuration:
    clusters:
    - layout:
        shards:
        - name: shard0
          replicas:
          - name: replica0-shard0
          - name: replica1-shard0
          - name: replica2-shard0
            templates:
              podTemplate: pod-template-with-volumes-replica
          replicasCount: 3
          templates:
            podTemplate: pod-template-with-volumes-shard
        - name: shard1
          replicas:
          - name: replica0-shard1
          - name: replica1-shard1
          - name: replica2-shard1
            templates:
              podTemplate: pod-template-with-volumes-replica
          replicasCount: 3
          templates:
            podTemplate: pod-template-with-volumes-shard
        - name: shard2
          replicas:
          - name: replica0-shard2
          - name: replica1-shard2
          - name: replica2-shard2
            templates:
              podTemplate: pod-template-with-volumes-replica
          replicasCount: 3
          templates:
            podTemplate: pod-template-with-volumes-shard
      name: cliffcluster
    settings:
      disable_internal_dns_cache: 1
      remote_servers/all-replicated/secret: default
      remote_servers/all-sharded/secret: default
      remote_servers/cliffcluster/secret: default
    users:
      admin/access_management: 1
      admin/networks/ip:
      - 0.0.0.0/0
      - ::/0
      admin/password: xxxxxx
      default/networks/ip:
      - 0.0.0.0/0
      - ::/0
    zookeeper:
      nodes:
      - host: zookeeper-0.zookeepers.click
        port: 2181
      - host: zookeeper-1.zookeepers.click
        port: 2181
      - host: zookeeper-2.zookeepers.click
        port: 2181
  defaults:
    templates:
      podTemplate: pod-template-with-volumes-shard
      serviceTemplate: chi-service-template
  templates:
    podTemplates:
    - name: pod-template-with-volumes-shard
      spec:
        containers:
        - image: clickhouse-server:24.4.2-alpine
          name: clickhouse
          volumeMounts:
          - mountPath: /var/lib/clickhouse
            name: clickhouse-storage-template-1
        - command:
          - bash
          - -xc
          - /bin/clickhouse-backup server
          envFrom:
          - configMapRef:
              name: clickhouse-backup-config
          image: clickhouse-backup:master
          imagePullPolicy: Always
          name: clickhouse-backup
          ports:
          - containerPort: 7171
            name: backup-rest
          resources:
            limits:
              cpu: "2"
              memory: 4Gi
            requests:
              cpu: "2"
              memory: 4Gi
    - name: pod-template-with-volumes-replica
      spec:
        containers:
        - image: clickhouse-server:24.4.2-alpine
          name: clickhouse
          volumeMounts:
          - mountPath: /var/lib/clickhouse
            name: clickhouse-storage-template
    serviceTemplates:
    - generateName: clickhouse-{chi}
      name: chi-service-template
      spec:
        ports:
        - name: http
          port: 8123
          targetPort: 8123
        - name: tcp
          port: 9000
          targetPort: 9000
        - name: interserver
          port: 9009
          targetPort: 9009
        type: NodePort
    volumeClaimTemplates:
    - name: clickhouse-storage-template
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 500Gi
        storageClassName: robin-encrypt
    - name: clickhouse-storage-template-1
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 500Gi
        storageClassName: robin-encrypt
... skip ...
updated: 9
version: 0.18.0
kubectl get pods --all-namespaces -l app=clickhouse-operator -o jsonpath="{.items[*].spec.containers[*].image}"
altinity/clickhouse-operator:0.18.0 altinity/metrics-exporter:0.18.0 altinity/clickhouse-operator:0.18.0 altinity/metrics-exporter:0.18.0
OK, I see the root cause.
You defined 3 replicas in each shard, and the 3rd replica in each shard separately overrides the pod template:
- name: replica2-shardX
  templates:
    podTemplate: pod-template-with-volumes-replica
but pod-template-with-volumes-replica doesn't contain a spec.containers[] entry for clickhouse-backup, so those replicas run without the backup container.
Remove
templates:
  podTemplate: pod-template-with-volumes-replica
from all 3 shards,
then remove
- name: pod-template-with-volumes-replica
  spec:
    containers:
    - image: clickhouse-server:24.4.2-alpine
      name: clickhouse
      volumeMounts:
      - mountPath: /var/lib/clickhouse
        name: clickhouse-storage-template
and remove
- name: clickhouse-storage-template-1
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 500Gi
    storageClassName: robin-encrypt
and replace clickhouse-storage-template-1 with clickhouse-storage-template wherever it is referenced.
After that, all 3 replicas in each shard should contain the system.backup_list and system.backup_actions tables.
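To confirm which replicas actually run the backup sidecar, something like the sketch below may help. It lists every pod of the installation with its container names and prints the pods that lack a clickhouse-backup container. The function names are hypothetical, and the label selector clickhouse.altinity.com/chi=<name> is an assumption about how the operator labels pods in your version; adjust it if `kubectl get pods --show-labels` shows something different.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of clickhouse-operator or clickhouse-backup.

# Print "<pod name><TAB><container names>" for every pod of one installation.
# Assumption: pods carry the label clickhouse.altinity.com/chi=<chi name>.
list_chi_containers() {
  local namespace="$1" chi="$2"
  kubectl get pods -n "$namespace" \
    -l "clickhouse.altinity.com/chi=${chi}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'
}

# Print only the pods whose container list has no clickhouse-backup entry.
pods_without_backup() {
  list_chi_containers "$1" "$2" \
    | awk -F'\t' '$2 !~ /(^| )clickhouse-backup( |$)/ { print $1 }'
}

# Usage (namespace and installation name from this thread):
#   pods_without_backup click cliff
```

An empty result would mean every replica has the sidecar; any pod name printed is a replica still using a template without it.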
@Slach I'm still facing the same issue, even after removing the extra pod template and volume template as you described.
Here are the cluster details again, taken from the output after applying the changes:
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  annotations:
  creationTimestamp: "2024-04-29T11:47:47Z"
  finalizers:
  - finalizer.clickhouseinstallation.altinity.com
  generation: 4
  name: cliff
  namespace: click
  resourceVersion: "830274872"
  uid: b41d47c9-5427-4d60-9ce9-88faebfcb184
spec:
  configuration:
    clusters:
    - layout:
        shards:
        - name: shard0
          replicas:
          - name: replica0-shard0
          - name: replica1-shard0
          - name: replica2-shard0
          replicasCount: 3
          templates:
            podTemplate: pod-template-with-volumes-shard
        - name: shard1
          replicas:
          - name: replica0-shard1
          - name: replica1-shard1
          - name: replica2-shard1
          replicasCount: 3
          templates:
            podTemplate: pod-template-with-volumes-shard
        - name: shard2
          replicas:
          - name: replica0-shard2
          - name: replica1-shard2
          - name: replica2-shard2
          replicasCount: 3
          templates:
            podTemplate: pod-template-with-volumes-shard
      name: cliffcluster
    settings:
      disable_internal_dns_cache: 1
      remote_servers/all-replicated/secret: default
      remote_servers/all-sharded/secret: default
      remote_servers/cliffcluster/secret: default
    users:
      admin/access_management: 1
      admin/networks/ip:
      - 0.0.0.0/0
      - ::/0
      admin/password: xxxxxx
      default/networks/ip:
      - 0.0.0.0/0
      - ::/0
    zookeeper:
      nodes:
      - host: zookeeper-0.zookeepers.click
        port: 2181
      - host: zookeeper-1.zookeepers.click
        port: 2181
      - host: zookeeper-2.zookeepers.click
        port: 2181
  defaults:
    templates:
      podTemplate: pod-template-with-volumes-shard
      serviceTemplate: chi-service-template
  templates:
    podTemplates:
    - name: pod-template-with-volumes-shard
      spec:
        containers:
        - image: clickhouse-server:24.4.2-alpine
          name: clickhouse
          volumeMounts:
          - mountPath: /var/lib/clickhouse
            name: clickhouse-storage-template
        - command:
          - bash
          - -xc
          - /bin/clickhouse-backup server
          envFrom:
          - configMapRef:
              name: clickhouse-backup-config
          image: clickhouse-backup:master
          imagePullPolicy: Always
          name: clickhouse-backup
          ports:
          - containerPort: 7171
            name: backup-rest
          resources:
            limits:
              cpu: "2"
              memory: 4Gi
            requests:
              cpu: "2"
              memory: 4Gi
    serviceTemplates:
    - generateName: clickhouse-{chi}
      name: chi-service-template
      spec:
        ports:
        - name: http
          port: 8123
          targetPort: 8123
        - name: tcp
          port: 9000
          targetPort: 9000
        - name: interserver
          port: 9009
          targetPort: 9009
        type: NodePort
    volumeClaimTemplates:
    - name: clickhouse-storage-template
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 500Gi
        storageClassName: robin-encrypt
Please share the logs:
kubectl logs -n click chi-cliff-cliffcluster-replica0-shard0 --container=clickhouse-backup --since=48h
kubectl logs chi-cliff-cliffcluster-replica0-shard0-0 --container=clickhouse-backup --since=48h -n click
+ /bin/clickhouse-backup server
2024/07/22 13:58:51.873895 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/07/22 13:58:51.875379 warn clickhouse connection ping: tcp://localhost:9000 return error: dial tcp [::1]:9000: connect: connection refused, will wait 5 second to reconnect logger=clickhouse
2024/07/22 13:58:56.879248 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/07/22 13:58:56.883755 info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/07/22 13:58:56.883899 info Create integration tables logger=server
2024/07/22 13:58:56.883954 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/07/22 13:58:56.885679 info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/07/22 13:58:56.885769 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2024/07/22 13:58:56.891874 info SELECT countIf(name='type') AS is_disk_type_present, countIf(name='object_storage_type') AS is_object_storage_type_present, countIf(name='free_space') AS is_free_space_present, countIf(name='disks') AS is_storage_policy_present FROM system.columns WHERE database='system' AND table IN ('disks','storage_policies') logger=clickhouse
2024/07/22 13:58:56.915501 info SELECT d.path, any(d.name) AS name, any(lower(if(d.type='ObjectStorage',d.object_storage_type,d.type))) AS type, min(d.free_space) AS free_space, groupUniqArray(s.policy_name) AS storage_policies FROM system.disks AS d LEFT JOIN (SELECT policy_name, arrayJoin(disks) AS disk FROM system.storage_policies) AS s ON s.disk = d.name GROUP BY d.path logger=clickhouse
2024/07/22 13:58:56.931456 info SELECT engine FROM system.databases WHERE name = 'system' logger=clickhouse
2024/07/22 13:58:56.935621 info clickhouse connection closed logger=clickhouse
2024/07/22 13:58:56.935677 error open /var/lib/clickhouse/flags/force_drop_table: no such file or directory logger=server.Run
2024/07/22 13:58:56.936335 info Starting API server 9121cd4192cfa2e8c84a1fc21822ab8c8e660f8a on 0.0.0.0:7171 logger=server.Run
2024/07/22 13:58:56.939474 info Update backup metrics start (onlyLocal=false) logger=server
2024/07/22 13:58:56.939535 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/07/22 13:58:56.939602 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/07/22 13:58:56.941886 info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/07/22 13:58:56.941961 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2024/07/22 13:58:56.942362 info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/07/22 13:58:56.942410 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2024/07/22 13:58:56.946445 info SELECT countIf(name='type') AS is_disk_type_present, countIf(name='object_storage_type') AS is_object_storage_type_present, countIf(name='free_space') AS is_free_space_present, countIf(name='disks') AS is_storage_policy_present FROM system.columns WHERE database='system' AND table IN ('disks','storage_policies') logger=clickhouse
2024/07/22 13:58:56.948150 info SELECT countIf(name='type') AS is_disk_type_present, countIf(name='object_storage_type') AS is_object_storage_type_present, countIf(name='free_space') AS is_free_space_present, countIf(name='disks') AS is_storage_policy_present FROM system.columns WHERE database='system' AND table IN ('disks','storage_policies') logger=clickhouse
2024/07/22 13:58:56.966350 info SELECT d.path, any(d.name) AS name, any(lower(if(d.type='ObjectStorage',d.object_storage_type,d.type))) AS type, min(d.free_space) AS free_space, groupUniqArray(s.policy_name) AS storage_policies FROM system.disks AS d LEFT JOIN (SELECT policy_name, arrayJoin(disks) AS disk FROM system.storage_policies) AS s ON s.disk = d.name GROUP BY d.path logger=clickhouse
2024/07/22 13:58:56.969567 info SELECT d.path, any(d.name) AS name, any(lower(if(d.type='ObjectStorage',d.object_storage_type,d.type))) AS type, min(d.free_space) AS free_space, groupUniqArray(s.policy_name) AS storage_policies FROM system.disks AS d LEFT JOIN (SELECT policy_name, arrayJoin(disks) AS disk FROM system.storage_policies) AS s ON s.disk = d.name GROUP BY d.path logger=clickhouse
2024/07/22 13:58:56.981439 error ResumeOperationsAfterRestart return error: open /var/lib/clickhouse/backup: no such file or directory logger=server.Run
2024/07/22 13:58:56.981911 info clickhouse connection closed logger=clickhouse
2024/07/22 13:58:56.981997 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/07/22 13:58:56.984384 info clickhouse connection success: tcp://localhost:9000 logger=clickhouse
2024/07/22 13:58:56.984453 info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros' SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/07/22 13:58:56.994293 info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/07/22 13:58:56.997715 info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros' SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/07/22 13:58:57.008841 info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/07/22 13:58:57.017919 info [s3:DEBUG] Request
GET /clickhouse?versioning= HTTP/1.1
Host: [CLUSTER_VIP]:32621
User-Agent: aws-sdk-go-v2/1.26.1 os/linux lang/go#1.22.3 md/GOOS#linux md/GOARCH#amd64 api/s3#1.53.1
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: 0434c027-694d-41a0-b177-ee6b055a89f7
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=minioadmin/20240722/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-date, Signature=1d2c05e97229eb08bcb0c68e97b03397c754072fe41e6b58913e8b7594cfca35
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240722T135857Z
2024/07/22 13:58:57.021584 info [s3:DEBUG] Response
HTTP/1.1 200 OK
Content-Length: 99
Accept-Ranges: bytes
Content-Type: application/xml
Date: Mon, 22 Jul 2024 13:58:57 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
X-Amz-Request-Id: 17E48DAE3B4328B9
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block
2024/07/22 13:58:57.021947 debug /tmp/.clickhouse-backup-metadata.cache.S3 not found, load 0 elements logger=s3
2024/07/22 13:58:57.023029 info [s3:DEBUG] Request
GET /clickhouse?delimiter=%2F&list-type=2&max-keys=1000&prefix=backup%2Fshard-shard0%2F HTTP/1.1
Host: [CLUSTER-VIP]:32621
User-Agent: aws-sdk-go-v2/1.26.1 os/linux lang/go#1.22.3 md/GOOS#linux md/GOARCH#amd64 api/s3#1.53.1
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: b21de8d0-8a70-4c38-9472-c82854220b2e
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=minioadmin/20240722/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-date, Signature=7424dcb80f140232bfc6ac3b2beaa756d96b538add420b80c15f47ef82607eac
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240722T135857Z
2024/07/22 13:58:57.025337 info [s3:DEBUG] Response
HTTP/1.1 200 OK
Content-Length: 285
Accept-Ranges: bytes
Content-Type: application/xml
Date: Mon, 22 Jul 2024 13:58:57 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
X-Amz-Request-Id: 17E48DAE3B744C98
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block
2024/07/22 13:58:57.026245 debug /tmp/.clickhouse-backup-metadata.cache.S3 save 0 elements logger=s3
2024/07/22 13:58:57.026420 info clickhouse connection closed logger=clickhouse
2024/07/22 13:58:57.026467 info Update backup metrics finish LastBackupCreateLocal=<nil> LastBackupCreateRemote=<nil> LastBackupSizeLocal=0 LastBackupSizeRemote=0 LastBackupUpload=<nil> NumberBackupsLocal=0 NumberBackupsRemote=0 duration=87ms logger=server
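You are missing the mount
volumeMounts:
- mountPath: /var/lib/clickhouse
  name: clickhouse-storage-template
inside the clickhouse-backup container. The volume is mounted only in the clickhouse container, which is why the log shows "open /var/lib/clickhouse/backup: no such file or directory".
Change the spec.containers section to: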
templates:
  podTemplates:
  - name: pod-template-with-volumes-shard
    spec:
      containers:
      - image: clickhouse-server:24.4.2-alpine
        name: clickhouse
        volumeMounts:
        - mountPath: /var/lib/clickhouse
          name: clickhouse-storage-template
      - name: clickhouse-backup
        command:
        - bash
        - -xc
        - /bin/clickhouse-backup server
        envFrom:
        - configMapRef:
            name: clickhouse-backup-config
        image: clickhouse-backup:stable
        imagePullPolicy: Always
        volumeMounts:
        - mountPath: /var/lib/clickhouse
          name: clickhouse-storage-template
        ports:
        - containerPort: 7171
          name: backup-rest
        resources:
          limits:
            cpu: "2"
            memory: 4Gi
          requests:
            cpu: "2"
            memory: 4Gi
Also replace altinity/clickhouse-backup:master
with altinity/clickhouse-backup:stable
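Once the pods have been recreated, a quick check along these lines could confirm both halves of the fix: that the clickhouse-backup container now mounts /var/lib/clickhouse, and that the integration tables exist on the server. This is only a sketch. The function names are hypothetical, the pod and container names are the ones used in this thread, and the second check assumes the default user can connect with clickhouse-client without a password.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of clickhouse-operator or clickhouse-backup.

# Succeeds if the clickhouse-backup container mounts /var/lib/clickhouse.
backup_has_data_mount() {
  local namespace="$1" pod="$2"
  kubectl get pod -n "$namespace" "$pod" \
    -o jsonpath='{.spec.containers[?(@.name=="clickhouse-backup")].volumeMounts[*].mountPath}' \
    | tr ' ' '\n' | grep -qx '/var/lib/clickhouse'
}

# Succeeds if both system.backup_list and system.backup_actions exist.
# Assumption: the default user needs no password inside the pod.
integration_tables_exist() {
  local namespace="$1" pod="$2" count
  count="$(kubectl exec -n "$namespace" "$pod" -c clickhouse -- \
    clickhouse-client -q "SELECT count() FROM system.tables WHERE database = 'system' AND name IN ('backup_list', 'backup_actions')")"
  [ "$count" = "2" ]
}

# Usage:
#   backup_has_data_mount click chi-cliff-cliffcluster-replica0-shard0-0 && echo "mount ok"
#   integration_tables_exist click chi-cliff-cliffcluster-replica0-shard0-0 && echo "tables ok"
```

Running both against each replica should show whether any pod still has the old template.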
I have been using ClickHouse for many days without any issues. Recently, I added an additional container to the pod specifically for the backup utility. Now I am getting the following error:
Code: 60. DB::Exception: Received from chi-cliff-cliffcluster-replica0-shard0.click.svc.cluster.local:9000. DB::Exception: Unknown table expression identifier 'system.backup_list' in scope SELECT name FROM system.backup_list WHERE (location = 'remote') AND (name LIKE '%chi-cliff-cliffcluster-replica0-shard0.click.svc.cluster.local%') AND (name LIKE '%full%') AND (desc NOT LIKE 'broken%') ORDER BY created DESC LIMIT 1.
Steps to Reproduce:
Expected Behavior:
The backup utility should identify and use the system.backup_list table as expected without throwing an unknown table expression identifier error.
Observed Behavior:
An error indicating an unknown table expression identifier for system.backup_list is encountered.
References:
Altinity ClickHouse Backup Examples