pingcap / tidb

Backup clean task fails using Azure storage account and Kubernetes #54849

Open viniciusvarzea opened 1 month ago

viniciusvarzea commented 1 month ago

Bug Report

1. Minimal reproduce step (Required)

Create a TiDB cluster on AKS, then create a BackupSchedule YAML file.

Set the spec.backupTemplate.cleanPolicy field to Delete.

When the cleanup job runs on the AKS cluster, it fails. Looking at the storage account, I can see that all files are deleted, but all the folders remain (empty), which causes the cleanup job to report a failed state.
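For anyone trying to reproduce this, here is a rough sketch of what such a BackupSchedule could look like, built as a Python dict and dumped as JSON (which kubectl also accepts). This is not the reporter's actual manifest. Only spec.backupTemplate.cleanPolicy comes from the report; the container and prefix are read off the azure:// URL in the cleanup log, and the schedule name, secret name, cron expression, and retention are hypothetical placeholders. The other field names are recalled from the TiDB Operator docs and should be checked against the installed CRD.

```python
import json


def build_backup_schedule(name, namespace, cluster, cluster_namespace):
    """Build a BackupSchedule manifest with cleanPolicy set to Delete.

    All argument values used below are assumptions for illustration.
    """
    return {
        "apiVersion": "pingcap.com/v1alpha1",
        "kind": "BackupSchedule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "maxReservedTime": "192h",   # placeholder retention
            "schedule": "0 6 * * *",     # placeholder cron expression
            "backupTemplate": {
                # The setting from the report that triggers the cleanup job.
                "cleanPolicy": "Delete",
                "br": {
                    "cluster": cluster,
                    "clusterNamespace": cluster_namespace,
                },
                "azblob": {
                    "secretName": "azblob-secret",  # hypothetical secret name
                    "container": "tidb-backups",    # from the azure:// URL in the log
                    "prefix": "full-backup",        # from the azure:// URL in the log
                },
            },
        },
    }


manifest = build_backup_schedule(
    name="schedule-backup-tidb-cluster-01",   # guessed from the backup name
    namespace="tidb-cluster-01-backup",
    cluster="tidb-cluster-01",                # guessed from the PD address in the log
    cluster_namespace="tidb-cluster-01",      # guessed from the PD address in the log
)
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with kubectl should create the schedule, assuming TiDB Operator and the referenced secret are already in place.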

2. What did you expect to see? (Required)

The cleanup job finish succesfully.

3. What did you see instead (Required)

The cleanup job fails.

4. What is your TiDB version? (Required)

v8.1.0

Release Version: v8.1.0
Edition: Community
Git Commit Hash: 945d07c5d5c7a1ae212f6013adfb187f2de24b23
Git Branch: HEAD
UTC Build Time: 2024-05-21 03:54:24
GoVersion: go1.21.10
Race Enabled: false
Check Table Before Drop: false
Store: tikv

viniciusvarzea commented 1 month ago

It seems related to the rclone config for Azure Blob Storage; please see: https://github.com/rclone/rclone/issues/7047

The logs from the cleanup task container:

Create rclone.conf file.
/tidb-backup-manager clean --namespace=tidb-cluster-01-backup --backupName=schedule-backup-tidb-cluster-01-2024-07-17t06-00-00
Sleeping for 10 seconds before clean...
I0725 06:15:10.839362 9 clean.go:69] start to clean backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00
I0725 06:15:10.840007 9 clean.go:170] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 1, start to clean backup with opt: {PageSize:10000 RetryCount:5 BackoffEnabled:false BatchDeleteOption:{DisableBatchConcurrency:false BatchConcurrency:10 RoutineConcurrency:100} SnapshotsDeleteRatio:0}
I0725 06:15:10.927034 9 clean.go:189] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 1-1, try to delete 4 objects
E0725 06:15:11.002782 9 clean.go:203] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 1-1, delete 4 objects failed
I0725 06:15:11.002816 9 clean.go:223] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 1, clean backup finished, total:4 deleted:0 failed:4
E0725 06:15:11.002825 9 clean.go:163] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 1, failed to clean backup: some objects failed to be deleted
I0725 06:15:11.002830 9 clean.go:170] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 2, start to clean backup with opt: {PageSize:10000 RetryCount:5 BackoffEnabled:false BatchDeleteOption:{DisableBatchConcurrency:false BatchConcurrency:10 RoutineConcurrency:100} SnapshotsDeleteRatio:0}
I0725 06:15:11.047452 9 clean.go:189] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 2-1, try to delete 4 objects
E0725 06:15:11.109367 9 clean.go:203] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 2-1, delete 4 objects failed
I0725 06:15:11.109402 9 clean.go:223] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 2, clean backup finished, total:4 deleted:0 failed:4
E0725 06:15:11.109411 9 clean.go:163] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 2, failed to clean backup: some objects failed to be deleted
I0725 06:15:11.109436 9 clean.go:170] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 3, start to clean backup with opt: {PageSize:10000 RetryCount:5 BackoffEnabled:false BatchDeleteOption:{DisableBatchConcurrency:false BatchConcurrency:10 RoutineConcurrency:100} SnapshotsDeleteRatio:0}
I0725 06:15:11.156129 9 clean.go:189] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 3-1, try to delete 4 objects
E0725 06:15:11.192233 9 clean.go:203] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 3-1, delete 4 objects failed
I0725 06:15:11.192262 9 clean.go:223] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 3, clean backup finished, total:4 deleted:0 failed:4
E0725 06:15:11.192270 9 clean.go:163] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 3, failed to clean backup: some objects failed to be deleted
I0725 06:15:11.192275 9 clean.go:170] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 4, start to clean backup with opt: {PageSize:10000 RetryCount:5 BackoffEnabled:false BatchDeleteOption:{DisableBatchConcurrency:false BatchConcurrency:10 RoutineConcurrency:100} SnapshotsDeleteRatio:0}
I0725 06:15:11.223305 9 clean.go:189] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 4-1, try to delete 4 objects
E0725 06:15:11.262858 9 clean.go:203] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 4-1, delete 4 objects failed
I0725 06:15:11.262905 9 clean.go:223] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 4, clean backup finished, total:4 deleted:0 failed:4
E0725 06:15:11.262913 9 clean.go:163] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 4, failed to clean backup: some objects failed to be deleted
I0725 06:15:11.262918 9 clean.go:170] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 5, start to clean backup with opt: {PageSize:10000 RetryCount:5 BackoffEnabled:false BatchDeleteOption:{DisableBatchConcurrency:false BatchConcurrency:10 RoutineConcurrency:100} SnapshotsDeleteRatio:0}
I0725 06:15:11.303037 9 clean.go:189] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 5-1, try to delete 4 objects
E0725 06:15:11.333956 9 clean.go:203] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 5-1, delete 4 objects failed
I0725 06:15:11.333988 9 clean.go:223] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 5, clean backup finished, total:4 deleted:0 failed:4
E0725 06:15:11.333996 9 clean.go:163] For backup tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 clean 5, failed to clean backup: some objects failed to be deleted
E0725 06:15:11.334005 9 manager.go:102] clean cluster tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 backup azure://tidb-backups/full-backup/tidb-cluster-01-pd.tidb-cluster-01-2379-2024-07-17t06-00-00/ failed, err: some objects failed to be deleted
error: some objects failed to be deleted
Sleeping for 10 seconds before exit...
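The log is long but repetitive: each of the five retries lists the same 4 objects and fails to delete all of them. In case it helps with triage, here is a small sketch that pulls the per-attempt summary out of such a log. The function name and the regular expression are my own; they are written against the "clean backup finished" lines shown above and are not part of tidb-backup-manager.

```python
import re

# Matches the per-attempt summary lines seen in the log, e.g.
# "... clean 3, clean backup finished, total:4 deleted:0 failed:4"
SUMMARY_RE = re.compile(
    r"clean (?P<attempt>\d+), clean backup finished, "
    r"total:(?P<total>\d+) deleted:(?P<deleted>\d+) failed:(?P<failed>\d+)"
)


def summarize_clean_attempts(log_text):
    """Return {attempt_number: (total, deleted, failed)} for each retry."""
    attempts = {}
    # finditer scans the whole text, so it also works if the log
    # was pasted as one long line.
    for match in SUMMARY_RE.finditer(log_text):
        attempts[int(match["attempt"])] = (
            int(match["total"]),
            int(match["deleted"]),
            int(match["failed"]),
        )
    return attempts


# Two summary lines copied from the log above.
sample = (
    "I0725 06:15:11.002816 9 clean.go:223] For backup "
    "tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 "
    "clean 1, clean backup finished, total:4 deleted:0 failed:4\n"
    "I0725 06:15:11.333988 9 clean.go:223] For backup "
    "tidb-cluster-01-backup/schedule-backup-tidb-cluster-01-2024-07-17t06-00-00 "
    "clean 5, clean backup finished, total:4 deleted:0 failed:4\n"
)

for attempt, (total, deleted, failed) in sorted(summarize_clean_attempts(sample).items()):
    print(f"attempt {attempt}: total={total} deleted={deleted} failed={failed}")
```

Run against the full log, every attempt should come back with deleted equal to 0, which matches the observation that only the 4 leftover entries (presumably the empty folders) are still being listed.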

viniciusvarzea commented 1 month ago

@tirsen @gregwebs

viniciusvarzea commented 4 weeks ago

If you guys need more details, please feel free to ask.

viniciusvarzea commented 3 weeks ago

One more piece of information: the cleanup process fails to delete the empty folders for both the log backup and the full backup.