Closed: nemanja1209 closed this issue 2 months ago.
Same thing here with v1.13.2. I did a backup, deleted the backup, and waited 3 days: no Kopia maintenance was done and no files were deleted. I've just activated the debug log level and will send the corresponding logs tomorrow.
I guess there is a problem with the Kopia maintenance jobs that should be executed automatically in the background. The quick cycle interval is customised (for the test) to 2 minutes. When the timer goes off, only "next run: now" is displayed and nothing happens. When I execute maintenance manually, everything works as expected: the timer is reset to 2 minutes, and when it goes off again it just shows "next run: now". The same happens with the full cycle job. Under Recent Maintenance Runs there is only one record (from when the repository was initialised).
Owner: default@default
Quick Cycle:
scheduled: true
interval: 2m0s
next run: now
Full Cycle:
scheduled: true
interval: 24h0m0s
next run: 2024-04-26 17:58:39 CEST (in 8h22m58s)
Log Retention:
max count: 10000
max age of logs: 720h0m0s
max total size: 1.1 GB
Object Lock Extension: disabled
Recent Maintenance Runs:
cleanup-epoch-manager:
2024-04-25 17:58:40 CEST (0s) SUCCESS
cleanup-logs:
2024-04-25 17:58:40 CEST (0s) SUCCESS
full-rewrite-contents:
2024-04-25 17:58:40 CEST (0s) SUCCESS
snapshot-gc:
2024-04-25 17:58:39 CEST (0s) SUCCESS
Workaround:
I created a cron job that executes kopia maintenance run --full
every 2 hours.
It simulates the manual command execution that I have confirmed works as expected.
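For reference, a minimal sketch of that workaround on a machine that is already connected to the repository with the Kopia CLI; the script path, log path and schedule below are placeholders, not the exact setup used here:

#!/bin/sh
# kopia-maintenance.sh: hypothetical wrapper script; assumes this machine was
# already connected to the repository via "kopia repository connect ...".
kopia maintenance run --full >> /var/log/kopia-maintenance.log 2>&1

It can then be scheduled with a crontab entry such as: 0 */2 * * * /usr/local/bin/kopia-maintenance.sh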
how do you start the command manually?
kubectl exec -n velero velero-66b9bc65c7-tp7jt -- "kopia maintenance --full"
Defaulted container "velero" out of: velero, velero-velero-plugin-for-aws (init), velero-velero-plugin-for-csi (init)
error: Internal error occurred: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "e4648fa3b4fd7089d18e4809b644fd56b406cad3018cbdcb9a549ba6eb085279": OCI runtime exec failed: exec failed: unable to start container process: exec: "kopia maintenance --full": executable file not found in $PATH: unknown
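My reading of that failure (not confirmed by the maintainers in this thread): quoting the whole string makes kubectl look for a single executable literally named "kopia maintenance --full", and even with correct quoting it will probably still fail, because the velero container image does not appear to ship a standalone kopia binary; that is why the thread later falls back to running the CLI outside the pod. A corrected-quoting sketch would be:

# Pass the command as separate arguments instead of one quoted string.
# This still fails if the image contains no kopia executable on PATH.
kubectl exec -n velero velero-66b9bc65c7-tp7jt -- kopia maintenance run --full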
Seems as if kopia full maintenance is working here:
time="2024-05-06T12:24:33Z" level=info msg="Running maintenance on backup repository" backupRepo=velero/backuptest-default-kopia-msvvv logSource="pkg/controller/backup_repository_controller.go:289" time="2024-05-06T12:24:35Z" level=info msg="Running full maintenance..." logModule=kopia/maintenance logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Running full maintenance..." logModule=kopia/kopia/format logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Rewriting contents from short packs..." logModule=kopia/maintenance logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Total bytes rewritten 0 B" logModule=kopia/maintenance logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Not enough time has passed since previous successful Snapshot GC. Will try again next time." logModule=kopia/maintenance logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Skipping blob deletion because not enough time has passed yet (59m59s left)." logModule=kopia/maintenance logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Cleaned up 0 logs." logModule=kopia/maintenance logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Cleaning up old index blobs which have already been compacted..." logModule=kopia/maintenance logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Finished full maintenance." logModule=kopia/maintenance logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]" time="2024-05-06T12:24:35Z" level=info msg="Finished full maintenance." logModule=kopia/kopia/format logSource="pkg/kopia/kopia_log.go:94" logger name="[shared-manager]"
For data safety reasons, the Kopia repo keeps unused data for some time before fully deleting it. Therefore, you need to keep maintenance running regularly until it is safe for the repo to delete the unused data.
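This matches the log above ("Skipping blob deletion because not enough time has passed yet (59m59s left)"). As discussed further down in the thread, that safety margin can be bypassed at your own risk:

# Skips Kopia's built-in safety delays; only do this if you are sure no other
# client is writing to the repository at the same time.
kopia maintenance run --full --safety=none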
I have the same issue. Storage keeps growing indefinitely. Even after I delete the namespace on the cluster and delete the Velero backup, the kopia/{namespace} directory on S3 is still present multiple days later, and new files are written there (even though the namespace no longer exists on the cluster). Old files don't get deleted either; storage keeps growing.
The namespace staging-oleksandr-besu-fe6e was deleted more than 24h ago. The backuprepositories resource is still present. I've deleted all backups that could be related to the staging-oleksandr-besu-fe6e namespace. In the bucket I still see new files appear on every backup schedule run:
My namespaces:
NAME STATUS AGE
btp Active 64d
btp-platform Active 34d
clustermanager Active 422d
default Active 422d
development Active 64d
ingress Active 422d
kube-node-lease Active 422d
kube-public Active 422d
kube-system Active 422d
shared Active 422d
staging-besu1n1-a5d2 Active 22d
staging-ext1n1-46d0 Active 22d
velero Active 104d
My backuprepositories:
NAME AGE REPOSITORY TYPE
staging-besu1n1-a5d2-default-kopia-q9lgw 41h kopia
staging-besu3n1-c06e-default-kopia-q8cg6 18h kopia
staging-besu4n1-d840-default-kopia-p6mp8 17h kopia
staging-besu5n1-a944-default-kopia-45lpx 16h kopia
staging-ext1n1-46d0-default-kopia-djvg5 41h kopia
staging-oleksandr-besu-fe6e-default-kopia-96bqm 40h kopia
Logs related to my namespace from Velero:
Defaulted container "velero" out of: velero, velero-plugin-for-aws (init)
time="2024-05-17T07:15:22Z" level=info msg="Running maintenance on backup repository" backupRepo=velero/staging-oleksandr-besu-fe6e-default-kopia-96bqm logSource="pkg/controller/backup_repository_controller.go:290"
time="2024-05-17T07:47:32Z" level=info msg="Processing item" backup=velero/hourly-20240517074722 logSource="pkg/backup/backup.go:365" name=staging-oleksandr-besu-fe6e-default-kopia-96bqm namespace=velero progress= resource=backuprepositories.velero.io
time="2024-05-17T07:47:32Z" level=info msg="Backing up item" backup=velero/hourly-20240517074722 logSource="pkg/backup/item_backupper.go:179" name=staging-oleksandr-besu-fe6e-default-kopia-96bqm namespace=velero resource=backuprepositories.velero.io
time="2024-05-17T07:47:32Z" level=info msg="Backed up 1118 items out of an estimated total of 1129 (estimate will change throughout the backup)" backup=velero/hourly-20240517074722 logSource="pkg/backup/backup.go:405" name=staging-oleksandr-besu-fe6e-default-kopia-96bqm namespace=velero progress= resource=backuprepositories.velero.io
time="2024-05-17T07:50:23Z" level=info msg="invoking DeleteItemAction plugins" item=staging-oleksandr-besu-fe6e-default-kopia-96bqm logSource="internal/delete/delete_item_action_handler.go:116" namespace=velero
Why does it keep writing files to the folder in the bucket, even though the namespace doesn't exist? And why is it still not deleted? As I understand it, Kopia runs full maintenance every 24h, which means we should not see this folder in the bucket at all?
I think you have to delete all backuprepositories if you want to stop new files being written to the Kopia repository (see the kubectl sketch after the describe output below). Also, my guess is that the files come from the maintenance job:
kubectl describe backuprepositories.velero.io mq-default-kopia-8tk9v
Name: mq-default-kopia-8tk9v
Namespace: velero
Labels: velero.io/repository-type=kopia
velero.io/storage-location=default
velero.io/volume-namespace=mq
Annotations: <none>
API Version: velero.io/v1
Kind: BackupRepository
Metadata:
Creation Timestamp: 2024-05-17T07:26:50Z
Generate Name: mq-default-kopia-
Generation: 4
Managed Fields:
API Version: velero.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:generateName:
f:labels:
.:
f:velero.io/repository-type:
f:velero.io/storage-location:
f:velero.io/volume-namespace:
f:spec:
.:
f:backupStorageLocation:
f:maintenanceFrequency:
f:repositoryType:
f:resticIdentifier:
f:volumeNamespace:
f:status:
.:
f:lastMaintenanceTime:
f:message:
f:phase:
Manager: velero-server
Operation: Update
Time: 2024-05-17T08:27:15Z
Resource Version: 3848152
UID: 6013f45b-4d61-46e6-83aa-67a4ab507898
Spec:
Backup Storage Location: default
Maintenance Frequency: 1h0m0s
Repository Type: kopia
Restic Identifier: s3:https://name.mydomain.com:9000/velero-dev/restic/mq
Volume Namespace: mq
Status:
Last Maintenance Time: 2024-05-17T07:26:51Z
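If the goal is to stop Velero writing new files for a namespace that no longer exists, the suggestion above amounts to removing the leftover BackupRepository objects. A sketch using a resource name from this thread (adjust to your own resources; note this does not by itself delete data already in the bucket):

# Remove the BackupRepository custom resource for the deleted namespace.
kubectl delete backuprepositories.velero.io -n velero staging-oleksandr-besu-fe6e-default-kopia-96bqm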
Can you paste the output of the following command (while backups are working and files are not being deleted):
kopia maintenance info
Before that step, you have to connect to the Kopia repository; example command:
kopia repository connect s3 --endpoint name.mydomain.com:{portifneeded} --bucket bucket-name --access-key your-key --secret-access-key your-secret --disable-tls-verification --prefix kopia/namespace/ --password 'static-passw0rd'
The last argument, the password, can be found in the velero namespace in the secret velero-repo-credentials.
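A sketch for pulling that password straight from the cluster; the secret name comes from this thread, while the data key repository-password is my assumption about how Velero stores it, so check the secret's keys first:

# List the keys in the secret, then decode the repository password
# (key name "repository-password" is assumed).
kubectl get secret -n velero velero-repo-credentials -o jsonpath='{.data}'
kubectl get secret -n velero velero-repo-credentials -o jsonpath='{.data.repository-password}' | base64 -d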
Hey @nemanja1209. I have dynamic namespaces on my environments, so I have a lot of backuprepositories all the time. I thought that when I delete the backups tied to a backuprepository, the backuprepository would be deleted automatically on the next Kopia full maintenance run? If that's not true, it means I need to clean up already-deleted namespaces manually all the time?
apiVersion: velero.io/v1
kind: BackupRepository
metadata:
creationTimestamp: "2024-05-15T15:06:34Z"
generateName: staging-oleksandr-besu-fe6e-default-kopia-
generation: 46
labels:
velero.io/repository-type: kopia
velero.io/storage-location: default
velero.io/volume-namespace: staging-oleksandr-besu-fe6e
name: staging-oleksandr-besu-fe6e-default-kopia-96bqm
namespace: velero
resourceVersion: "198127217"
uid: 5f125d9d-0d3d-46fd-8329-fef7c54e77d3
spec:
backupStorageLocation: default
maintenanceFrequency: 1h0m0s
repositoryType: kopia
resticIdentifier: s3:s3-eu-central-1.amazonaws.com/mybucket/restic/staging-oleksandr-besu-fe6e
volumeNamespace: staging-oleksandr-besu-fe6e
status:
lastMaintenanceTime: "2024-05-17T10:18:35Z"
phase: Ready
> kopia maintenance info
Owner: default@default
Quick Cycle:
scheduled: true
interval: 1h0m0s
next run: now
Full Cycle:
scheduled: true
interval: 24h0m0s
next run: 2024-05-17 19:10:59 EEST (in 5h29m28s)
Log Retention:
max count: 10000
max age of logs: 720h0m0s
max total size: 1.1 GB
Object Lock Extension: disabled
Recent Maintenance Runs:
full-drop-deleted-content:
2024-05-16 19:10:59 EEST (0s) SUCCESS
full-rewrite-contents:
2024-05-15 19:10:59 EEST (0s) SUCCESS
snapshot-gc:
2024-05-16 19:10:59 EEST (0s) SUCCESS
2024-05-15 19:10:59 EEST (0s) SUCCESS
cleanup-epoch-manager:
2024-05-16 19:11:00 EEST (0s) SUCCESS
2024-05-15 19:10:59 EEST (0s) SUCCESS
cleanup-logs:
2024-05-16 19:11:00 EEST (0s) SUCCESS
2024-05-15 19:10:59 EEST (0s) SUCCESS
full-delete-blobs:
2024-05-16 19:11:00 EEST (0s) SUCCESS
BTW, can we adjust those values via the Velero Helm chart? I didn't find how to do it.
BackupRepository is a CRD-backed resource, and Helm never deletes CRDs: deleting a CRD automatically deletes all of that CRD's resources across all namespaces in the cluster, so Helm deliberately leaves CRDs alone.
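On the original question of tuning the frequency: the BackupRepository spec shown above has a maintenanceFrequency field, so one hedged option is to patch it directly, as sketched below. Velero also has a --default-repo-maintain-frequency server flag (check velero server --help for your version); whether and how the Helm chart exposes that flag is something to verify in the chart's values.

# Patch the per-repository maintenance frequency (resource name taken from this thread).
kubectl patch backuprepositories.velero.io mq-default-kopia-8tk9v -n velero \
  --type merge -p '{"spec":{"maintenanceFrequency":"6h0m0s"}}'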
Also, my guess about the files not being deleted is that the maintenance job is not executing the way it should. After connecting to the repository you can try to execute this command manually (to speed up the maintenance schedule) and check whether the files from previous backups are still there.
kopia maintenance run --full --safety=none
The expected result is to have only files from backups that have not expired yet; all other files should be removed. In our environment this workaround works as expected; the only problem is that we had to create a cron job to run this command on a schedule. As you can see,
Quick Cycle:
scheduled: true
interval: 1h0m0s
next run: now
suggests that maintenance should run immediately, but it does not.
@nemanja1209 any tip on how to run full maintenance?
kopia maintenance run --full --safety=none
ERROR maintenance must be run by designated user: default@default
I didn't find another way to set the owner.
Try with this one:
kopia maintenance set --owner=me
Don't change anything in it; "me" means the current user will get the permission.
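One hedged caveat: Velero itself appears to run maintenance as the repository owner shown earlier (default@default), so after finishing manual runs it is probably wise to hand ownership back so the built-in maintenance is not blocked by the same "designated user" check:

# Restore the original maintenance owner after manual runs.
kopia maintenance set --owner=default@default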
Thx. Now I can run full maintenance. The log of execution:
Running full maintenance...
Looking for active contents...
Looking for unreferenced contents...
GC found 72 unused contents (18.6 MB)
GC found 0 unused contents that are too recent to delete (0 B)
GC found 0 in-use contents (0 B)
GC found 57 in-use system-contents (32.6 KB)
Rewriting contents from short packs...
Total bytes rewritten 18.6 MB
Found safe time to drop indexes: 2024-05-17 14:20:39.478219 +0300 EEST
Dropping contents deleted before 2024-05-17 14:20:39.478219 +0300 EEST
Looking for unreferenced blobs...
Deleted total 73 unreferenced blobs (18.9 MB)
Compacting an eligible uncompacted epoch...
Cleaning up no-longer-needed epoch markers...
Attempting to compact a range of epoch indexes ...
Cleaning up unneeded epoch markers...
Cleaning up old index blobs which have already been compacted...
Cleaned up 0 logs.
Finished full maintenance.
Indeed some files were deleted from the bucket, but I still see a lot of old _log.* files. As I understand it, it's not possible to clean them up with kopia?
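For what it's worth, the maintenance info output earlier in this thread shows "max age of logs: 720h0m0s", i.e. Kopia only cleans up its _log blobs once they are about 30 days old, which would explain why they linger. If you want them gone sooner, kopia maintenance set has log-retention options; the exact flag names below are from memory, so verify them before relying on them:

# Flag names assumed; confirm with: kopia maintenance set --help
kopia maintenance set --max-retained-log-age=72h --max-retained-log-count=1000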
The only way to clean it up fully:
And this process can be done only manually, am I right?
That is the only way as far as I know. I would highly appreciate it if someone could tell us a better way :)
Hello, I have a question: how do I run the kopia maintenance run --full command if I installed Velero using the chart?
Like @thomasklosinsky, I tried with kubectl but I got this error:
I'm facing the issue that it's been a month, and the Kopia repository in S3 still keeps increasing in size.
I did it with the Kopia CLI. First, I installed it.
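A minimal install sketch for a Linux x64 host; the release version and archive name here are hypothetical placeholders, so pick the current release from https://github.com/kopia/kopia/releases (apt/yum/brew packages also exist):

# Download a Kopia release binary and put it on PATH (version and asset name are placeholders).
curl -LO https://github.com/kopia/kopia/releases/download/v0.17.0/kopia-0.17.0-linux-x64.tar.gz
tar xzf kopia-0.17.0-linux-x64.tar.gz
sudo install -m 0755 kopia-0.17.0-linux-x64/kopia /usr/local/bin/kopia
kopia --version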
Then you need to connect to the repository and change the repository owner to me as described in the messages above.
kopia repository connect s3 --endpoint name.mydomain.com:{portifneeded} --bucket bucket-name --access-key your-key --secret-access-key your-secret --disable-tls-verification --prefix kopia/namespace/ --password 'static-passw0rd'
kopia maintenance set --owner=me
kopia maintenance run --full
So in my scenario, let's say I have a pod that manages all Velero operations and can interact with the Velero API; call it "toolbox". I should install the Kopia CLI in this pod too, so I can connect to my repo?
We have a similar situation; the only difference is that we have a VM instead of a pod. So I guess you can try it that way.
Ok thanks! I'll try that ! Thank you so much ;)
Can you confirm that in your case, when you ran the kopia maintenance command, it cleaned the kopia/ folder in the S3 bucket?
Yes, but it depends on the arguments you enter.
If you set safety=none, i.e. kopia maintenance run --full --safety=none, then it should clean the repository immediately.
If you run just kopia maintenance run --full, Kopia has a mechanism that calculates whether it is safe to delete all unused objects, so you will need to wait some more time.
OK thanks, I'll try it right away!
I ran the command, and I got this error:
Do you know how to fix it? It's my first time using Kopia.
Please post your kopia repository connect command
I messed up, I didn't put the correct endpoint. But now I get a different error; this is the command that I ran:
kopia repository connect s3 --endpoint s3.eu-west-2.wasabisys.com --bucket xxxxxxxxxxx --access-key xxxxxxxxxxxxxx --secret-access-key xxxxxxxxxxxxxxxxxxxxxx --prefix=kopia/namespace/ --password 'XXXXXXXXXXX'
And below is the error:
Could be a wrong prefix. Did you literally put the word namespace, or did you replace it with the real name of the namespace? It should be the path as it appears in S3.
I put the real name of the namespace; I only replaced it as an example when pasting here. So I should start it like s3://bucket/prefix?
No, I was asking about the prefix=kopia/namespace/ part; namespace should be changed to the real one.
Yes, I changed it to the real one, exactly as in my S3 bucket.
Can you check the path of the Kopia files in the bucket? Here is an example: the bucket name is testbucket and the prefix is kopia/prometheus/.
Also, I think the "=" character should not be there.
OK, I'll try to remove it; I put it there following the Kopia documentation. Thanks for the insight.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.
This issue was closed because it has been stalled for 14 days with no activity.
As #6916 was closed without a real solution, I'm opening it again.
Here is some additional information about my config:
Velero server version: 1.13
Backup storage location:
Backup storage repository:
There is a backup schedule job:
It creates a backup every 6 hours and the TTL is 7 hours. That means that at some point I have two backups (for one hour), and after that period only one is left.
The output of velero backup get looks good; there is only one backup (or two, if you execute the command within one hour after the creation of the latest backup):
The PVC being backed up is fairly big, around 30 GB. After 5 days, MinIO showed around 58 GB of used space, and when I looked into the bucket there were still files from the first backup. Now, after 7 days, MinIO shows around 77 GB of used space.
Currently there are about 5300 files in the bucket, and maybe more than half of them are from the first backup. It seems that the first backup took about 15 minutes, because its first file was created at 20:00 and its last one at 20:16.
Over these 7 days, around 26 backups were executed (every 6 hours), and each one (except the first) didn't take more than 2 minutes (the difference between its first and last created file). These files are also still present in the bucket.
So the remaining 25 backups hold roughly half of the files in the bucket and the first backup holds the other half (rough estimate). It is as if the bucket were doing some kind of versioning (incremental backup), yet object locking (versioning) is disabled for this bucket.
When I disable the schedule, delete all backups and wait for a few days, the Kopia snapshots are still in the bucket under the path ${bucket_name}/kopia/${namespace}. Everything within ${bucket_name}/backups/ is deleted as expected.
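To see whether anything in that leftover kopia/${namespace} prefix is still referenced, one option (a sketch reusing the connect command from earlier in the thread, with placeholders for the endpoint and credentials) is to point the Kopia CLI at that prefix and list what the repository itself still knows about:

# Connect to the leftover prefix (fill in your own endpoint/bucket/credentials),
# then check which snapshots and what maintenance state remain.
kopia repository connect s3 --endpoint <minio-endpoint> --bucket <bucket_name> \
  --access-key <key> --secret-access-key <secret> --prefix kopia/<namespace>/ --password '<repo-password>'
kopia snapshot list --all
kopia maintenance info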