Open kkavin opened 1 year ago
We need to know if we can use a GCS bucket for the velero and restic pods.
Yes, Velero can work with a GCS bucket. https://github.com/vmware-tanzu/velero-plugin-for-gcp#setup
Will Velero work with more than 1 replica?
No. The Velero server does not work with more than 1 replica.
We have planned to add persistent storage for the velero and restic pods instead of emptyDir.
I have not tried it before, but I think it's possible and doable.
@jenting What data is written to the emptyDir path? Isn't housekeeping of this path done by Velero? I think there is temporary data under this path.
@jenting Can you please let us know what data is stored in /scratch or the emptyDir? We often see the Velero pod evicted due to disk pressure, or with an error that the node was low on ephemeral storage.
@qiuming-best could you help with this issue?
@kkavin The Velero server cannot work with more than 1 replica; it currently has concurrency issues.
The scratch dir is where Restic puts its cache, and the emptyDir is where Velero puts its third-party plugins.
The Restic cache and the third-party plugins are all temporary files, so we didn't put them into a persistent volume.
But for your problem, you could put them into a persistent volume and it would work.
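To make the suggestion above concrete, here is a minimal sketch of what swapping an emptyDir for a persistent volume means at the pod-spec level. It only manipulates a dictionary and does not talk to a cluster. The volume names `plugins` and `scratch` are an assumption about how the Velero deployment names its volumes (check your own manifest), and the claim name `velero-scratch` and the helper `use_pvc_for_volume` are hypothetical.

```python
import copy
import json


def use_pvc_for_volume(pod_spec, volume_name, claim_name):
    """Return a copy of pod_spec where the named emptyDir volume is backed by a PVC."""
    patched = copy.deepcopy(pod_spec)
    for volume in patched.get("volumes", []):
        if volume.get("name") == volume_name and "emptyDir" in volume:
            # A volume may have only one source, so drop emptyDir before adding the claim.
            del volume["emptyDir"]
            volume["persistentVolumeClaim"] = {"claimName": claim_name}
            return patched
    raise KeyError(f"no emptyDir volume named {volume_name!r}")


# Assumed shape of the volumes section of the Velero server pod spec.
pod_spec = {
    "volumes": [
        {"name": "plugins", "emptyDir": {}},
        {"name": "scratch", "emptyDir": {}},
    ]
}

# "velero-scratch" is a hypothetical PersistentVolumeClaim that must already exist.
patched = use_pvc_for_volume(pod_spec, "scratch", "velero-scratch")
print(json.dumps(patched, indent=2))
```

Since the server runs as a single replica, one claim should be enough for it. The restic pods run on several nodes, so they would likely need storage provisioned per node rather than one shared claim.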
Hi @qiuming-best and @jenting,
I just came across the same issue. As you can see, the node that the Velero pod runs on had a spike in node filesystem usage within a short time.
The pod was then evicted by the kubelet.
"kind":"Pod","namespace":"velero","name":"velero-c4844d876-bvntd","uid":"1960a28a-15e6-44da-ab2b-65bf77616020","apiVersion":"v1","resourceVersion":"452456224"},"reason":"Evicted","message":"The node was low on resource: ephemeral-storage. Threshold quantity: 5119338572, available: 4544316Ki. Container velero was using 211020Ki, request is 0, has larger consumption of ephemeral-storage.
I'm just wondering why the ephemeral storage consumed by the emptyDir grew so rapidly in such a short period. I'm sure that neither a restic backup (PV backup) nor an object backup was being performed. So when does Velero or restic store data in the emptyDir?
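The eviction message quoted above mixes plain bytes with `Ki` quantities, which makes the numbers hard to compare at a glance. The sketch below converts them to bytes; the helper `to_bytes` is hypothetical and only handles the binary suffixes needed here, not the full Kubernetes quantity syntax.

```python
import re

# Binary suffixes only; this is a simplification of Kubernetes quantities.
_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def to_bytes(quantity):
    """Convert a quantity such as '4544316Ki' or '5119338572' to bytes."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi)?", quantity)
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, suffix = match.groups()
    return int(number) * _SUFFIXES.get(suffix, 1)


# Values copied from the eviction message.
threshold = to_bytes("5119338572")
available = to_bytes("4544316Ki")
velero_usage = to_bytes("211020Ki")

shortfall = threshold - available
print(f"threshold:    {threshold / 1024**3:.2f} GiB")
print(f"available:    {available / 1024**3:.2f} GiB")
print(f"shortfall:    {shortfall / 1024**2:.0f} MiB")
print(f"velero usage: {velero_usage / 1024**2:.0f} MiB")
```

By these numbers the velero container was using roughly 206 MiB, which is less than the gap between the threshold and the available space. That suggests most of the pressure on the node came from something other than the Velero emptyDir, although the message alone cannot confirm it.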
What steps did you take and what happened: The Velero pod was evicted because the disk was full on worker nodes in GKE.
We raised a support ticket with Google Cloud regarding the pod eviction due to the storage issue in the worker node. They reported that:
Following their analysis, we have planned to add persistent storage for the velero and restic pods instead of emptyDir.
We need to know if we can use a GCS bucket for the velero and restic pods. By default, the Helm chart comes with 1 replica. Is it possible to add more than 1 replica? Will Velero work with more than 1 replica?
Environment:
helm version: v3.7.2
kubectl version: 1.23.0