Open docent-net opened 4 years ago
I think this is more than useful and should definitely comes outside of the box. @docent-net any chance you can get a stab at this fairly soon ?
Sure, on it :)
@tlvenn I need to ask a question, about this. Read through ArgoCD docs (e.g. here https://argoproj.github.io/argo-cd/operator-manual/high_availability/#argocd-repo-server) and it looked like a straightforward thing - just create volumeClaimTemplates with proper configuration for the PVC and that's all.
However, repo-server can be replicated, so here it hit me, that maybe we'd need a StatefulSet here instead of Deployment, because there is some state.. or not? Unfortunately, I'm not that familiar w/ArgoCD HA, but after going through the documentation and thinking about it I have 2 assumptions to discuss:
After writing this whole post I think that the correct answer is nr.1, but just need to validate it w/you :) Thanks
Hi @docent-net , I am no expert when it comes to argo cd but my feeling is that the repo server is a cache layer and I would opt for option 1 as well.
For illustration purpose, not saying this a good idea, Grafana chart for example supports both approaches:
https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/deployment.yaml#L1 https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/statefulset.yaml#L1
I'm going w/opt1 - lets face it - the data is stored in /tmp, so let's treat it as temporary :) Thanks, will get into it
Stale issue message
@docent-net did you make any progress on this ?
I'm a little confused...I am trying to deploy ArgoCD in HA mode and use a PVC for the repo-server to avoid the disk consumption issue, described above. Of course, there is an issue with one PVC. Option 1 above mentions leave as a Deployment and use volumeClaimTemplates....but those aren't supported for Deployments, only SS. Am I missing something? Do I just convert to SS?
Can this issue be re-opened? Right now the only possible option without making changes to the helm chart is to use emptyDir
Please reopen this issue:
I'm using Longhorn replicated blockstorage, indeed it is my default storageclass and I would love to make use of it for argocd. This should be the default behavior at least for non-HA, @tlvenn gave a good example for a working non-HA/HA Chart:
For illustration purpose, not saying this a good idea, Grafana chart for example supports both approaches:
https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/deployment.yaml#L1 https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/statefulset.yaml#L1
Furthermore as @docent-net already pointed out, this Chart is partially making use of the ability to set a custom storageclass:
Additional context
I can see, that similar feature is already implemented in redis-ha:
https://kubernetes.io/docs/concepts/storage/_print/#generic-ephemeral-volumes is alpha
in 1.19
, beta
in 1.21
and stable
in 1.23
.
Please reopen this issue.
@mkilchhofer could you please take a look?
@pierluigilenoci I don't understand why someone wants to configure PV on Argo CD components.
We only implement use cases which are supported by the upstream project itself (https://github.com/argoproj/argo-cd).
One thing I am okay with is to add the ability to deploy any kind of resources (extra manifests) like the bitnami charts do.
@mkilchhofer quoted from the official docs:
argocd-repo-server clones repository into /tmp ( of path specified in TMPDIR env variable ). Pod might run out of disk space if have too many repository or repositories has a lot of files. To avoid this problem mount persistent volume.
The problem is they officially tell you to do something without telling you how to.
FWIW, I'm using my suggested approach of ephemeral volumes while keeping it a Deployment
but I know that's not available to everyone since it's in 1.21+ as beta:
ACK. That makes sense. There's already an issue upstream for that: https://github.com/argoproj/argo-cd/issues/7927
@mkilchhofer I don't have a particular fetish for PVs.
I just wish that in the presence of large caches, a situation aggravated also by some bug (https://github.com/argoproj/argo-cd/issues/4002 and https://github.com/helm/helm/issues/10583), the Repo Server pods are not evicted ('Pod The node had condition: [DiskPressure]. '
) or, even worse, other pods are evicted in the same node (exactly what happened in my case).
In any case the idea of allowing the deployment of extra manifests is always a good idea (and if you look at my history on GitHub it is something I particularly appreciate).
Thanks for your help and your time.
It might make sense to link the CNCF's Slack (stranded) discussion as well: https://cloud-native.slack.com/archives/C020XM04CUW/p1653511733823829
@docent-net any update on this?
@pierluigilenoci sure. I've been following this discussion (as well as that mentioned on Slack and regarding StatefulSets) and I'm not sure right now.
This issue is about Deployment. My original problem was that having replica=1 for Deployment type, I wanted to have persistent storage for the git cache. Thanks to that, in case of POD replacement, my infra wouldn't be impacted with traffic/performance spike due to cache rebuilding / git repos pulling on a higher scale.
And I have implemented that, and it works on my cluster correctly. But I have only one replica.
Now - with replicas>1 there's problem of AccessModes, that might be very specific. Having it set to ReadWriteMany (and that is a requirement for this workload), could mean a huge performance drop due to underlying file system configurations.
Why do I think so? Thinking about possible scenarios, where this PVC would be used I can only think of considerable caches with many, many (possible little) files (that is how git repositories looks like, thinking .git directories, indexes etc). And FS synchronization across nodes with such huge distributed disk might be really problematic. Now think of possible issues, that could occur as a side effect of it. The user has inconsistent git directories because of some consistency problem of underlying DFS.
Having this said, I'm not sure anymore, that this PR is a good idea. StatefulSet might be a better approach here.
I can still raise a PR, but I'm just afraid, that someone someday will have a big headache due to this.
@docent-net sorry for the delay in answering.
The answer you are looking for is Statefulset
.
If you have multiple repo-server replicas you must also have multiple disks, one for each replica.
So the solution for assigning 1:1 disks is Statefulset
.
Using NFS is a problem of overcomplication and poor performance. From personal experience I can say that sometimes a cache on NFS is slower than re-loading everything from scratch over the network.
The alternative solution is to use emptyDir
but it does not solve the node disk load problem.
Unless you put the disk in memory but in this case the cost increases.
The perfect solution does not exist but something needs to be done.
I've one solution to propose 1) default deploy behaviour is the current one with emptyDir
2) if a different volume definition is provided in values file then it's applied as-is, without any additional implementation, leaving final user the desired approach for volume management .
repoServer:
existingVolumes:
tmp_dir:
persistentVolumeClaim:
claimName: pvc-argocd-repo-server-tmp
I've performed such changes on my forked repo for details: https://github.com/argoproj/argo-helm/compare/main...aroundthecode:argo-helm:reposerver-pvc
With such apporach user can create PV/PVC outside helm and then attach to the deploy, or leave everything as is. It could be a quick fix allowing user to avoid diskpressure issue and going on discussing Statefulset migration
Hello there! Do you intend to open a PR for this solution so we can have it in the oficial Helm Chart? I am trying to migrate an internal chart that we use to the official one and this is one of the features that we would need in order to achieve that. For now I will create a fork but I think that this config should be in the oficial Helm Chart as it is recommended by the High Availability doc.
@aroundthecode
I've performed such changes on my forked repo for details: https://github.com/argoproj/argo-helm/compare/main...aroundthecode:argo-helm:reposerver-pvc
So, are you able to raise a PR for this, since you already have it worked out?
I ended up creating a PR to add support to deploy argocd-repo-server as a StatefulSet with PVCs based on what we used in my company: https://github.com/argoproj/argo-helm/pull/1648 Feel free to add suggestions or editing it.
@arielly-parussulo, your PR is a bit stranded...
@aroundthecode
I've performed such changes on my forked repo for details: main...aroundthecode:argo-helm:reposerver-pvc
So, are you able to raise a PR for this, since you already have it worked out?
HI all, I've tried to perform an alternative PR #2410 with my approach on this, not sure if everthing is fine, please check and advise if needed
@aroundthecode oddly enough I was probably about a week away from making a PR since we're about to start using the helm chart. Yours looks more comprehensive though. Thanks!
@snuggie12 ( or anyone) can you please help me with the PR? it seems something is missing in the docs generation but I can't realize what.
I've added the helm-docs comments to my values.yaml section but is seems something is still missing since it's telling me the documentation is outdated.
@snuggie12 ( or anyone) can you please help me with the PR? it seems something is missing in the docs generation but I can't realize what.
I've added the helm-docs comments to my values.yaml section but is seems something is still missing since it's telling me the documentation is outdated.
@aroundthecode It's running helm-docs
in the check and then doing a git diff
and seeing there's a change to be made. What it wants you to do is run helm-docs
yourself and commit the updated README.md
. According to this there should be a script which cleanly executes (it mentions a container so I think you'll need some sort of container runtime like docker or containerd/cri.) If that's an issue in your env you can always install the helm plugin yourself.
Or if you just want to cheat you can look at the diff in the GH action output and paste it in the README.md (mostly kidding as you should probably follow their instructions, but wanted to leave it as a last restort.)
Is your feature request related to a problem? Please describe.
I'm trying to mount remote storage to the argocd repo-server. I can provide volumeMounts list as well as volumes list, but I can't create PVC for those volumes using helm chart.
Describe the solution you'd like
volumes.yaml could contain a definition of StorageClass name for PVC. When provided, the volume mount would be created on top of this PVC.
Describe alternatives you've considered
Alternatively, documentation could be extended with information about using PVC and manual creation of PVC.
Additional context
I can see, that similar feature is already implemented in redis-ha: https://github.com/argoproj/argo-helm/blob/d7da8e863f30a6975b89a1bba34855d83bed59e3/charts/argo-cd/charts/redis-ha/templates/redis-ha-statefulset.yaml#L285
Also, this request is only about repo-server, but similar could be requested for Redis and other components.
If this looks like something useful, I can provide PR.