argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

ArgoCD 2.1.8 repo server filling up helm cache #8773

Closed vvoinea-gpsw closed 2 years ago

vvoinea-gpsw commented 2 years ago

Describe the bug

After upgrading from ArgoCD 1.8 to 2.1.8, we are seeing the argocd-repo-server pods fill the emptyDir volume helm-working-dir with about 100GB in 2 days. The node disk fills up (emptyDir maps to host disk OR RAM) and pods get evicted. We tried to limit the cache using the obvious parameters, reposerver.default.cache.expiration=1h or reposerver.repo.cache.expiration=2h, but neither stops new <some-hash>charts.yaml and <some-hash>index.yaml files from being created in /helm-working-dir/repository on the pod every 3 minutes. Since emptyDir can also map to RAM, we have also seen higher memory usage, similar to https://github.com/argoproj/argo-cd/issues/8698

To Reproduce

Install 2.1.8 and monitor disk and memory usage

Expected behavior

There should be some parameter to limit or rotate these cached chart files. Also, if the volume is known to fill up this quickly (due to a perfect storm of a large chart repo and a small node disk), could the volume size be limited out of the box by changing the volume setup?
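On the question of capping the volume: Kubernetes lets an `emptyDir` volume declare a `sizeLimit`. The following is a minimal sketch that builds a strategic merge patch for the repo-server Deployment. The volume name `helm-working-dir` comes from the report above; the `2Gi` value and the helper name `build_size_limit_patch` are assumptions for illustration only. A pod that exceeds the limit is evicted, so this protects the node rather than stopping the growth.

```python
import json


def build_size_limit_patch(volume_name: str, size_limit: str) -> dict:
    """Build a strategic merge patch that sets emptyDir.sizeLimit on one volume.

    Pod volumes are merged by their "name" key, so only the named volume
    is changed; other volumes in the pod template are left alone.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "volumes": [
                        {"name": volume_name, "emptyDir": {"sizeLimit": size_limit}}
                    ]
                }
            }
        }
    }


# "2Gi" is an illustrative value, not a recommendation from this thread.
patch_json = json.dumps(build_size_limit_patch("helm-working-dir", "2Gi"))
print(patch_json)
```

The printed JSON could then be applied with something like `kubectl patch deployment argocd-repo-server --type strategic -p '<json>'`, in whichever namespace Argo CD runs.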

Version

argocd-server: v2.1.8+2fdaf7a

crenshaw-dev commented 2 years ago

@vvoinea-gpsw there were some small changes around chart caching in recent 2.1.x versions. Are you able to upgrade to 2.1.12? I don't have much confidence that it will solve the problem, but for relatively little effort I think it's a good starting point.

vvoinea-gpsw commented 2 years ago

@crenshaw-dev I just jumped ahead to 2.2.0, as I saw some cache changes in there too. I'm seeing the same caching, which can't be controlled by any of these parameters: --default-cache-expiration duration, --repo-cache-expiration, --revision-cache-expiration

I don't understand why nobody else is seeing this? This behavior surely wasn't present on 2.0.0

crenshaw-dev commented 2 years ago

@vvoinea-gpsw I think the caching changes I'm thinking of were later in the 2.2 series. If you're able to go to the latest patch, that would be good.

But I'm not really hopeful about that change. I'll need to come back to this next week, because I'll need to read the caching code to get a good feeling for how to reproduce the issue.

vvoinea-gpsw commented 2 years ago

Thank you @crenshaw-dev for looking into this. I'll also be off next week, but I'm looking forward to seeing what you find. I will try to upgrade to the latest version and look for any rotation of the caches. One more detail: the YAML files being created every 3 minutes contain the FULL YAML of ALL the Helm charts in our private repo, and multiple files are created at each 3-minute interval.

vvoinea-gpsw commented 2 years ago

Hi @crenshaw-dev, I upgraded to the latest release, 2.3.3, where I still see a large number of <some-hash>-index.yaml and <same-hash>-charts.txt files being stored every 3 minutes. The odd thing is that at each interval we get multiple files for each extension, and the files with the same extension all have identical content (checked MD5 hashes). I'm going to monitor it for a few days to see if the variables actually take effect and rotate these cached files.
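For anyone who wants to repeat the MD5 comparison described above, here is a minimal sketch that groups the files in a directory by content hash. The function names are hypothetical, and the directory to scan (for example a copy of /helm-working-dir/repository) is whatever you pass in.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_md5(directory: str) -> dict[str, list[str]]:
    """Map each MD5 digest to the sorted names of regular files with that content."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.md5()
        with path.open("rb") as handle:
            # Read in chunks so large index files are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        groups[digest.hexdigest()].append(path.name)
    return dict(groups)


def duplicates(directory: str) -> list[list[str]]:
    """Return only the groups that contain more than one file."""
    return [names for names in group_by_md5(directory).values() if len(names) > 1]
```

Calling `duplicates("/helm-working-dir/repository")` inside the pod, or on a copy of the directory, returns one list per set of byte-identical files.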

Please let me know if you had time to replicate this issue

pepol commented 2 years ago

Hi, we experienced this issue last week. In our case, we were replacing an expired private Git token and had to restart the repo server instances. After the restart, the emptyDir volume started filling up, with new <hash>-charts.txt and <hash>-index.yaml files appearing every ~3min.

We noticed that reposerver had error logs about some helm charts not being found (specifically redis, external-dns, etc. from bitnami helm repositories). We were depending on old chart versions which seem to have already been removed from the (updated) index in the repos, thus reposerver wasn't able to find them (while they were actually deployed by Argo in the cluster already).

After bumping the applications to available helm chart versions, we stopped seeing these errors and the files stopped appearing / volume stopped getting filled excessively.

crenshaw-dev commented 2 years ago

So it seems to be limited to when the Helm chart no longer exists. Sounds like there's some cleanup logic that gets missed in that case.

Please 👍 this issue if it's affecting you. I don't have time to investigate at the moment, but more thumbs up will help the issue get attention. :-)

RandGenXYZ commented 2 years ago

Encountered the same problem with a chart removed by bitnami: See issue #9665 where I detailed the behaviour of the bug on my end

jmmclean commented 2 years ago

I believe I am hitting this issue as well. As this dir fills up, my memory usage gets really high too. Doing a rolling restart of the argocd-repo-server resolves the issue. Also, resolving the Helm chart issues that cause an Unknown status mitigates the symptoms.

thanks to @crenshaw-dev for all your help along the way!!!

CNCF link https://cloud-native.slack.com/archives/C01TSERG0KZ/p1655221266746929

running on v2.3.3

jmmclean commented 2 years ago

@crenshaw-dev this really needs some TLC. It occurs when at least one Helm app is in an Unknown state. My group is self-service, so this happens from time to time, especially as people learn. Below is my memory usage. I'm having to run kubectl rollout restart deployment argocd-repo-server to bring memory usage back down.

image

crenshaw-dev commented 2 years ago

@jmmclean yikes. I'm still short on time. If anyone has repo-server logs mentioning "-index.yaml and -charts.txt," that would really help me pinpoint the problem code and get a patch out sooner. Bonus points for logs with LOG_LEVEL set to debug. :-)

PatrickZuell commented 2 years ago

Hello, we had the same issue with our redis chart. We changed the repo to "https://raw.githubusercontent.com/bitnami/charts/pre-2022/bitnami", which temporarily resolved the issue. We now plan to update the Helm chart.

jmmclean commented 2 years ago

@crenshaw-dev I enabled debug and got the logs below (they didn't seem too helpful, but I will let you decide that).

Command to acquire logs: stern argocd-repo | grep -v "with code OK" | grep -v "manifest cache hit" | grep -v "' resolved to"

Logs (scrubbed of company PII):

argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:48:59Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://charts.ops.mycompany.com,Path:,TargetRevision:911,Helm:&ApplicationSourceHelm{ValueFiles:[dev.yaml],Parameters:[]HelmParameter{HelmParameter{Name:global.container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},HelmParameter{Name:container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:af-execution-api,}/911"
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:48:59Z" level=info msg="helm pull --destination /tmp/43235af3-354b-437e-b356-1706cf3898ce --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api" dir= execID=c25b5
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:07Z" level=debug duration=8.173533236s execID=c25b5
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:07Z" level=error msg="`helm pull --destination /tmp/43235af3-354b-437e-b356-1706cf3898ce --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" execID=c25b5
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:07Z" level=info msg=Trace args="[helm pull --destination /tmp/43235af3-354b-437e-b356-1706cf3898ce --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api]" dir= operation_name="exec helm" time_ms=8173.745909000001
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:07Z" level=error msg="finished unary call with code Unknown" error="`helm pull --destination /tmp/43235af3-354b-437e-b356-1706cf3898ce --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" grpc.code=Unknown grpc.method=GenerateManifest grpc.service=repository.RepoServerService grpc.start_time="2022-07-21T12:48:59Z" grpc.time_ms=8175.766 span.kind=server system=grpc
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:34Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://charts.ops.mycompany.com,Path:,TargetRevision:911,Helm:&ApplicationSourceHelm{ValueFiles:[dev.yaml],Parameters:[]HelmParameter{HelmParameter{Name:global.container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},HelmParameter{Name:container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:af-execution-api,}/911"
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:34Z" level=info msg="helm pull --destination /tmp/18d17f44-e95c-4590-b4c8-7f4c2fd498e0 --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api" dir= execID=f1f5d
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:42Z" level=debug duration=8.068494871s execID=f1f5d
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:42Z" level=error msg="`helm pull --destination /tmp/18d17f44-e95c-4590-b4c8-7f4c2fd498e0 --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" execID=f1f5d
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:42Z" level=info msg=Trace args="[helm pull --destination /tmp/18d17f44-e95c-4590-b4c8-7f4c2fd498e0 --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api]" dir= operation_name="exec helm" time_ms=8068.727577000001
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:42Z" level=error msg="finished unary call with code Unknown" error="`helm pull --destination /tmp/18d17f44-e95c-4590-b4c8-7f4c2fd498e0 --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" grpc.code=Unknown grpc.method=GenerateManifest grpc.service=repository.RepoServerService grpc.start_time="2022-07-21T12:49:34Z" grpc.time_ms=8070.155 span.kind=server system=grpc
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:49Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://charts.ops.mycompany.com,Path:,TargetRevision:911,Helm:&ApplicationSourceHelm{ValueFiles:[dev.yaml],Parameters:[]HelmParameter{HelmParameter{Name:global.container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},HelmParameter{Name:container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:af-execution-api,}/911"
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:49Z" level=info msg="helm pull --destination /tmp/f8333e7b-23a9-4ecf-a791-1d2a9401498f --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api" dir= execID=a5633
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:58Z" level=debug duration=8.903038459s execID=a5633
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:58Z" level=error msg="`helm pull --destination /tmp/f8333e7b-23a9-4ecf-a791-1d2a9401498f --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" execID=a5633
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:58Z" level=info msg=Trace args="[helm pull --destination /tmp/f8333e7b-23a9-4ecf-a791-1d2a9401498f --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api]" dir= operation_name="exec helm" time_ms=8903.253155
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:58Z" level=error msg="finished unary call with code Unknown" error="`helm pull --destination /tmp/f8333e7b-23a9-4ecf-a791-1d2a9401498f --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" grpc.code=Unknown grpc.method=GenerateManifest grpc.service=repository.RepoServerService grpc.start_time="2022-07-21T12:49:49Z" grpc.time_ms=8905.308 span.kind=server system=grpc
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:04Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://charts.ops.mycompany.com,Path:,TargetRevision:911,Helm:&ApplicationSourceHelm{ValueFiles:[dev.yaml],Parameters:[]HelmParameter{HelmParameter{Name:global.container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},HelmParameter{Name:container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:af-execution-api,}/911"
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:04Z" level=info msg="helm pull --destination /tmp/9380dea1-9251-44de-9d48-09b122597aed --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api" dir= execID=b5546
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:12Z" level=debug duration=7.858755533s execID=b5546
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:12Z" level=error msg="`helm pull --destination /tmp/9380dea1-9251-44de-9d48-09b122597aed --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" execID=b5546
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:12Z" level=info msg=Trace args="[helm pull --destination /tmp/9380dea1-9251-44de-9d48-09b122597aed --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api]" dir= operation_name="exec helm" time_ms=7859.018099

Note: I had to do a rolling restart for the log level to take effect, so the memory issue is temporarily mitigated, however it will climb since there is a helm chart in an Unknown status

image
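The repeated `not found in ... repository` errors in the logs above identify which chart version is missing. As a rough sketch, the lines can be reduced to a de-duplicated list of offending charts; the regular expression is written against the log format shown in this thread and is an assumption about any other Helm or Argo CD version, and `missing_charts` is a hypothetical helper name.

```python
import re

# Matches Helm's "not found" error as it appears in the repo-server logs above,
# with or without backslash-escaped quotes.
NOT_FOUND = re.compile(
    r'chart \\?"(?P<chart>[^"\\]+)\\?" version \\?"(?P<version>[^"\\]+)\\?" '
    r'not found in (?P<repo>\S+) repository'
)


def missing_charts(log_lines):
    """Return the sorted, de-duplicated (chart, version, repo) triples reported missing."""
    found = set()
    for line in log_lines:
        for match in NOT_FOUND.finditer(line):
            found.add((match["chart"], match["version"], match["repo"]))
    return sorted(found)
```

Feeding it the repo-server log stream, for example `missing_charts(sys.stdin)`, lists the applications whose chart versions need to be bumped or restored.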

jmmclean commented 2 years ago

A little more debugging: I exec'd into the repo server and dug around the Helm dir. It looks like this dir just keeps filling up without any garbage cleanup.

$ pwd && ls -lhart
/helm-working-dir/repository
total 1007M
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:41 'FW8OjaFXtOCLNl1z9yWC3w26G+E=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:41 'FW8OjaFXtOCLNl1z9yWC3w26G+E=-charts.txt'
drwxrwxrwx 3 root   root     24 Jul 21 12:41  ..
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:42 'BToeLjCwKLdsJsKUeNkIPrBVHqc=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:42 'BToeLjCwKLdsJsKUeNkIPrBVHqc=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:42 '4IQrVJWPg3zRdVZvW6FuKtEoxP8=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:42 '4IQrVJWPg3zRdVZvW6FuKtEoxP8=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:43 'ZpZ1FFuoD44mz+Ugl5JVi723nwU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:43 'ZpZ1FFuoD44mz+Ugl5JVi723nwU=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:43 '0xtUDLmeU5-W-O7hMJSOC8vIyfc=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:43 '0xtUDLmeU5-W-O7hMJSOC8vIyfc=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:44 '6OGmDfcjTv5EqVbNNbxRyiFFAGo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:44 '6OGmDfcjTv5EqVbNNbxRyiFFAGo=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:44 'K-IXEcXwEwYNNV3X5w1oUKhxXwQ=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:44 'K-IXEcXwEwYNNV3X5w1oUKhxXwQ=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:44 'B69qt-yN222ajenhrghE+bmjD1Q=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:44 'B69qt-yN222ajenhrghE+bmjD1Q=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:45 '7JAk0Vt7YauzCM4JGQObrSPQOVM=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:45 '7JAk0Vt7YauzCM4JGQObrSPQOVM=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:45 '9GllqTfg64HKVUzGz2ncMdUA9FM=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:45 '9GllqTfg64HKVUzGz2ncMdUA9FM=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:45 'Iv87trB2WTnGsji25NbKOCQDJTg=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:45 'Iv87trB2WTnGsji25NbKOCQDJTg=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:46 'k90+GP87wjozfVBEfht4ON8zanU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:46 'k90+GP87wjozfVBEfht4ON8zanU=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:47 'pWFL4m0Mfb7CDcXOd8H3lmswFjQ=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:47 'pWFL4m0Mfb7CDcXOd8H3lmswFjQ=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:48 'EvImsOEwby5XxOijnvsLJh7-hGQ=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:48 'EvImsOEwby5XxOijnvsLJh7-hGQ=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:48 '7nTaDp5hSxlcqlrgRf79gjDBYUU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:48 '7nTaDp5hSxlcqlrgRf79gjDBYUU=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:48 'QBO1NmdirWnmAfwGI-lNFkkE90k=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:48 'QBO1NmdirWnmAfwGI-lNFkkE90k=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:48 'sfFaXaUklpjB4NzMdLrssEIj5EI=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:48 'sfFaXaUklpjB4NzMdLrssEIj5EI=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:49 'VsM1v5S8YB8GqzDQM64u9USi7lo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:49 'VsM1v5S8YB8GqzDQM64u9USi7lo=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:49 'D380GCTpwmG7F4VOwZX+sT2a5sM=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:49 'D380GCTpwmG7F4VOwZX+sT2a5sM=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:50 'ou3FJX8tars72IDa-LEKOZx5tUg=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:50 'ou3FJX8tars72IDa-LEKOZx5tUg=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:51 'b0t1pPIsfhdbbvuR+XvFbUN3Cjw=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:51 'b0t1pPIsfhdbbvuR+XvFbUN3Cjw=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:51 'P5Y6laEBMgZ98o7BNs8NJ36aMAU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:51 'P5Y6laEBMgZ98o7BNs8NJ36aMAU=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:52 'dfH3mXcA5DnZbK4a7uOhy+URGL8=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:52 'dfH3mXcA5DnZbK4a7uOhy+URGL8=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:52 'dMRzaDZNEF5DIxIjUPhIawzXjfU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:52 'dMRzaDZNEF5DIxIjUPhIawzXjfU=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:52 'X8rp9LV8xu3Hyp+TkbvyC2DDxLo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:52 'X8rp9LV8xu3Hyp+TkbvyC2DDxLo=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:53 '3okzF+YnNUS2M-rZtDfzf7SUeck=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:53 '3okzF+YnNUS2M-rZtDfzf7SUeck=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:53 '2fib8B7iIEHWucsArBcvVkWGyP4=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:53 '2fib8B7iIEHWucsArBcvVkWGyP4=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:54 '8k6Dmpe1MIAG8fWWQArPdU762GQ=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:54 '8k6Dmpe1MIAG8fWWQArPdU762GQ=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:54 '1vWnXdLh1JQ8LTnvH+4KsFGdDMw=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:54 '1vWnXdLh1JQ8LTnvH+4KsFGdDMw=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:55 'MsKdoSFhMptOFRYjND3bdfhyRz4=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:55 'MsKdoSFhMptOFRYjND3bdfhyRz4=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:56 'EFE1zoIFCAmbRAgVZbJmrswjEPA=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:56 'EFE1zoIFCAmbRAgVZbJmrswjEPA=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:56 'RwyS8KNewuNfs6Tu8BhXX084kSY=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:56 'RwyS8KNewuNfs6Tu8BhXX084kSY=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:56 'JPOvihl7LCA8wQ1rDgxK6NC5qKc=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:56 'JPOvihl7LCA8wQ1rDgxK6NC5qKc=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:58 'ntoQ9m427ll9aQeohxMIgqDs0bs=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:58 'ntoQ9m427ll9aQeohxMIgqDs0bs=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:59 '28L9fpQ6QGlykb5RpjaCyzJKqpo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:59 '28L9fpQ6QGlykb5RpjaCyzJKqpo=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 12:59 '0qwiLk992MLlGkHCo29yXGbD90g=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:59 '0qwiLk992MLlGkHCo29yXGbD90g=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:01 'ftcQ7LXl0ZPWUBd124U3Q9J+wRk=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:01 'ftcQ7LXl0ZPWUBd124U3Q9J+wRk=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:02 'lw16rkSBTwBnnMO1uG+OWF-vx5Y=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:02 'lw16rkSBTwBnnMO1uG+OWF-vx5Y=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:02 'CAK4nAsHyCB4BVoxRJjRYPgHzpY=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:02 'CAK4nAsHyCB4BVoxRJjRYPgHzpY=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:02 'x2-uIxTHkykD7epfSNTnUQg7+Ss=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:02 'x2-uIxTHkykD7epfSNTnUQg7+Ss=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:02 'RRtSazFxQNcuQvMRjaE8ySmd79c=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:02 'RRtSazFxQNcuQvMRjaE8ySmd79c=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:03 '7yXqQSDye7d59i+fQjYQoe8k+yk=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:03 '7yXqQSDye7d59i+fQjYQoe8k+yk=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:03 'x-dz4vA2487pOSV0hy8ahtVcBCo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:03 'x-dz4vA2487pOSV0hy8ahtVcBCo=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:04 'XiU2YhEmrL7gLBN+M9Z1ror9joA=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:04 'XiU2YhEmrL7gLBN+M9Z1ror9joA=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:05 '6NnE+gMVa79jQQs-J2FwMqzYF2E=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:05 '6NnE+gMVa79jQQs-J2FwMqzYF2E=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:05 'VBJ6U9WrKbstsRZUpUckvfJEokE=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:05 'VBJ6U9WrKbstsRZUpUckvfJEokE=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:05 'oWR2Hs3ASUUOzjyAnnk6kXk6iNY=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:05 'oWR2Hs3ASUUOzjyAnnk6kXk6iNY=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:05 'Yag1sKqkBSYtGtm1PqLyfL7PmEI=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:05 'Yag1sKqkBSYtGtm1PqLyfL7PmEI=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:06 'RHoK8cMZefv3RPc6vTArsswvybk=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:06 'RHoK8cMZefv3RPc6vTArsswvybk=-charts.txt'
-rw-r--r-- 1 argocd argocd  21M Jul 21 13:07 '9En7xQPnuHgSst4QTUOq9EazmBg=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:07 '9En7xQPnuHgSst4QTUOq9EazmBg=-charts.txt'
drwxr-xr-x 2 argocd argocd 8.0K Jul 21 13:07  .

crenshaw-dev commented 2 years ago

I'm not sure why I didn't notice it before, but these files are being cached by Helm and then not cleaned up. So maybe a Helm bug? Or maybe Helm intentionally keeps those files around.

A potential workaround would be to temporarily set the Helm working dir as some temp dir and then delete the temp dir after manifest generation. But then we might be missing out on some caching benefits of using a shared working dir.
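The temp-dir idea can be illustrated with a small sketch: create a throwaway cache directory, point the child process at it, and delete it when the command finishes. This is not Argo CD's implementation (Argo CD is written in Go). `HELM_CACHE_HOME` is one of Helm's path environment variables, but whether Argo CD would wire things up this way is an assumption, and `run_with_throwaway_cache` is a hypothetical name.

```python
import os
import subprocess
import tempfile


def run_with_throwaway_cache(command, cache_env_var="HELM_CACHE_HOME"):
    """Run a command with a fresh cache directory that is deleted afterwards.

    The directory is exported to the child process through `cache_env_var`.
    Returns the exit code and the (now removed) directory path.
    """
    # The context manager removes cache_dir on exit, even if the command fails.
    with tempfile.TemporaryDirectory(prefix="helm-cache-") as cache_dir:
        env = dict(os.environ, **{cache_env_var: cache_dir})
        result = subprocess.run(command, env=env, capture_output=True, text=True)
        return result.returncode, cache_dir
```

The trade-off is the one noted above: every invocation starts with an empty cache, so the repository index is downloaded again each time.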

Gonna dig into Helm code and see if it's intentionally not deleting these files.

crenshaw-dev commented 2 years ago

Found and tested the fix. https://github.com/helm/helm/pull/11172

Evidence from running the custom helm build in my repo server:

argocd@argocd-repo-server-7cbbc4494c-x27zs:~$ ls /helm-working-dir/repository/
'J99LUCjWiuC0eFfN66s5UWN-oa4=-charts.txt'  'J99LUCjWiuC0eFfN66s5UWN-oa4=-index.yaml'
argocd@argocd-repo-server-7cbbc4494c-x27zs:~$ ls /helm-working-dir/repository/
argocd@argocd-repo-server-7cbbc4494c-x27zs:~$ 

The file was created and then quickly deleted.

If this is urgent enough, we could build a Helm fork to bundle with Argo CD until they release the fix. Sounds like most folks have found workarounds, though?

jmmclean commented 2 years ago

I do not have a workaround, but I can wait until a fix is released :) OOM killing basically resolves the issue

crenshaw-dev commented 2 years ago

Sweet! You could also build a custom Argo CD image and copy in the custom Helm binary. That's a lot of work though for something that can kinda fix itself with a restart.
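For anyone who needs a stopgap lighter than a custom image, one option is to prune old cache files periodically. The sketch below deletes `-index.yaml` and `-charts.txt` files older than a threshold. It assumes that files past the threshold are no longer needed by a running `helm pull`, which fits the behaviour seen in this thread but is not confirmed against Helm's code; the function name and threshold are hypothetical.

```python
import time
from pathlib import Path

# File name suffixes seen in the directory listing earlier in the thread.
CACHE_SUFFIXES = ("-index.yaml", "-charts.txt")


def prune_stale_cache(directory, max_age_seconds, now=None):
    """Delete cache files older than max_age_seconds and return their names."""
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(directory).iterdir()):
        # Skip directories and anything that does not look like a Helm cache file.
        if not (path.is_file() and path.name.endswith(CACHE_SUFFIXES)):
            continue
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return removed
```

A call such as `prune_stale_cache("/helm-working-dir/repository", 600)` could run from a sidecar or a scheduled exec. It treats the symptom only, so new files will keep appearing until the missing chart versions are fixed or the Helm fix ships.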

jmmclean commented 2 years ago

agreed... I can wait :) y'all do really well with cutting releases for Argo, so I'm sure it won't take long! Thanks for digging in

also, i recognize you need your PR to be merged into helm repo, then you likely need to upgrade helm version in Argo (chore)