Closed vvoinea-gpsw closed 2 years ago
@vvoinea-gpsw there were some small changes around chart caching in recent 2.1.x versions. Are you able to upgrade to 2.1.12? I don't have much confidence that it will solve the problem, but I think for relatively little effort it's a good starting point.
@crenshaw-dev I just jumped ahead to 2.2.0, as I saw some cache changes in there too. I'm seeing the same caching, and it can't be controlled by any of these parameters: --default-cache-expiration duration, --repo-cache-expiration, --revision-cache-expiration
I don't understand why nobody else is seeing this? This behavior surely wasn't present on 2.0.0
@vvoinea-gpsw I think the caching changes I'm thinking of were later in the 2.2 series. If you're able to go to the latest patch, that would be good.
But not really hopeful for that change. I'll need to come back to this next week because I'll need to read the caching code to get a good feeling for how to reproduce the issue.
Thank you @crenshaw-dev for looking into this. I'll also be off next week, but I'm looking forward to seeing what you find. I will try to upgrade to the latest version and look for any rotation of the caches. One more detail: the YAML files being created every 3 minutes contain the FULL YAMLs of ALL the Helm charts in our private repo, and multiple files are created at each 3-minute interval.
Hi @crenshaw-dev, I upgraded to the latest release, 2.3.3, where I still see a large number of <some-hash>-index.yaml and <same-hash>-charts.txt files being stored every 3 minutes. The odd thing is that at each interval we get multiple files for each extension, and the files with a given extension all have the same content (checked MD5 hashes).
Going to monitor it for a few days to see if the variables actually take effect and rotate these cached files.
Please let me know if you had time to replicate this issue
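For anyone who wants to repeat the MD5 comparison described above, here is a minimal sketch. It assumes you can run Python against a copy of /helm-working-dir/repository; the function names `group_by_md5` and `duplicate_sets` are made up for illustration and are not part of Argo CD or Helm.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_md5(directory, pattern="*-index.yaml"):
    """Map each MD5 digest to the sorted names of files sharing that content."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue
        digest = hashlib.md5()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large index files are not loaded whole.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        groups[digest.hexdigest()].append(path.name)
    return dict(groups)


def duplicate_sets(directory, pattern="*-index.yaml"):
    """Return only the groups with more than one file, i.e. real duplicates."""
    groups = group_by_md5(directory, pattern)
    return [names for names in groups.values() if len(names) > 1]
```

Calling `duplicate_sets(path)` and then `duplicate_sets(path, "*-charts.txt")` covers both extensions. If every file of one extension lands in a single group, the cache is storing the same index over and over.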
Hi, we experienced this issue last week. In our case, we were replacing an expired private Git token and had to restart the repo server instances. After the restart, the emptyDir volume started filling up, with new <hash>-charts.txt and <hash>-index.yaml files appearing every ~3 min.
We noticed that reposerver had error logs about some helm charts not being found (specifically redis, external-dns, etc. from bitnami helm repositories). We were depending on old chart versions which seem to have already been removed from the (updated) index in the repos, thus reposerver wasn't able to find them (while they were actually deployed by Argo in the cluster already).
After bumping the applications to available helm chart versions, we stopped seeing these errors and the files stopped appearing / volume stopped getting filled excessively.
So it seems to be limited to when the Helm chart no longer exists. Sounds like there's some cleanup logic that gets missed in that case.
Please 👍 this issue if it's affecting you. I don't have time to investigate at the moment, but more thumbs up will help the issue get attention. :-)
Encountered the same problem with a chart removed by Bitnami. See issue #9665, where I detailed the behaviour of the bug on my end.
I believe I am hitting this issue as well. As this dir fills up, my memory usage gets really high too. Doing a rolling restart of the argocd-repo-server resolves the issue. Also, resolving the Helm chart issues that cause an Unknown status mitigates the symptoms.
thanks to @crenshaw-dev for all your help along the way!!!
CNCF link https://cloud-native.slack.com/archives/C01TSERG0KZ/p1655221266746929
running on v2.3.3
@crenshaw-dev this really needs some TLC. It occurs when at least one Helm app is in an Unknown state. My group is self-service, so this happens from time to time, especially as people learn. Below is my memory usage... I'm having to do a kubectl rollout restart deployment argocd-repo-server to make the memory usage happy again.
@jmmclean yikes. I'm still short on time. If anyone has repo-server logs mentioning "LOG_LEVEL
set to debug. :-)
Hello, we had the same issue with our redis. We changed the repo to "https://raw.githubusercontent.com/bitnami/charts/pre-2022/bitnami", which temporarily resolved the issue. We now plan to update the Helm chart.
@crenshaw-dev I enabled debug and got the logs below (they didn't seem too helpful, but I will let you decide that).
Command to acquire logs:
stern argocd-repo | grep -v "with code OK" | grep -v "manifest cache hit" | grep -v "' resolved to"
Logs (scrubbed of company PII):
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:48:59Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://charts.ops.mycompany.com,Path:,TargetRevision:911,Helm:&ApplicationSourceHelm{ValueFiles:[dev.yaml],Parameters:[]HelmParameter{HelmParameter{Name:global.container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},HelmParameter{Name:container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:af-execution-api,}/911"
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:48:59Z" level=info msg="helm pull --destination /tmp/43235af3-354b-437e-b356-1706cf3898ce --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api" dir= execID=c25b5
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:07Z" level=debug duration=8.173533236s execID=c25b5
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:07Z" level=error msg="`helm pull --destination /tmp/43235af3-354b-437e-b356-1706cf3898ce --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" execID=c25b5
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:07Z" level=info msg=Trace args="[helm pull --destination /tmp/43235af3-354b-437e-b356-1706cf3898ce --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api]" dir= operation_name="exec helm" time_ms=8173.745909000001
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:07Z" level=error msg="finished unary call with code Unknown" error="`helm pull --destination /tmp/43235af3-354b-437e-b356-1706cf3898ce --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" grpc.code=Unknown grpc.method=GenerateManifest grpc.service=repository.RepoServerService grpc.start_time="2022-07-21T12:48:59Z" grpc.time_ms=8175.766 span.kind=server system=grpc
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:34Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://charts.ops.mycompany.com,Path:,TargetRevision:911,Helm:&ApplicationSourceHelm{ValueFiles:[dev.yaml],Parameters:[]HelmParameter{HelmParameter{Name:global.container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},HelmParameter{Name:container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:af-execution-api,}/911"
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:34Z" level=info msg="helm pull --destination /tmp/18d17f44-e95c-4590-b4c8-7f4c2fd498e0 --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api" dir= execID=f1f5d
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:42Z" level=debug duration=8.068494871s execID=f1f5d
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:42Z" level=error msg="`helm pull --destination /tmp/18d17f44-e95c-4590-b4c8-7f4c2fd498e0 --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" execID=f1f5d
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:42Z" level=info msg=Trace args="[helm pull --destination /tmp/18d17f44-e95c-4590-b4c8-7f4c2fd498e0 --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api]" dir= operation_name="exec helm" time_ms=8068.727577000001
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:49:42Z" level=error msg="finished unary call with code Unknown" error="`helm pull --destination /tmp/18d17f44-e95c-4590-b4c8-7f4c2fd498e0 --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" grpc.code=Unknown grpc.method=GenerateManifest grpc.service=repository.RepoServerService grpc.start_time="2022-07-21T12:49:34Z" grpc.time_ms=8070.155 span.kind=server system=grpc
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:49Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://charts.ops.mycompany.com,Path:,TargetRevision:911,Helm:&ApplicationSourceHelm{ValueFiles:[dev.yaml],Parameters:[]HelmParameter{HelmParameter{Name:global.container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},HelmParameter{Name:container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:af-execution-api,}/911"
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:49Z" level=info msg="helm pull --destination /tmp/f8333e7b-23a9-4ecf-a791-1d2a9401498f --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api" dir= execID=a5633
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:58Z" level=debug duration=8.903038459s execID=a5633
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:58Z" level=error msg="`helm pull --destination /tmp/f8333e7b-23a9-4ecf-a791-1d2a9401498f --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" execID=a5633
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:58Z" level=info msg=Trace args="[helm pull --destination /tmp/f8333e7b-23a9-4ecf-a791-1d2a9401498f --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api]" dir= operation_name="exec helm" time_ms=8903.253155
argocd-repo-server-86bd7fdcd4-vtlq7 argocd-repo-server time="2022-07-21T12:49:58Z" level=error msg="finished unary call with code Unknown" error="`helm pull --destination /tmp/f8333e7b-23a9-4ecf-a791-1d2a9401498f --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" grpc.code=Unknown grpc.method=GenerateManifest grpc.service=repository.RepoServerService grpc.start_time="2022-07-21T12:49:49Z" grpc.time_ms=8905.308 span.kind=server system=grpc
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:04Z" level=info msg="manifest cache miss: &ApplicationSource{RepoURL:https://charts.ops.mycompany.com,Path:,TargetRevision:911,Helm:&ApplicationSourceHelm{ValueFiles:[dev.yaml],Parameters:[]HelmParameter{HelmParameter{Name:global.container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},HelmParameter{Name:container.repository,Value:01234567890.dkr.ecr.us-east-1.amazonaws.com/mycompany/af-execution-api:911,ForceString:false,},},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:af-execution-api,}/911"
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:04Z" level=info msg="helm pull --destination /tmp/9380dea1-9251-44de-9d48-09b122597aed --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api" dir= execID=b5546
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:12Z" level=debug duration=7.858755533s execID=b5546
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:12Z" level=error msg="`helm pull --destination /tmp/9380dea1-9251-44de-9d48-09b122597aed --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api` failed exit status 1: Error: chart \"af-execution-api\" version \"911\" not found in https://charts.ops.mycompany.com repository" execID=b5546
argocd-repo-server-86bd7fdcd4-cvkmz argocd-repo-server time="2022-07-21T12:50:12Z" level=info msg=Trace args="[helm pull --destination /tmp/9380dea1-9251-44de-9d48-09b122597aed --version 911 --username ****** --password ****** --repo https://charts.ops.mycompany.com af-execution-api]" dir= operation_name="exec helm" time_ms=7859.018099
Note: I had to do a rolling restart for the log level to take effect, so the memory issue is temporarily mitigated, however it will climb since there is a helm chart in an Unknown status
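Since the leak seems tied to charts that can no longer be pulled, it may help to list exactly which chart versions the repo server is failing on. The sketch below pulls them out of log lines shaped like the ones above. The regex is written against this log excerpt only, so other Helm or Argo CD versions may word the error differently, and `missing_charts` is a hypothetical helper name.

```python
import re

# Matches Helm's "chart ... version ... not found in <repo> repository" error,
# tolerating the backslash-escaped quotes that appear inside logfmt msg fields.
MISSING_CHART = re.compile(
    r'chart \\?"(?P<chart>[^"\\]+)\\?" version \\?"(?P<version>[^"\\]+)\\?" '
    r'not found in (?P<repo>\S+) repository'
)


def missing_charts(log_lines):
    """Return the sorted, de-duplicated (chart, version, repo) triples found."""
    found = set()
    for line in log_lines:
        for match in MISSING_CHART.finditer(line):
            found.add((match["chart"], match["version"], match["repo"]))
    return sorted(found)
```

Passing `sys.stdin` as `log_lines` lets you pipe repo-server logs straight in. Each triple it returns is an application source worth bumping to a chart version that still exists in the repo index.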
A little more debugging: I exec'd into the repo server and dug around the Helm dir. It looks like this dir just keeps filling up without any garbage cleanup.
$ pwd && ls -lhart
/helm-working-dir/repository
total 1007M
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:41 'FW8OjaFXtOCLNl1z9yWC3w26G+E=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:41 'FW8OjaFXtOCLNl1z9yWC3w26G+E=-charts.txt'
drwxrwxrwx 3 root root 24 Jul 21 12:41 ..
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:42 'BToeLjCwKLdsJsKUeNkIPrBVHqc=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:42 'BToeLjCwKLdsJsKUeNkIPrBVHqc=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:42 '4IQrVJWPg3zRdVZvW6FuKtEoxP8=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:42 '4IQrVJWPg3zRdVZvW6FuKtEoxP8=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:43 'ZpZ1FFuoD44mz+Ugl5JVi723nwU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:43 'ZpZ1FFuoD44mz+Ugl5JVi723nwU=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:43 '0xtUDLmeU5-W-O7hMJSOC8vIyfc=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:43 '0xtUDLmeU5-W-O7hMJSOC8vIyfc=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:44 '6OGmDfcjTv5EqVbNNbxRyiFFAGo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:44 '6OGmDfcjTv5EqVbNNbxRyiFFAGo=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:44 'K-IXEcXwEwYNNV3X5w1oUKhxXwQ=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:44 'K-IXEcXwEwYNNV3X5w1oUKhxXwQ=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:44 'B69qt-yN222ajenhrghE+bmjD1Q=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:44 'B69qt-yN222ajenhrghE+bmjD1Q=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:45 '7JAk0Vt7YauzCM4JGQObrSPQOVM=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:45 '7JAk0Vt7YauzCM4JGQObrSPQOVM=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:45 '9GllqTfg64HKVUzGz2ncMdUA9FM=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:45 '9GllqTfg64HKVUzGz2ncMdUA9FM=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:45 'Iv87trB2WTnGsji25NbKOCQDJTg=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:45 'Iv87trB2WTnGsji25NbKOCQDJTg=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:46 'k90+GP87wjozfVBEfht4ON8zanU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:46 'k90+GP87wjozfVBEfht4ON8zanU=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:47 'pWFL4m0Mfb7CDcXOd8H3lmswFjQ=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:47 'pWFL4m0Mfb7CDcXOd8H3lmswFjQ=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:48 'EvImsOEwby5XxOijnvsLJh7-hGQ=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:48 'EvImsOEwby5XxOijnvsLJh7-hGQ=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:48 '7nTaDp5hSxlcqlrgRf79gjDBYUU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:48 '7nTaDp5hSxlcqlrgRf79gjDBYUU=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:48 'QBO1NmdirWnmAfwGI-lNFkkE90k=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:48 'QBO1NmdirWnmAfwGI-lNFkkE90k=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:48 'sfFaXaUklpjB4NzMdLrssEIj5EI=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:48 'sfFaXaUklpjB4NzMdLrssEIj5EI=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:49 'VsM1v5S8YB8GqzDQM64u9USi7lo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:49 'VsM1v5S8YB8GqzDQM64u9USi7lo=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:49 'D380GCTpwmG7F4VOwZX+sT2a5sM=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:49 'D380GCTpwmG7F4VOwZX+sT2a5sM=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:50 'ou3FJX8tars72IDa-LEKOZx5tUg=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:50 'ou3FJX8tars72IDa-LEKOZx5tUg=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:51 'b0t1pPIsfhdbbvuR+XvFbUN3Cjw=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:51 'b0t1pPIsfhdbbvuR+XvFbUN3Cjw=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:51 'P5Y6laEBMgZ98o7BNs8NJ36aMAU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:51 'P5Y6laEBMgZ98o7BNs8NJ36aMAU=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:52 'dfH3mXcA5DnZbK4a7uOhy+URGL8=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:52 'dfH3mXcA5DnZbK4a7uOhy+URGL8=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:52 'dMRzaDZNEF5DIxIjUPhIawzXjfU=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:52 'dMRzaDZNEF5DIxIjUPhIawzXjfU=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:52 'X8rp9LV8xu3Hyp+TkbvyC2DDxLo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:52 'X8rp9LV8xu3Hyp+TkbvyC2DDxLo=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:53 '3okzF+YnNUS2M-rZtDfzf7SUeck=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:53 '3okzF+YnNUS2M-rZtDfzf7SUeck=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:53 '2fib8B7iIEHWucsArBcvVkWGyP4=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:53 '2fib8B7iIEHWucsArBcvVkWGyP4=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:54 '8k6Dmpe1MIAG8fWWQArPdU762GQ=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:54 '8k6Dmpe1MIAG8fWWQArPdU762GQ=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:54 '1vWnXdLh1JQ8LTnvH+4KsFGdDMw=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:54 '1vWnXdLh1JQ8LTnvH+4KsFGdDMw=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:55 'MsKdoSFhMptOFRYjND3bdfhyRz4=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:55 'MsKdoSFhMptOFRYjND3bdfhyRz4=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:56 'EFE1zoIFCAmbRAgVZbJmrswjEPA=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:56 'EFE1zoIFCAmbRAgVZbJmrswjEPA=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:56 'RwyS8KNewuNfs6Tu8BhXX084kSY=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:56 'RwyS8KNewuNfs6Tu8BhXX084kSY=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:56 'JPOvihl7LCA8wQ1rDgxK6NC5qKc=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:56 'JPOvihl7LCA8wQ1rDgxK6NC5qKc=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:58 'ntoQ9m427ll9aQeohxMIgqDs0bs=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:58 'ntoQ9m427ll9aQeohxMIgqDs0bs=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:59 '28L9fpQ6QGlykb5RpjaCyzJKqpo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:59 '28L9fpQ6QGlykb5RpjaCyzJKqpo=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 12:59 '0qwiLk992MLlGkHCo29yXGbD90g=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 12:59 '0qwiLk992MLlGkHCo29yXGbD90g=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:01 'ftcQ7LXl0ZPWUBd124U3Q9J+wRk=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:01 'ftcQ7LXl0ZPWUBd124U3Q9J+wRk=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:02 'lw16rkSBTwBnnMO1uG+OWF-vx5Y=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:02 'lw16rkSBTwBnnMO1uG+OWF-vx5Y=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:02 'CAK4nAsHyCB4BVoxRJjRYPgHzpY=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:02 'CAK4nAsHyCB4BVoxRJjRYPgHzpY=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:02 'x2-uIxTHkykD7epfSNTnUQg7+Ss=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:02 'x2-uIxTHkykD7epfSNTnUQg7+Ss=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:02 'RRtSazFxQNcuQvMRjaE8ySmd79c=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:02 'RRtSazFxQNcuQvMRjaE8ySmd79c=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:03 '7yXqQSDye7d59i+fQjYQoe8k+yk=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:03 '7yXqQSDye7d59i+fQjYQoe8k+yk=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:03 'x-dz4vA2487pOSV0hy8ahtVcBCo=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:03 'x-dz4vA2487pOSV0hy8ahtVcBCo=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:04 'XiU2YhEmrL7gLBN+M9Z1ror9joA=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:04 'XiU2YhEmrL7gLBN+M9Z1ror9joA=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:05 '6NnE+gMVa79jQQs-J2FwMqzYF2E=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:05 '6NnE+gMVa79jQQs-J2FwMqzYF2E=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:05 'VBJ6U9WrKbstsRZUpUckvfJEokE=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:05 'VBJ6U9WrKbstsRZUpUckvfJEokE=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:05 'oWR2Hs3ASUUOzjyAnnk6kXk6iNY=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:05 'oWR2Hs3ASUUOzjyAnnk6kXk6iNY=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:05 'Yag1sKqkBSYtGtm1PqLyfL7PmEI=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:05 'Yag1sKqkBSYtGtm1PqLyfL7PmEI=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:06 'RHoK8cMZefv3RPc6vTArsswvybk=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:06 'RHoK8cMZefv3RPc6vTArsswvybk=-charts.txt'
-rw-r--r-- 1 argocd argocd 21M Jul 21 13:07 '9En7xQPnuHgSst4QTUOq9EazmBg=-index.yaml'
-rw-r--r-- 1 argocd argocd 3.8K Jul 21 13:07 '9En7xQPnuHgSst4QTUOq9EazmBg=-charts.txt'
drwxr-xr-x 2 argocd argocd 8.0K Jul 21 13:07 .
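Until the leak itself is fixed, one stopgap for a listing like the one above would be to prune old cache files by age. This is only a sketch and not an Argo CD feature: the suffixes come from the listing, while `prune_stale_cache`, the 10-minute default, and the assumption that Helm will not need a file that old are all mine. It defaults to a dry run because deleting a file that an in-flight `helm pull` is still reading could make that pull fail.

```python
import time
from pathlib import Path

# File name endings seen in the listing above.
CACHE_SUFFIXES = ("-index.yaml", "-charts.txt")


def prune_stale_cache(directory, max_age_seconds=600, now=None, dry_run=True):
    """Delete (or, with dry_run, just list) cache files older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for path in sorted(Path(directory).iterdir()):
        # Skip directories and anything that is not a Helm repo cache file.
        if not path.is_file() or not path.name.endswith(CACHE_SUFFIXES):
            continue
        if now - path.stat().st_mtime > max_age_seconds:
            stale.append(path.name)
            if not dry_run:
                path.unlink()
    return stale
```

Run it once with the default `dry_run=True` to see what would go, then with `dry_run=False` to actually delete.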
I'm not sure why I didn't notice it before, but these files are being cached by Helm and then not cleaned up. So maybe a Helm bug? Or maybe Helm intentionally keeps those files around.
A potential workaround would be to temporarily set the Helm working dir as some temp dir and then delete the temp dir after manifest generation. But then we might be missing out on some caching benefits of using a shared working dir.
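A rough sketch of that temp-dir idea, to make the trade-off concrete. It assumes Helm 3 honors the `HELM_REPOSITORY_CACHE` environment variable for the directory where it writes the cached index files. It is not how Argo CD actually invokes Helm, and `run_with_temp_repo_cache` is a made-up name.

```python
import os
import subprocess
import tempfile


def run_with_temp_repo_cache(command):
    """Run command with HELM_REPOSITORY_CACHE pointed at a throwaway directory.

    Returns (exit_code, cache_dir); cache_dir no longer exists on return.
    """
    with tempfile.TemporaryDirectory(prefix="helm-repo-cache-") as cache_dir:
        # Copy the current environment and override only the cache location.
        env = dict(os.environ, HELM_REPOSITORY_CACHE=cache_dir)
        completed = subprocess.run(command, env=env, check=False)
        exit_code = completed.returncode
    # Leaving the with-block removed cache_dir and everything written into it.
    return exit_code, cache_dir
```

Every call starts with an empty cache, so the index is downloaded again each time. That is exactly the caching benefit a shared working dir would otherwise provide.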
Gonna dig into Helm code and see if it's intentionally not deleting these files.
Found and tested the fix. https://github.com/helm/helm/pull/11172
Evidence from running the custom helm build in my repo server:
argocd@argocd-repo-server-7cbbc4494c-x27zs:~$ ls /helm-working-dir/repository/
'J99LUCjWiuC0eFfN66s5UWN-oa4=-charts.txt' 'J99LUCjWiuC0eFfN66s5UWN-oa4=-index.yaml'
argocd@argocd-repo-server-7cbbc4494c-x27zs:~$ ls /helm-working-dir/repository/
argocd@argocd-repo-server-7cbbc4494c-x27zs:~$
The file was created and then quickly deleted.
If this is urgent enough, we could build a Helm fork to bundle with Argo CD until they release the fix. Sounds like most folks have found workarounds though?
I do not have a workaround, but i can wait til a fix is released :) OOM killing basically resolves the issue
Sweet! You could also build a custom Argo CD image and copy in the custom Helm binary. That's a lot of work though for something that can kinda fix itself with a restart.
agreed....i can wait :) yall do really well with cutting release for Argo, so im sure it won't take long! Thanks for digging in
also, i recognize you need your PR to be merged into helm repo, then you likely need to upgrade helm version in Argo (chore)
Describe the bug
After upgrading from Argo CD 1.8 to 2.1.8, we are seeing the argocd-repo-server pods fill the emptyDir volume helm-working-dir with 100GB in 2 days. The node disk fills up (emptyDir maps to host disk OR RAM) and pods get evicted. We tried to limit the cache using the obvious parameters:
reposerver.default.cache.expiration=1h
or
reposerver.repo.cache.expiration=2h
But neither limits the creation of new <some-hash>charts.yaml and <some-hash>index.yaml files in /helm-working-dir/repository on the pod every 3 minutes. Since emptyDir can also map to RAM, we have also seen higher memory usage, similar to https://github.com/argoproj/argo-cd/issues/8698
To Reproduce
Install 2.1.8 and monitor disk and memory usage
Expected behavior
There would be some parameter to limit the rotation of these cached charts. Also, if the volume is known to fill up this quickly (due to a perfect storm of a large chart repo and a small node disk), could we limit the volume size by changing the volume setup out of the box?