argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0
17.88k stars 5.45k forks source link

Reduce load on private registry by keeping charts tgz on 'git clean' #16725

Open raz-amir opened 10 months ago

raz-amir commented 10 months ago

Summary

What change you think needs making.

The number of dependency charts pulls can be significantly reduced, by avoiding the cleaning of the charts folder when the git repo is refreshed, thus, avoiding the need to call helm dependency build over and over if the dependency chart version doesn't change.

Motivation

Please give examples of your use case, e.g. when would you use this.

We have an argocd server with about 200 applications, and almost all apps depend on the same version of one of our charts hosted on a private OCI registry. The charts of these apps are all located on the same git repository. We see numerous pulls of the dependency chart when the git repo advances, causing load on the private registry, and sometimes we get transient Unknown status that fixes on its own on the next reconciliation cycle. Looking at the logs we see multiple helm dependency build operations, each pulling the dependency chart per app. Most of those apps depend on the same version of the base chart. I believe that in our case, and hopefully can be useful for others, this number can be significantly reduced by avoiding the cleaning of the*.tgz files from the charts/ folders. I am referring to the git clean command that takes place when the branch refreshes.

Proposal

How do you think this should be implemented?

I would like to suggest an optional flag that will add a flag to the current git clean -ffdx command: -e **/charts/**/*.tgz. Something like --keep-build-dependency-charts. The full command will look something like this: git clean -ffdx -e **/charts/**/*.tgz.

Edit: Also, run helm dependency list to decide if helm dependency build should be executed before helm template since helm template doesn't check if the dependency versions in Chart.yaml match the files in charts/ folder.

If this suggestion is acceptable, I can submit a PR for that.

crenshaw-dev commented 10 months ago

Wouldn't the repo-server eventually fill up with old chart tgzs if we never clean them up?

raz-amir commented 10 months ago

@crenshaw-dev, thank you for looking into this. I believe its not an issue, because if the dependency charts versions do change, then helm dependency build will run and clean the unneeded tgz files.

crenshaw-dev commented 10 months ago

That makes sense... I'd want to make sure we have a test in place to confirm that behavior, but conceptually this sounds good to me!

raz-amir commented 10 months ago

Great! Just to makes sure I got you right: are you going to check if there are existing tests that cover that and update with your findings? Or are you expecting to see a test in the new PR that covers that?

crenshaw-dev commented 10 months ago

I would expect a new test, unless you encounter one that already exists. It's also fine if we end up double-testing something. :-)

raz-amir commented 10 months ago

Got you :)

raz-amir commented 10 months ago

Hi @crenshaw-dev, I have opened https://github.com/argoproj/argo-cd/pull/16750 to implement this feature. Are you going to review it? Here are some notes about the PR related to our discussion here:

  1. I have added a test to verify that helm dependency build deletes unneeded dependency chart version here https://github.com/ramir-savvy/argo-cd/blob/5246b404ecfed2e0523f26de8f9b57e50b2d7541/util/git/client_test.go#L330 (under check that the previous version basechart tgz file does not exist in the parentchart charts directory comment).
  2. It appears that helm template doesn't behave as I originally thought: I thought it fails if the Chart.yaml points to a different version than the one located under charts/ but it isn't the case. It succeeds regardless of the version. It can be higher or lower and it doesn't matter as long there is some version under charts/
  3. Because of the above, this new feature of not cleaning the dependency charts tgz cannot live on its own. Because today's code flow, executes helm template first, and only if it fails, it will run the helm dependency build to update the dependency charts. I guess this is useful for performance in case the chart doesn't have any dependencies. But this cannot work with this new feature, because old dependencies kept, will cause helm template to wrongly succeed.
  4. Therefore, I have added a call to helm package list, and I am parsing its output (I couldn't find a more programmatic way to do so, and as I saw that helm output parsing is done for other commands as well). Based on that command, decide if need to run helm dependency build before running helm template. This new flow will take place only if this new feature is enabled, otherwise the previous flow runs.

Thank you in advance

blakepettersson commented 9 months ago

@ramir-savvy In regards to

  1. It appears that helm template doesn't behave as I originally thought: I thought it fails if the Chart.yaml points to a different version than the one located under charts/ but it isn't the case. It succeeds regardless of the version. It can be higher or lower and it doesn't matter as long there is some version under charts/

Does this behaviour change if there's an accompanying Chart.lock in the repository? (This likely can only currently be tested with a Helm chart in Git)

raz-amir commented 9 months ago

Hi @blakepettersson, I tested different scenarios that I will list below with Chart.lock and the bottom line is that helm template ignores that file and it only cares if there is a dependency with that name under charts/ regardless of its version.

Scenarios:

  1. I have a chart with a dependency in another chart in version 1.0.0. helm template doesn't run with this error:
    Error: An error occurred while checking for chart dependencies. You may need to run `helm dependency build` to fetch missing dependencies: found in Chart.yaml, but missing in charts/ directory
  2. I run helm dependency build and now helm template runs successfully as expected.
  3. Deleting the Chart.lock doesn't impact helm template behavior (I restore the Chart.lock file).
  4. I change the dependency version in Chart.yaml to 2.0.0 and helm template runs successfully, unfortunately, although helm dependency list shows wrong version.
  5. I replace the packed tgz chart under charts/ with version 2.0.0 (while Chart.lock is at 1.0.0 and Chart.yaml at 2.0.0) and helm template runs successfully, unfortunately.
  6. Mind that when Chart.lock isn't in sync with Chart.yaml, the helm dependency build command (which is the command used by argo-cd) fails with the error below, and instead need to run helm dependency update (which isn't part of the argo-cd flow today):
    Error: the lock file (Chart.lock) is out of sync with the dependencies file (Chart.yaml). Please update the dependencies

I wrote my conclusion at the beginning of this comment. Let me know if I should run additional tests.