Open IvanovOleg opened 2 years ago
Are there a lot of repos in the organization. The SCM generator loops over repos to list branches.
@crenshaw-dev A lot. It looks like SCM retrieves everything and then filters which makes it extremely ineffective
It should apply your repositoryMatch
filter before listing branches, which should help some.
After listing and filtering repos by name, it'll loop over the remaining repos and list branches.
After listing branches, it will filter by the branchMatch
filter.
Then for each remaining branch, it'll make an API call to filter by pathsExist
.
The regex filters are your friends. They should help minimize the API calls by a lot.
@crenshaw-dev I my example, I select only one repository and a single branch, but it drains all 5000 requests in less then 10 minutes.
Oh wait. I just noticed the error message is about the Pull Request Generator? Are you sure the SCM generator is the issue? It shouldn't be hitting the PR API.
@crenshaw-dev I guess I copied from the wrong app. Anyway I tried to remove all other appsets and apps and leave just SCM. It still drains all request. And when all rate limit is exceeded everything stops working.
If I understand correctly, you have it filtered down to one repo.
How many branches match the branch pattern for that one repo?
@crenshaw-dev that repo contains main branch and deploy branch. That's all. It's a POC, not a real app
Wow. I'd have to dig into the code to get a good sense what's going wrong here. The generator code needs more debug logging. There's really no way to debug this without seeing what calls are being made.
I did a bit of debugging , but some how I don't see much of calls or rate limit being although my org had only 2 repo and 4 branches.
I think Key to debug it successfully would be to have a large organization, @IvanovOleg : Can u share size of repo and number of branches , we could replicate and add filter
As in code I see that it filters once all repo's are listed
@rishabh625 ~1000 repositories with at least 10 branches in each
Any updates?
Any updates?
Not yet @IvanovOleg , will post you soon
Perhaps it would help if ApplicationSet SCM generator used the existing repo secrets so it could pick up the GitHub App secret?
Instead of having to reference a separate token secret.
The org app could be set to allow read for all repos and have higher rate limits: https://docs.github.com/en/developers/apps/building-github-apps/rate-limits-for-github-apps#default-server-to-server-rate-limits-for-githubcom
See where the code already has GitHubApp Auth: https://github.com/argoproj/argo-cd/search?q=GithubAppInstallationId
@jetersen I think that might be what this PR does: https://github.com/argoproj/argo-cd/pull/10092
Hoping to review it today.
@crenshaw-dev thanks for cross linking, I was searching for issues and PRs but found this one first and dropped my comment here 😅
@jetersen Perhaps it would help if ApplicationSet SCM generator used the existing repo secrets so it could pick up the GitHub App secret?
I think it would help in terms that using GitHub App secret may allow to make more api calls compared to PAT, although imo we're not fixing the root issue which is that we're overusing api calls when we have scmProvider
regexed into a repo + branch.
I'm having the same issue with Bitbucket Cloud SCM:
- scmProvider:
bitbucket:
# The workspace id (slug).
owner: "my-bitbucket-org"
# The user to use for basic authentication with an app password.
user: "my-bot-user"
# If true, scan every branch of every repository. If false, scan only the main branch. Defaults to false.
allBranches: false
# Reference to a Secret containing an app password.
appPasswordRef:
secretName: argocd-repo-creds-bitbucket
key: password
We have around 100 repos with few branches in each and application-set controller gets 429 :( In my case filtering by repo name is not an option. Shouldn't app set controller have some parameter that controls the number of calls per second?
Maybe it would be a good idea to allow changing query parameters. For example github allows filtering on topic in the api request. This would drastically reduce the amount of calls for us.
Proving to be serious issue for us. We'd like to use branchMatch
...
We seem to have the same problem. Are there any news on this?
I'm also facing the problem. My org has around 800 repositories.
GET https://api.github.com/orgs/<my org>/repos?per_page=100: 403 API rate limit exceeded for installation ID <ID>
Allowing adding the filtered topics to the query would help in short term until the organization has many repositories that must be discovered using the topic (argocd-discoverable
, for example). Would be possible to create some delay between the iterations? Seems that there's no waiting between the calls made to GitHub.
Another possibility would be drop the branchMatch. I think the idea of branch belongs to Git and not to the SCM, since they don't provide an API that allow us to filter repositories that contains a list of branches. Maybe this could be achieved by using a matrix of SCM + Git generators.
So I ran into this problem and decided to dig into the code. I believe the issue is that there is no precedence and logical operation (AND) to the rules given in a filter.
I too had a similar filter:
filters:
- repositoryMatch: ^my-repo
pathsExist: [kustomize/overlays/development]
branchMatch: ^deploy
Looking at the code I determined that it evaluates each of the conditions individually. It doesn't look for repos that match ^my-repo
and then further filter by pathsExist
and branchMatch
.
If I removed the pathsExist
and branchMatch
I did not experience 429s.
@cyrus-mc I believe this may be the fix: https://github.com/argoproj/argo-cd/pull/14465
I'm facing the same issue. Can't we just control the polling interval and set something more than 60s ? I tried multiple option and looked into the code but didn't find any parameter that allows doing that.
Checklist:
argocd version
.Describe the bug I have an ApplicationSet with SCM generator configured to poll GitHub repositories. In my simple case I just poll one repo branch. This simple configuration reaches the GitHub API Request limit of 5000 requests in 10 minutes. This fact makes SCM generator completely useless.
To Reproduce
Expected behavior Modest API request usage
Screenshots
Version
Logs