Open frantjc opened 2 years ago
@frantjc the ApplicationSet controller uses the GHE API to populate the output of the SCM Provider generator. Do you have samples of what the GHE API returns for those API calls when GHE is in maintenance mode?
My hope is that it would return a non-200 response code, and the ApplicationSet controller would refuse to proceed with reconciliation. But it sounds like either a 200 is returned, or the ApplicationSet controller doesn't check the response code.
Hi @crenshaw-dev! GitHub Enterprise appears to be properly reporting its "error" state when in maintenance mode. Below are responses from the "list repositories for organization" endpoint (as GitHub's official npm module @octokit/rest refers to it), as I believe that is the important one in this case. I've modified them slightly to omit info about company, auth, etc.
Normal:
{
"data": ["I removed the repository objects here for brevity but there are up to 100 repositories here depending on the number of repositories in the organization"],
"status": 200,
"url": "https://github.mycorp.com/api/v3/orgs/myorg/repos",
"headers": {
"access-control-allow-origin": "*",
"access-control-expose-headers": "ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset",
"cache-control": "private, max-age=60, s-maxage=60",
"content-encoding": "gzip",
"content-security-policy": "default-src 'none'",
"content-type": "application/json; charset=utf-8",
"date": "Tue, 29 Nov 2022 17:01:02 GMT",
"etag": "W/\"2b6804769b3d2d550ccfb9664bb8f2a50049932853940a40c4aa18ca097d12c3\"",
"link": "<https://github.mycorp.com/api/v3/organizations/39/repos?page=2>; rel=\"next\", <https://github.mycorp.com/api/v3/organizations/39/repos?page=29>; rel=\"last\"",
"referrer-policy": "origin-when-cross-origin, strict-origin-when-cross-origin",
"server": "GitHub.com",
"strict-transport-security": "max-age=31536000; includeSubdomains",
"transfer-encoding": "chunked",
"vary": "Accept, Authorization, Cookie, X-GitHub-OTP",
"x-accepted-oauth-scopes": "",
"x-content-type-options": "nosniff",
"x-frame-options": "deny",
"x-github-enterprise-version": "3.7.0",
"x-github-media-type": "github.v3; format=json",
"x-github-request-id": "a1c1ce0c-6e87-4aa3-8e00-36c077555ef1",
"x-runtime-rack": "0.477439",
"x-xss-protection": "0"
}
}
Maintenance:
{
"data": "I removed HTML from here, can paste screenshot of it rendered if necessary but GitHub currently isn't letting me upload it",
"url": "https://github.mycorp.com/api/v3/orgs/myorg/repos",
"status": 503,
"headers": {
"content-length": "702301",
"content-type": "text/html",
"date": "Tue, 29 Nov 2022 17:17:07 GMT",
"etag": "6372c72f-ab75d",
"server": "GitHub.com"
}
}
Thanks! I bet this needs to check the response status. I'm guessing the GitHub client doesn't return an error for a non-200 response code. https://github.com/argoproj/argo-cd/blob/362abff610d81a4878e53cecb78dcb2902776f5b/applicationset/services/scm_provider/github.go#L48
I suspected the same, though I was looking here: https://github.com/argoproj/argo-cd/blob/362abff610d81a4878e53cecb78dcb2902776f5b/applicationset/services/scm_provider/github.go#L73
I see that this particular function call appears to return information about the request beyond just the parsed body (the resp variable). Perhaps that contains the HTTP status code that could be checked?
Hi @crenshaw-dev, are there any updates on this issue? We have a customer who has also run into the same issue, where an ApplicationSet using the SCM Provider generator is not working while other types of ApplicationSet are.
Describe the bug
When an ApplicationSet uses the SCM Provider generator pointed at an Organization in GitHub Enterprise and has generated Applications from it, putting that GitHub Enterprise instance into maintenance mode causes the ApplicationSet to delete all of the Applications it had previously generated.
Presumably this functionality extends beyond the scope of GitHub Enterprise--any unexpected error between the SCM Provider and its configured backend could result in all of its generated Applications being deleted.
To Reproduce
Expected behavior
When GitHub Enterprise is put into maintenance mode (or any other kind of unexpected state e.g. network issues), I'd expect an ApplicationSet using the SCM Provider generator that is pointed to said GitHub Enterprise instance to notice that something out of the ordinary is going on, perhaps mark the ApplicationSet as unhealthy and, most importantly, not delete all of the Applications that said ApplicationSet had generated.
Version
I do not have access to the argocd binary in question, but the version from the UI is v2.3.3