Closed Ginxo closed 1 year ago
so IIUC, the problem is the parallel run of checks right ?
I mean, from Drools8 repository, basically for kie-ci4
user, we would have 6 different parallel checks and each of them would make those calls (list would be different)
If now we setup a new environment as default PR checks, that means 12 different parallel checks ...
One solution, instead of having parallel checks would be to do a fdb instead, which means only one check per environment. But the issue is that the build time would increase a lot...
btw, how is done the checkout ? @Ginxo @shubhbapna
checkout is done in parallel by default, but there is a chance to sequentially do it just by adding skipParallelCheckout
flag either from CLI --skipParallelCheckout
see CLI inputs or GHA workflow input
my point is more:
@radtriste all these github api calls are used to gather the checkout information. Once we have that, we simply use git
to clone, merge etc. We don't use the token to clone
ok but those still are GH calls made with the token or ?
ok but those still are GH calls made with the token or ?
all GH APIs calls are done through the same octokit instance where GH token is added in case it is present, so yes... they are
so that makes more GH calls than the above ones ;)
@Ginxo I think Tristan is asking about the actual cloning of the repository. We don't use octokit for that we use the actual command line git. I am not sure if the git cli internally makes GH call (it probably doesn't) but either way that is again something we cannot avoid
I was thinking of a different approach to resolve this (I had briefly mentioned this to Tristan before):
Given most of our projects are public, I think we can use the github api without providing a token since the info we usually try to get is anyways public. (I tested a sample github api call to get list of pull requests without a token and it worked)
I was thinking of implementing a retry method for all our GH calls. We make the api call without a token. If it works great. If it doesn't and the error is 401 or 403 then we use the token and make the call again.
For our case all our GH calls should work without falling back to using token. This will greatly reduce the number of GH calls we make without having us to worry about the number of parallel checks running (at least for now). Wdyt? @Ginxo @radtriste
I was thinking of a different approach to resolve this (I had briefly mentioned this to Tristan before):
Given most of our projects are public, I think we can use the github api without providing a token since the info we usually try to get is anyways public. (I tested a sample github api call to get list of pull requests without a token and it worked)
I was thinking of implementing a retry method for all our GH calls. We make the api call without a token. If it works great. If it doesn't and the error is 401 or 403 then we use the token and make the call again.
For our case all our GH calls should work without falling back to using token. This will greatly reduce the number of GH calls we make without having us to worry about the number of parallel checks running (at least for now). Wdyt? @Ginxo @radtriste
sorry @shubhbapna but you are not considering the cases where repositories are private right? like https://github.com/kiegroup/droolsjbpm-build-bootstrap/blob/main/.ci/product-projects-dependencies.yaml Am I wrong?
letting default to no token is dangerous as GH will based the rate limit on IP address, which may be the same for a lot of tools running on PSI ... I would keep anyway the token Could we use instead a token pool ? I think that would worth it. For example, for now, we have:
With the pool, the build-chain would have multiple possibilities of tokens and try them, one after the other it there is a rate limit on one. wdyt ? we could even add more tokens if needed.
letting default to no token is dangerous as GH will based the rate limit on IP address, which may be the same for a lot of tools running on PSI ... I would keep anyway the token Could we use instead a token pool ? I think that would worth it. For example, for now, we have:
* Drools8 using kie-ci4 * Kogito using kie-ci3 * OP8 using kie-ci5
With the pool, the build-chain would have multiple possibilities of tokens and try them, one after the other it there is a rate limit on one. wdyt ? we could even add more tokens if needed.
it makes sense... what about changing https://github.com/kiegroup/github-action-build-chain/blob/2a020aba5ed5ebd15199447e728318f801a39497/src/domain/inputs.ts#L39 to token?: string[]
, change OctokitService to multion pattern (one per token) and getToken will return back one of them. We would have to document it in the way all token should have the same rights.
Additionally to it we can wrap any octokit call by a try catch block in order to retry up to X times (3?)
wdyt @shubhbapna ?
sorry @shubhbapna but you are not considering the cases where repositories are private right? like https://github.com/kiegroup/droolsjbpm-build-bootstrap/blob/main/.ci/product-projects-dependencies.yaml Am I wrong?
Hence the retry right? Either way we can pass an option to disable retry and use token directly for such cases
letting default to no token is dangerous as GH will based the rate limit on IP address, which may be the same for a lot of tools running on PSI
That's true...
With the pool, the build-chain would have multiple possibilities of tokens and try them, one after the other it there is a rate limit on one. wdyt ?
So if we had this before:
Drools8 using kie-ci4 Kogito using kie-ci3 OP8 using kie-ci5
After implementing the token pool will it be something like this:
Drools8 using kie-ci3, kie-ci4, kie-ci5 Kogito using kie-ci3, kie-ci4, kie-ci5 OP8 using kie-ci3, kie-ci4, kie-ci5
Or
Drools8 using kie-ci3, kie-ci4, kie-ci5 Kogito using kie-ci6, kie-ci7, kie-ci8 OP8 using kie-ci9, kie-ci10, kie-ci11
The first case will still result in rate limit issues IMO
As they will share the pool, it may be that sometimes OP needs more or Drools needs more (or Kogito) We can start with solution 1, sharing same tokens. and if not enough, we can still expand the number of tokens and redistribute.
I've been checking number of github API calls and this is the result
some of them are repeated, like
this.octokit.pulls.list
but they are required in order to check whether the are PR either from source or from target... So I don't think they can be reduced... @shubhbapna @radtriste