Open Vinoth414 opened 2 weeks ago
Heya - sorry, I don't quite understand what the issue is. Are you facing rate limits from the GitHub API? Could you also provide logs if they are relevant?
Let us consider an organization that has more than 1,400 repos, and I have given 2 branches to index in the config file. The process then calls a function to get all the branches in each repo and matches them with micromatch. After a few API calls to the branches endpoint, it starts to throw rate limit errors, and the specified branches are not indexed properly.
Ah, I see what you are saying - to confirm, have you specified a token in your config file? The GitHub docs specify that the rate limit for API requests is 5,000 per hour when a token is provided, but only 60 per hour when there is no token.
Regardless, I think the longer-term thing here is to source the set of branches & tags from the checked-out git repository (since the information will already be there). That way, we don't need to hit the list branches or list tags endpoints.
Hi, I have found a new way to check whether the branch is present or not. Shall I share it or commit the changes?
Sure - could you share your approach here?
Instead of checking branches here https://github.com/sourcebot-dev/sourcebot/blob/main/packages/backend/src/github.ts#L123 we will use the cloned repo and run the following command in the repo path: git ls-remote --heads origin <branches>. It returns the branches that are present, and it can also match wildcards like release/*.
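A rough sketch of that idea, in case it helps the discussion. It assumes the repo is already cloned with a remote named origin. The helper names (parseHeads, globToRegExp, matchBranches, listRemoteBranches) are made up for illustration and are not existing Sourcebot functions, and globToRegExp is only a simplified stand-in for micromatch that handles * and **. Since git ls-remote talks to the remote over the git protocol, it should not count against the REST API quota.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical helper: turn `git ls-remote --heads` output into branch names.
// Each output line has the form "<sha>\trefs/heads/<branch>".
function parseHeads(lsRemoteOutput: string): string[] {
    const prefix = "refs/heads/";
    return lsRemoteOutput
        .split("\n")
        .map((line) => (line.split("\t")[1] ?? "").trim())
        .filter((ref) => ref.startsWith(prefix))
        .map((ref) => ref.slice(prefix.length));
}

// Simplified stand-in for micromatch (assumption: only "*" and "**" are supported).
// "*" matches within one path segment, "**" matches across "/".
function globToRegExp(pattern: string): RegExp {
    const escaped = pattern.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
    const source = escaped
        .replace(/\*\*/g, "\u0000")
        .replace(/\*/g, "[^/]*")
        .replace(/\u0000/g, ".*");
    return new RegExp(`^${source}$`);
}

// Keep only the branches that match at least one user-provided pattern.
function matchBranches(branches: string[], patterns: string[]): string[] {
    const regexes = patterns.map(globToRegExp);
    return branches.filter((branch) => regexes.some((re) => re.test(branch)));
}

// Runs git inside the cloned repo; no GitHub REST API call is involved.
function listRemoteBranches(repoPath: string): string[] {
    const output = execFileSync("git", ["ls-remote", "--heads", "origin"], {
        cwd: repoPath,
        encoding: "utf8",
    });
    return parseHeads(output);
}
```

Usage would be something like matchBranches(listRemoteBranches(repoPath), ["main", "release/*"]), where repoPath is wherever the clone lives.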
When we index a larger GitHub organization with more than 800 repos, at this point https://github.com/sourcebot-dev/sourcebot/blob/main/packages/backend/src/github.ts#L123 we fetch the branches of every repo and match the branch names with micromatch. While fetching the branches we hit the secondary rate limit. So it would be better to add a condition that checks the user-provided branch names for micromatch patterns.
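One possible reading of that condition, as a sketch only: split the configured branch names into plain names and glob patterns, and list branches only when at least one glob is present. The names hasGlobSyntax and partitionPatterns are hypothetical, and the character check is deliberately conservative, so a false positive just falls back to listing branches.

```typescript
// Hypothetical helper: true when the pattern appears to contain glob syntax.
// Conservative on purpose: any of these characters triggers the glob path.
function hasGlobSyntax(pattern: string): boolean {
    return /[*?[\]{}()!|]/.test(pattern);
}

// Split user-provided branch patterns into exact names and globs.
function partitionPatterns(patterns: string[]): { literals: string[]; globs: string[] } {
    const literals: string[] = [];
    const globs: string[] = [];
    for (const pattern of patterns) {
        (hasGlobSyntax(pattern) ? globs : literals).push(pattern);
    }
    return { literals, globs };
}
```

With a config like ["main", "release/*"], only release/* would need the full branch list. A config with exact names only could skip the listing step, provided each name is still verified against the repo some other way.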
If you are OK with these changes, I am ready to contribute them, and I would also like to contribute other changes.