componentjs / component

frontend package manager and build tool for modular web applications
https://github.com/componentjs/guide
MIT License
4.55k stars 306 forks

Github rate limiting #546

Closed. ukd1 closed this issue 9 years ago.

ukd1 commented 10 years ago

Due to the changes here, we're repeatedly hitting the rate limit during CI of our projects. This means we're struggling to use component.

  1. Is this really necessary?
  2. Is there a way to disable it?
  3. Are we doing something massively dumb?

Russ

dominicbarnes commented 10 years ago

At our company, we have a shared "machine" GitHub account that we use for things like CI. If you opt for doing the same thing, you can generate a token for that account and use it for your CI, as opposed to using a developer's credentials.

Since component uses GitHub directly instead of an external registry, there's no way around using their API for a great many features (like resolving tags/versions for a given component). As a result, each component client likely needs to authenticate with GitHub to get past their rate limiting.

ukd1 commented 10 years ago

@dominicbarnes yeah, we've just set that up and instantly hit it again trying to build everything; it's only a 5k limit. I think we'll have to have a separate account per project, which seems like a non-ideal solution.

It also seems a poor choice that this is not a token, but username + password.

dominicbarnes commented 10 years ago

Oic, yeah I haven't started using this for CI just yet. (still just working on converting everything)

Fwiw, you can use a token instead of username/password if you use .netrc. But yeah, the 5k limit now is standing out to me as a distinct possibility for us to hit too. :(
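For reference, a ~/.netrc entry along those lines might look like this (the login and token values below are placeholders, not real credentials):

```
machine api.github.com
  login your-github-username
  password <personal access token>
```

Tools that honor .netrc will then send the token as the password when talking to api.github.com, so nothing has to be hard-coded into the build.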

airportyh commented 10 years ago

I just check in all my dependencies, which gets around this issue.

jonathanong commented 10 years ago

yeah or pin your stuff.

thinking of just creating a proxy to do all this stuff. this auth and remote stuff is super annoying

dominicbarnes commented 10 years ago

Yeah, the more I think about it a caching proxy would be a really elegant solution. (I'm sure Github and Bitbucket will appreciate it too)

I guess the biggest problem is getting it hosted.

chemzqm commented 10 years ago

Since GitHub responses frequently break for me, I've made an API that checks out all the component stuff from GitHub to my local machine, and a local file resolver that copies the files instead of making remote calls. It only takes milliseconds to rebuild :).

One thing I advise is to stick to semver versions as much as possible in the app's component.json, instead of using * to indicate that any version should work. component pin and component update are quite useful for that, and would save us some extra remote calls.
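To illustrate the advice above, a component.json with pinned semver versions rather than * might look like this (the dependency names and version numbers here are examples only):

```
{
  "name": "app",
  "dependencies": {
    "component/emitter": "1.1.3",
    "component/dom": "1.0.8"
  }
}
```

Per the comment, component pin and component update can then be used to manage these versions, avoiding the extra remote calls needed to resolve open-ended ranges on every build.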

ukd1 commented 10 years ago

These all feel like worse solutions than before 1.0.0. It just worked then; this is a step back. It's complex, insecure, and breaks if you use it a lot.

Tokens (#547) would be nice.

anthonyshort commented 10 years ago

I really don't know why we don't just store all the package names/locations/tags on a server somewhere and call that instead. Hell, the clients could just download and update that locally, all the resolution would be done locally, and then they'd just fetch the files from github old-school. When you run crawl it could post the updates to that server.

I could even try and build it today.

Using the API to get tags and files is going to cause lots of issues like this unless Github removes the rate limiting.

ukd1 commented 10 years ago

That or ask @github for an exception, somehow?

sankargorthi commented 10 years ago

:+1: @anthonyshort

tj commented 10 years ago

their ratelimiting is pretty ridiculous; I've contacted them about it a few times, but it seems like they're not going to change it anytime soon, and even authenticated users get ratelimited pretty hard. If we did a proxy like @jonathanong mentioned, something like component.io/github.com/foo/bar/ (just an example), it would make keeping things up to date a little nicer than trying to mirror them all

anthonyshort commented 10 years ago

We'd need to have some sort of publish step with that, wouldn't we? I don't really mind having to do that, it's better than the other options :P

anthonyshort commented 10 years ago

@jonathanong have you started on this? I could take a look at it

tj commented 10 years ago

if we sent all requests through component.io we wouldn't need a publish step, no need to deal with extra accounts like npm whooppp

tj commented 10 years ago

I guess GH would still ratelimit us to ~5000 through component.io though, so that could still be a problem. It's kinda dumb that if you pay for a large account you still get 5000/h.

vendethiel commented 10 years ago

can't you message them if you have a paid account? Seems like they could allow a bit more

tj commented 10 years ago

tried already; 5000 is probably fine for now though, we don't even have that many components

anthonyshort commented 10 years ago

Yeah that's why I thought you'd need the "publish" step to tell it about the tags so it doesn't need to do any fetching.

tj commented 10 years ago

I know @jonathanong wanted to proxy through component.io to normalize the endpoints anyway, which sounds like a good plan to me. Cache duration would definitely be an issue if we don't have a publish step, but I really dislike publish steps :( I'm sure we can find a way around it

jonathanong commented 10 years ago

https://github.com/normalize/proxy.js

i have the backend dependency tree thing going already, but not the server yet.

jonathanong commented 10 years ago

noooooo! no publishing step!

tj commented 10 years ago

we can set stuff up in our clustaAAaaa

anthonyshort commented 10 years ago

just looked at the way bower does it: it calls git ls-remote --tags git@github.com:ripplejs/ripple.git to get the tags. It's slow, but you'd cache them. Remote adapters would create the git URL.

Just another option.
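To sketch what that looks like in practice: git ls-remote --tags lists a repository's tags without cloning it, and works the same over ssh/https as against a local path. Since we can't assume network access here, this example builds a throwaway local repo with two lightweight tags and queries it; in real use the URL would be something like git@github.com:ripplejs/ripple.git.

```shell
set -e
# Throwaway repo with two tags, standing in for a remote on GitHub
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m 'initial commit'
git -C "$repo" tag 0.1.0
git -C "$repo" tag 0.2.0

# ls-remote prints "<sha>\trefs/tags/<name>" per tag; keep just the names
tags=$(git ls-remote --tags "$repo" | sed 's|.*refs/tags/||')
echo "$tags"
```

Because this goes over the git transport rather than the REST API, it isn't subject to the API rate limit, which is what makes it attractive for a caching proxy.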

regular commented 10 years ago

How about using git fetch to update the proxy's cache and then libgit2 bindings to find out about tags etc.? I had an email conversation six months ago about frequent problems with raw.github.com, and their reply basically was: please don't use it, use git clone instead, we have much more server capacity to support the git clone use case. So I would guess there is no rate limit at all on git fetch.

bodokaiser commented 10 years ago

just as a help for people who get 404/"dependency could not be resolved" errors when using authentication: you must use an api token as password!

andreasgrimm commented 10 years ago

(maybe slightly OT) Currently I'm hitting the GitHub API limit. On my local developer machine, when I use the CLI component-build command and don't yet have remote dependencies installed, the first run downloads them from GitHub into the 'components' folder. All subsequent calls don't seem to hit GitHub's API anymore.

Due to the gulp-component npm module not working for me (it outputs nothing at all), in a gulp task I implemented the example shown here: https://github.com/component/builder2.js with the effect that apparently each build hits GitHub's API.

Can you tell me how I'd achieve the same behaviour as the CLI component-build when incorporating the resolver manually?

[edit] Is it the resolver's local option I have to set to true?

timaschew commented 9 years ago

Closing.
Tokens are supported now, and the rate limit behaviour is documented and logged.

Even for travis-ci you can set up a secured token as an env variable. You need to configure it for each repository; this seems to be a good strategy, not just for travis.
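As a sketch of that Travis setup (the variable name GITHUB_TOKEN and the encrypted value are placeholders; the secure entry is produced per-repository by the travis CLI's encrypt command):

```
# .travis.yml fragment
env:
  global:
    # generated with: travis encrypt GITHUB_TOKEN=<token> --add env.global
    - secure: "<encrypted GITHUB_TOKEN value>"
```

The token is then available to the build as an ordinary environment variable, without ever appearing in plain text in the repository.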