Closed adamburgess closed 8 years ago
Are you saying that https://github.com/jspm/github/pull/90 has broken the token case for downloading private Git? If so we should file a separate bug for that!!
With the protocol implementation, if you're not using it for anything else it may make sense to check it in here.
My worry with the approach is that it only provides https as the protocol, while I was under the impression we currently use the git:
protocol? Perhaps that is where the token-based approach works?
This is excellent work though - I'd love to get something in along these lines and in other places if it can work out. I guess sharing through a single package makes sense if it is used by the registry as well.
Are you saying that #90 has broken the token case for downloading private Git?
yes.
My worry with the approach is that it only provides https as the protocol, while I was under the impression we currently use the git: protocol?
Git has 3 methods: git+ssh (with pub/private keys), git+https (dumb), and git+https (smart). Github has disabled the dumb protocol. The smart protocol is extremely similar to the ssh protocol but using http authentication (this is what we use currently). It's also just as fast.
This PR implements one function of the smart protocol in javascript. (sorry if you already knew about git, explaining it anyway is easy)
I'll inline my code, will make bugs easier to fix if they pop up.
I think this should be all working now, I've tested with private repos (using username + token).
And yeah, you do need username. there are a few articles on github, e.g. https://help.github.com/articles/git-automation-with-oauth-tokens/ which specifically include username
Yeah, looks like this is a limitation of the git protocol. Note that this isn't Github's API, this is a Git operation.
Note that per discussion in #97 this has absolutely nothing to do API limits, private repos, or anything else. The only reason private repos are relevant here is because they're the only ones that need any authentication at all for doing git operations.
Now, all that said, this use of git ls-remote
can be replaced with a single Github API call (which should not require a username, only a token): https://developer.github.com/v3/git/refs/#get-all-references
here's the result for this repo: https://api.github.com/repos/jspm/github/git/refs
Perhaps that's worth considering.
Now, all that said, this use of git ls-remote can be replaced with a single Github API call
I believe that the reason the git protocol was used is because that it is faster than the API, at least that is what I thought this comment means: https://github.com/jspm/github/issues/66#issuecomment-138616583
Thanks @tamird the idea here is to move away from API calls entirely as they get rate limited. git remote-ls doesn't seem to be rate limited at all from what I can tell.
Ah yes - if the API was used, by default that would be 60 requests per hour unauthenticated. That would pretty much force everyone to use a token.
So yes, I guess we will need to bring back the username in the authentication process... https://github.com/jspm/github/issues/99.
Actually what if we used remote-ls for unauthenticated, and then the API for when authenticated??
I'm not sure what that comment means, but if we only want tags, there is also https://api.github.com/repos/jspm/github/git/refs/tags.
Git operations aren't rate limited? Seems hard to believe, but may it is so.
@guybedford since we can make a single API call to get the user name, why can't we just do it for the user instead of having them enter it (which is unusual for tools using oauth)?
@guybedford yeah, that'd work too.
@adamburgess am I right in thinking merging this would be blocked by https://github.com/jspm/github/issues/99 due to the API needing a username, but apart from that we'd be good to merge?
Yep, I think this would be ok to merge, apart from it completely failing if an auth token is set. I'd like to work on getting a fix for that, though. I should have some time tomorrow.
On 22 Jul 2016 1:17 AM, "Guy Bedford" notifications@github.com wrote:
@adamburgess https://github.com/adamburgess am I right in thinking merging this would be blocked by #99 https://github.com/jspm/github/issues/99 due to the API needing a username, but apart from that we'd be good to merge?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jspm/github/pull/95#issuecomment-234286635, or mute the thread https://github.com/notifications/unsubscribe-auth/AHILKL0P3rG1B45LxBT049SDTXvYkh2mks5qX42hgaJpZM4JDxmV .
Sure, thanks for the update. I've invited you as a collaborator so you can merge as you see fit.
It's taking a little bit of time to do this, mainly the handling of errors. Also, now that the API is involved, it'd make sense to get caching in as well. Thanks for inviting me! I'll probably get some tests in the PR too in that case, though I'm unsure of how exactly to to test private repos
That sounds like a great plan. Please just copy me in when you'd like for code review.
@adamburgess I'm wondering if we should just revert the previous authentication change. Do you think this would make sense?
Reverting the previous authentication change so that we can use a pure-node implementation of the git protocol? That seems backwards.
The previous authentication change allowed users to avoid giving real access to their repos to JSPM. You seem to be suggesting trading security for "cleanliness", which is a bad trade in my mind.
@tamird this is purely about the fact that we have a regression with private GitHub accounts, as described in the first response in https://github.com/jspm/github/pull/95#issuecomment-232007785.
Ah, so the other option is to move that operation to the API as described in
https://github.com/jspm/github/pull/95#issuecomment-232990527 https://github.com/jspm/github/pull/95#issuecomment-232992260
I can work on that if you'd like. I really would like to avoid reverting the authentication changes.
As discussed, the worry with the API approach is that it means introducing rate limiting for lookups and it will at least triple the bandwidth for lookups, which can already be quite large (eg try seeing what the output is for Bootstrap for all versions...).
@tamird could we not add the same PR we did to 0.16, to allow username and token or just token both as valid auth options?
it will at least triple the bandwidth for lookups
As mentioned in https://github.com/jspm/github/issues/99, that's not true if using two API calls, but it is true that it will increase the number of API calls.
By the way, it seems really odd that we're having to jump through all these hoops when we're really trying to do the more efficient thing from Github's perspective. Perhaps we could reach out to them and solicit feedback?
@guybedford sure, I will whip up a PR.
@tamird thanks so much for picking this up again, and understanding. For the size I just mean comparing the size of the output of https://api.github.com/repos/twbs/bootstrap/git/refs/tags to the size of the output of git ls-remote git@github.com:twbs/bootstrap
. This isn't a simple interface boundary, and this project does need to be highly optimized for network performance - so by its nature handling some complexity does go with that I think.
Note that size is not the only thing that matters. @adamburgess ran the numbers in https://github.com/jspm/github/issues/99#issuecomment-233007210.
@guybedford sorry for never getting this done, I'll push what I have soon, which works in the happy path
On 2 Aug 2016 3:21 AM, "Tamir Duberstein" notifications@github.com wrote:
Note that size is not the only thing that matters. @adamburgess https://github.com/adamburgess ran the numbers in #99 (comment) https://github.com/jspm/github/issues/99#issuecomment-233007210.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jspm/github/pull/95#issuecomment-236646685, or mute the thread https://github.com/notifications/unsubscribe-auth/AHILKIeR56GC1FogcvHVuZGWO8cNp9EYks5qbir9gaJpZM4JDxmV .
As discussed, the worry with the API approach is that it means introducing rate limiting for lookups and it will at least triple the bandwidth for lookups, which can already be quite large (eg try seeing what the output is for Bootstrap for all versions...).
The API gives you 5,000 requests per hour. That means you'll be able to do probably around 2000 installs before falling back to the git protocol
With gzip, the size of the API response is similar to the git protocol, but with one benefit: it can be cached. i.e. really small 304 not modified responses if done correctly.
@guybedford @tamird I think what would be best is to ask for github username and token. (or, ask for token, and dynamically fetch the username when testing it)
This allows:
if you only ask for token, then once the token gets rate-limited private repos would be unavailable if you ask for username and password, well, apart from security issue as no one really wants to give a third-party their password, you can't use the API
@adamburgess thanks so much for the update here. I definitely like the suggestion of username and token, with a possible automatic lookup - worth thinking about.
Are these changes compatible / complementary with the PR at https://github.com/jspm/github/pull/102?
username + password doesn't work for API (using plain HTTP authentication), so no. but I think it's a small change to make to ask for username + token
oh and FYI: when this finally gets merged I'll cut the entire thing down to just a few commits in total, not 23 ;)
Merged! Thanks for your great work here.
hm. sounded like @adamburgess wasn't quite ready for this to be merged.
wait--
._.
the title is [do not merge]
@adamburgess ah.
@adamburgess you have commit access, so feel free to clean up the branch as you see fit.
@adamburgess since you have commit you should set up travis, too.
On Tue, Aug 2, 2016 at 10:25 AM, Guy Bedford notifications@github.com wrote:
@adamburgess https://github.com/adamburgess you have commit access, so feel free to clean up the branch as you see fit.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jspm/github/pull/95#issuecomment-236920643, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdsPEomGAHaTkxgYbPmLbJeD6wLu1owks5qb1NOgaJpZM4JDxmV .
let's just hope no one cloned the 0.17 branch in the interim :)
@adamburgess it's probably not a good idea to force-push. You'll have to live with the dirty history :(
On Tue, Aug 2, 2016 at 10:30 AM, Adam Burgess notifications@github.com wrote:
let's just hope no one cloned the 0.17 branch in the interim :)
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jspm/github/pull/95#issuecomment-236922180, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdsPGiVpEYO1LMOK4jafunuvKUYp0yuks5qb1R0gaJpZM4JDxmV .
sigh, fine
how embarassing...the PR was mostly done, apart from error messages which are mostly unhelpful
It's only been 10 minutes, a force push is still ok within this window if you want to clean up the history. I don't mind doing it too quickly if you like.
ok, I've done that. I'll open another PR soon. right now it's a bit too late to continue fixing it, 12:42am and all :)
Thanks. No problem, when you can!
@adamburgess no pressure here, but I just wanted to check up if you plan to come back to this? If not I can go through and pull in these changes manually.
continued on from #94 (which used a loosely cobbled together npm:js-git and npm:nogit), this one uses a library I just made, npm:git-ls-remote-pure. If you like, I can inline it in this PR instead of using an external library - it's a single file. implementation is quite small, it fetches
https://github.com/user/repo.git/info/refs?service=git-upload-pack
and parses the response, which is what actual git does. this means no more git! (though jspm registry still uses git...)now, onto the problems the https git endpoint needs a username and password combo for private repositories (where password can/should be a token). this was broken by #90 - and note, native git will also fail here.
this can be fixed in two ways
login
property, which contains the token's username