Closed splinterofchaos closed 9 years ago
@splinterofchaos Thanks, looks great! Could I persuade you to adopt "JS Standard Style" for this diff? You can read more here, and fix up everything it complains about when you run it:
https://github.com/feross/standard
(I'm using the linter-js-standard plugin for the Atom.io editor to show these while I type.)
Also I wonder if instead of this:
for (var i = 0; i < target.length - 1; i++) {
path += target[i] + '/'
if (!fs.existsSync(path)) {
fs.mkdirSync(path)
}
}
you could do this:
target.forEach(function (segment) {
path += segment + '/'
if (!fs.existsSync(path)) {
fs.mkdirSync(path)
}
})
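One difference between the two versions: the manual loop stops at target.length - 1, so it skips the last segment, while a plain forEach visits every segment. A minimal sketch that keeps the original behaviour, assuming the last segment is a file name and should not become a directory. The helper name mkdirSegments, the sample ref name, and the temporary base directory are made up for illustration:

```javascript
var fs = require('fs')
var os = require('os')
var nodePath = require('path')

// Create every directory named by `segments` under `base`, one level at a time.
// Returns the path of the deepest directory.
function mkdirSegments (base, segments) {
  var current = base
  segments.forEach(function (segment) {
    current = nodePath.join(current, segment)
    if (!fs.existsSync(current)) {
      fs.mkdirSync(current)
    }
  })
  return current
}

// Hypothetical input: the last segment is treated as a file, so it is sliced off.
var target = 'refs/heads/master'.split('/')
var base = fs.mkdtempSync(nodePath.join(os.tmpdir(), 'mkdir-sketch-'))
var created = mkdirSegments(base, target.slice(0, -1))
```

Slicing before the forEach keeps the "all but the last segment" rule visible at the call site instead of hiding it in a loop bound.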
I rebased and added a few more commits. At first, this branch made git eagerly try to fetch every branch, but now we fetch a branch only when git tells us to. Mainly, git will only ask for HEAD and branches in refs/heads, and only if it doesn't already have them. The userProfile became too large to upload every branch, so for now I've disabled branches not starting with refs/heads.
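The filtering rule described above (keep HEAD and anything under refs/heads, drop the rest) can be sketched as follows. This is only an illustration of the rule, not the code in this branch; the function name advertisedRefs and the plain-object shape of the ref list are assumptions:

```javascript
// Keep only HEAD and refs under refs/heads/ from a map of ref name -> sha.
function advertisedRefs (refs) {
  var kept = {}
  Object.keys(refs).forEach(function (name) {
    if (name === 'HEAD' || name.indexOf('refs/heads/') === 0) {
      kept[name] = refs[name]
    }
  })
  return kept
}

// Sample input using a few refs of the kind git ls-remote reports.
var allRefs = {
  'HEAD': '61579b5ee99d9a51ad94ec14a205d4f96cefa6b5',
  'refs/heads/master': '61579b5ee99d9a51ad94ec14a205d4f96cefa6b5',
  'refs/pull/16/head': '4e440776743ac216d19a7fc53c83a0681fdbf45b',
  'refs/pull/26/merge': 'ae731139b21cac80ead9dc85c63e7aa8fe2ce26b'
}
var kept = advertisedRefs(allRefs)
```

With this input, only HEAD and refs/heads/master remain, which keeps the uploaded profile small at the cost of not hosting pull-request refs.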
One can use $ git ls-remote {name} to discover all the references it has, but if the name equals gittorrent://github.com/cjb/gittorrent, many of the shas listed may not actually be hosted on the network, since we consult GitHub, not the DHT.
This is a recent session:
tmp$ git clone gittorrent://github.com/cjb/gittorrent
Cloning into 'gittorrent'...
origin has:
Okay, we want to get HEAD: 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Okay, we want to get master: 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Adding swarm peer: 192.168.1.7:30000 for 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Adding swarm peer: 192.34.86.36:30000 for 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Downloading git pack with infohash: 5ba1e3c62379bd48b289b5251b9225323e68ed88
Receiving objects: 100% (177/177), 27.78 KiB | 0 bytes/s, done.
Resolving deltas: 100% (95/95), done.
git update-ref origin/HEAD 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
git update-ref origin/master 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Checking connectivity... done.
tmp$ cd gittorrent/
gittorrent git:master$ git pack-refs
gittorrent git:master$ cat .git/packed-refs
# pack-refs with: peeled fully-peeled
61579b5ee99d9a51ad94ec14a205d4f96cefa6b5 refs/remotes/origin/master
gittorrent git:master$ git ls-remote origin
origin has:
61579b5ee99d9a51ad94ec14a205d4f96cefa6b5 HEAD
61579b5ee99d9a51ad94ec14a205d4f96cefa6b5 refs/heads/master
4e440776743ac216d19a7fc53c83a0681fdbf45b refs/pull/16/head
90a17f1bb2dfa953898a9056d5778bd5ceaa08b4 refs/pull/19/head
7d324de111b7b5711e31ef73055f385434cbf513 refs/pull/26/head
ae731139b21cac80ead9dc85c63e7aa8fe2ce26b refs/pull/26/merge
5141cf86c3202f8cdd7ccb63776c571a2a707f75 refs/pull/27/head
8af3f08f4e515df2d7cd2f3334474a6bcf583ad8 refs/pull/27/merge
aee9dd69178f6f76ecc00ba99961e31fe140d1f6 refs/pull/28/head
c5e6b9c9bd0367813c27fee5b664ed94a1adb6aa refs/pull/28/merge
bfe8b17eec315d38070417bf45d5cbcbbb74dc18 refs/pull/29/head
ff97c6721aaa0daed208cf99013be177fb642bdb refs/pull/30/head
7f4f50262b15569a6e60fd69daf74aefd74bd081 refs/pull/33/head
b8249848a0655491b8e21629d240f0f761c4db2d refs/pull/33/merge
616feea3827c295a99653446b5d799d76a4cba3c refs/pull/34/head
593acd662e28eecccc4ff2a83a26727952cb7f0e refs/pull/34/merge
cce6953283ed2456d15ec63787259ce84a7063af refs/pull/7/head
gittorrent git:master$
To the best of my knowledge, $ git clone should produce the same tree for gittorrent repositories as other types.
Also I wonder if instead of this: [manual for loop]
:+1:
@splinterofchaos That's awesome, thanks! Do you think it's ready to merge in now?
BTW: here's the output for gittorrentd:
src$ ./GitTorrent/gittorrentd
in repo GitTorrent/.git/git-daemon-export-ok
GitTorrent/.git/
{"repositories":{"GitTorrent":{"HEAD":"5c16fe1bbb35fb8fdbc5f2063a9fdc5603845079"}}}
Announcing 5c16fe1bbb35fb8fdbc5f2063a9fdc5603845079 for branching on repo GitTorrent/.git/
Announcing 5141cf86c3202f8cdd7ccb63776c571a2a707f75 for glob-cwd on repo GitTorrent/.git/
Announcing 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5 for master on repo GitTorrent/.git/
{"repositories":{"GitTorrent":{"HEAD":"5c16fe1bbb35fb8fdbc5f2063a9fdc5603845079","refs/heads/branching":"5c16fe1bbb35fb8fdbc5f2063a9fdc5603845079","refs/heads/glob-cwd":"5141cf86c3202f8cdd7ccb63776c571a2a707f75","refs/heads/master":"61579b5ee99d9a51ad94ec14a205d4f96cefa6b5"}}}
errors= []
hash= e743222bc6010e5080cba6706a282e9d756977c8
errors= []
hash= e743222bc6010e5080cba6706a282e9d756977c8
Received handshake for 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
calling git pack-objects
exited
and I removed the line "origin has:" from git clone (it was redundant).
Do you think it's ready to merge in now?
I forgot I'd marked it WIP.
I would like to see it merged. There are still issues, like sending userData twice as seen above, and maybe a few miscellaneous things to work on, but, being a small and newer project, I don't think that's a bad thing and I'd like to see how it acts in the wild.
Let me know if you want me to change anything first, I'd be happy to.
Reaydi
---------- Forwarded message ----------
From: "Scott Prager" notifications@github.com
Date: 3 Jun 2015, 22:16
Subject: Re: [GitTorrent] [WIP] support branches (#34)
To: "cjb/GitTorrent" GitTorrent@noreply.github.com
Cc:
@splinterofchaos Looks good! Merged, and added you as a collaborator -- feel free to weigh in, have some ownership, etc! (I'll push a small change to make git-remote-gittorrent output a bit less verbose.)
@splinterofchaos This usually works, but I did get one failure:
λ git clone gittorrent://81e24205d4bac8496d3e13282c90ead5045f09ea/gittorrent
Cloning into 'gittorrent'...
Mutable key 81e24205d4bac8496d3e13282c90ead5045f09ea returned:
repositories:
gittorrent:
HEAD: 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
refs/heads/check-sha1: 7f4f50262b15569a6e60fd69daf74aefd74bd081
refs/heads/master: 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
recursers:
HEAD: 5fbfea8de70ddc686dafdd24b690893f98eb9475
refs/heads/master: 5fbfea8de70ddc686dafdd24b690893f98eb9475
Okay, we want to get HEAD: 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Okay, we want to get check-sha1: 7f4f50262b15569a6e60fd69daf74aefd74bd081
Okay, we want to get master: 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Adding swarm peer: 192.34.86.36:30000 for 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Downloading git pack with infohash: 5ba1e3c62379bd48b289b5251b9225323e68ed88
Receiving objects: 100% (177/177), 27.78 KiB | 0 bytes/s, done.
Resolving deltas: 100% (95/95), done.
git update-ref origin/HEAD 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
git update-ref origin/master 61579b5ee99d9a51ad94ec14a205d4f96cefa6b5
Checking connectivity... fatal: bad object 7f4f50262b15569a6e60fd69daf74aefd74bd081
fatal: remote did not send all necessary objects
It looks like our lookup for the 7f4f5.. sha1 failed, and we should retry it (issue #5) before moving on and asking Git to checkout.
Also, I guess this way means that for two branches that are near to each other, we download approx twice as much data as we need, because we get a full packfile from an empty repo for each branch?
I wonder if it would be worth getting HEAD first, and then asking for other branches and giving a "have (HEAD)". We can even parallelize it so we don't need to wait until we actually have the HEAD objects before we ask for a packfile from HEAD..somebranch, I think? Then as long as we wait to receive everything before we put it all together, we end up with every branch and don't have to download the same data a bunch of times. What do you think?
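As a rough sketch of the "have (HEAD)" idea: git pack-objects --revs reads revision arguments from stdin the same way git rev-list does, so a line prefixed with ^ excludes everything reachable from that commit. The helper below only builds that stdin text. The name packRevsInput is hypothetical, and how gittorrentd would pass the text to git pack-objects is left open:

```javascript
// Build the stdin text for `git pack-objects --revs --stdout`:
// one wanted revision per line, and each "have" prefixed with ^ to exclude
// the objects reachable from it.
function packRevsInput (wants, haves) {
  var lines = wants.concat(haves.map(function (sha) { return '^' + sha }))
  return lines.join('\n') + '\n'
}

// Ask for check-sha1 while declaring that we already have HEAD.
var head = '61579b5ee99d9a51ad94ec14a205d4f96cefa6b5'
var checkSha1 = '7f4f50262b15569a6e60fd69daf74aefd74bd081'
var revsInput = packRevsInput([checkSha1], [head])
```

A pack built from this input would contain only the objects on check-sha1 that are not reachable from HEAD, so the requester has to wait for both packs before handing control back to git.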
It looks like our lookup for the 7f4f5.. sha1 failed, and we should retry it (issue #5) before moving on and asking Git to checkout.
The todo variable I introduced increments only when a peer actually has the hash (each of the "Adding swarm peer" outputs). Perhaps it should increment when we look up a new hash. The only problem there is the possibility of an infinite wait on a hash that is on GitHub but not in the network. If we poll the DHT instead of GitHub for the references, then that shouldn't be a problem.
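One way to count a hash as soon as it is looked up without risking an infinite wait is to pair the counter with a timeout. This is a sketch under assumptions, not the existing todo logic: the PendingLookups name, its methods, and the explicit now arguments (passed in so the behaviour is easy to test) are all made up here:

```javascript
// Track hashes we have asked the network for, so the caller knows when
// everything has either arrived or been given up on.
function PendingLookups (timeoutMs) {
  this.timeoutMs = timeoutMs
  this.started = {} // sha -> time the lookup began, in ms
}

// Count a hash as outstanding as soon as we look it up,
// not when a peer answers.
PendingLookups.prototype.start = function (sha, now) {
  if (!(sha in this.started)) this.started[sha] = now
}

// The pack for this hash arrived.
PendingLookups.prototype.finish = function (sha) {
  delete this.started[sha]
}

// Drop lookups that have waited at least timeoutMs and return their hashes,
// so the caller can retry them or report a failure.
PendingLookups.prototype.expire = function (now) {
  var self = this
  var expired = Object.keys(this.started).filter(function (sha) {
    return now - self.started[sha] >= self.timeoutMs
  })
  expired.forEach(function (sha) { delete self.started[sha] })
  return expired
}

// Number of lookups still outstanding.
PendingLookups.prototype.count = function () {
  return Object.keys(this.started).length
}
```

The caller would exit only when count() reaches zero, and treat anything returned by expire() as a candidate for the retry discussed in issue #5.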
Also, I guess this way means that for two branches that are near to each other, we download approx twice as much data as we need, because we get a full packfile from an empty repo for each branch?
I wonder if it would be worth getting HEAD first, and then asking for other branches and giving a "have (HEAD)".
I've been thinking about this a little, but I need to learn more about how the DHT network and packfiles work.
So, if I have a repository with just master, then I make branch A with a few more commits, then branch to B, I could fetch just B and get the whole tree. If A, B, and master all point to the same sha, then we will already do the right thing.
If I have multiple branches, some of them might already be in master and master might be in some of them. Optimally, any time two shas have a common ancestor, C, I want to fetch HEAD..C and C..A/B/etc. separately and eliminate branches already included.
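The "eliminate branches already included" part can be sketched on a toy commit graph. This assumes the parent links are already known locally as a plain object, which is not true before anything has been fetched, so treat it as an illustration of the rule only. The names ancestorsOf and tipsToFetch and the graph shape are hypothetical:

```javascript
// Collect every commit reachable from `start` in a { sha: [parent shas] } graph.
function ancestorsOf (parents, start) {
  var seen = {}
  var stack = [start]
  while (stack.length > 0) {
    var sha = stack.pop()
    if (seen[sha]) continue
    seen[sha] = true
    var ps = parents[sha] || []
    for (var i = 0; i < ps.length; i++) stack.push(ps[i])
  }
  return seen
}

// Return the branch names whose tip is not already contained in another branch.
// When several branches share one sha, only the alphabetically first is kept.
function tipsToFetch (parents, branches) {
  var names = Object.keys(branches)
  return names.filter(function (name) {
    return !names.some(function (other) {
      if (other === name) return false
      if (branches[other] === branches[name]) return other < name
      return ancestorsOf(parents, branches[other])[branches[name]] === true
    })
  })
}

// master <- A (two commits) <- B (one commit), as in the example above.
var parents = { m1: [], a1: ['m1'], a2: ['a1'], b1: ['a2'] }
var needed = tipsToFetch(parents, { master: 'm1', A: 'a2', B: 'b1' })
```

Here only B needs fetching, because its history already contains A and master.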
I think that would be best, long term, but it might have a high implementation cost. Fetching HEAD and the other branches relative to HEAD would definitely be good right now and will probably be very close to optimal in most situations (slowest at cloning repositories with branches that have branches). Is the "have" and "want" stuff already implemented so that we're not sending the whole repository each time?
Since git checks for connectivity after we exit, I wonder if we don't have to worry about pulling in objects in the wrong order.
Yeah, I don't think order matters when pulling.
I think the actual right way to do this is described in issue #10 -- piping git-fetch-pack to git-upload-pack over the ut_gittorrent transport. I think that would result in one packfile per repo with every ref inside.
:+1: neat
From the commit message:
WIP because it sometimes hangs, get_infohash() gets called twice instead of once, might be a quirk or two, and it's not the cleanest code I've ever written. I have a lot to learn about JS, git internals, and DHTs. Still, this branch enables me to:
and when I have $ ./GitTorrent/gittorrent running, I can check out origin/pull/27, which I host.