cjb / GitTorrent

A decentralization of GitHub using BitTorrent and Bitcoin
MIT License

No Pull Requests in the Protocol #20

Open alexanderkyte opened 9 years ago

alexanderkyte commented 9 years ago

People use public version control in general because it offers free public hosting, but they use github specifically because of how easy it is to fork a new repo and because of how nice the pull request workflow is. Git will never be distributed in any useful sense as long as the entire project's merge process and history are centralized and without an easy export format.

I'd like to consider using git itself to track this kind of thing, kind of like a "side chain" running on top of git. In another branch for a given repo, the state of bugs and PRs could be tracked.

rom1504 commented 9 years ago

There are some thoughts about that in http://blog.printf.net/articles/2015/05/29/announcing-gittorrent-a-decentralized-github/ in the "Closing thoughts" part.

cjb commented 9 years ago

Yeah! I mentioned Secure Scuttlebutt instead of Bugs Everywhere-style commits inside the repo, because you probably don't want to give everyone who might contribute a PR (who you don't necessarily know!) push access to your Git repo.

rom1504 commented 9 years ago

Some ideas related to gitlab (a self-hosted github) http://feedback.gitlab.com/forums/176466-general/suggestions/4488044-add-import-export-functionality-for-projects

alexanderkyte commented 9 years ago

It would be nice to make some node-webkit app that pulled PRs and stuff from store and presented a UI for altering this stuff.

cjb commented 9 years ago

@alexanderkyte Yep, see #13 for that!

ralphtheninja commented 9 years ago

+1 for using secure scuttlebutt for this!

cjb commented 9 years ago

@ralphtheninja Do you happen to know how it might work? I don't really understand secure-scuttlebutt yet.

So we want to have this stream of issues/pull requests, but we don't want to send every issue/pull request to every network node. That means there needs to be a way to subscribe to a stream and avoid receiving gossip for streams you haven't subscribed to, and you shouldn't need an "invite" to be able to follow issues and PRs for a repo.

alexanderkyte commented 9 years ago

Does secure-scuttlebutt allow me to query for an arbitrary element? Because the problem with a lot of these subscription-based distributed stores is that they make it really hard to say "I want to get everything so I can filter by a certain attribute, and I'm gonna do that about 80 times in 2 minutes." and expect a result any time soon.

ralphtheninja commented 9 years ago

@cjb Sort of. Basically you can publish any data to some feed and you can track that feed. /cc @pfraze @dominictarr

ralphtheninja commented 9 years ago

And the data published to those feeds is signed.
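
To make the signed-feed idea concrete, here is a minimal sketch of an append-only feed in which each entry links to the hash of the previous entry and carries a signature. It uses only Node's built-in `crypto` module. The names `hashEntry`, `appendMessage` and `verifyFeed` are hypothetical, and the message layout is a simplification for illustration, not ssb's actual format.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hash of a whole entry (body plus signature).
function hashEntry(entry) {
  return crypto.createHash('sha256').update(JSON.stringify(entry)).digest('hex');
}

// Append a message to a feed (an array), linking it to the previous entry and signing it.
function appendMessage(feed, keys, content) {
  const previous = feed.length ? hashEntry(feed[feed.length - 1]) : null;
  const body = { previous, sequence: feed.length + 1, content };
  // For Ed25519 keys the algorithm argument to crypto.sign must be null.
  const signature = crypto
    .sign(null, Buffer.from(JSON.stringify(body)), keys.privateKey)
    .toString('base64');
  feed.push({ body, signature });
  return feed;
}

// Check every signature and every back-link in the feed.
function verifyFeed(feed, publicKey) {
  return feed.every((entry, i) => {
    const expectedPrev = i === 0 ? null : hashEntry(feed[i - 1]);
    const linked = entry.body.previous === expectedPrev && entry.body.sequence === i + 1;
    const signed = crypto.verify(
      null,
      Buffer.from(JSON.stringify(entry.body)),
      publicKey,
      Buffer.from(entry.signature, 'base64')
    );
    return linked && signed;
  });
}
```

With a key pair from `crypto.generateKeyPairSync('ed25519')`, anyone holding the public key can check the whole feed, and editing an earlier entry makes `verifyFeed` return `false`.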

pfrazee commented 9 years ago

@cjb ssb is a mesh network of user blockchains. Peers gossip the chains (feeds) they follow. The follow-model keeps you from downloading the whole dataset, but it means you need to be subscribed to a user to receive messages from them. It's a bit like distributed twitter.

@alexanderkyte you're right, the most direct solution would be to create a supernode that spiders and indexes the whole network. Blockchains link to the chains they follow and announce public nodes, so the user graph is relatively easy to crawl.

cjb commented 9 years ago

@pfraze Thanks! Please could you give a little more detail on how ssb might be applied to storing issues and pull requests for a bunch of repositories? At first I was thinking there'd be one ssb "user" per repository, but then I think you'd have to share that user/repo's private key with everyone who wanted to file a new issue/PR, if you want to be able to later request every issue/PR for a given repo. Does that sound right?

dominictarr commented 9 years ago

@cjb you should not share private keys. instead post an issue on your own feed, and the owner of the project would hopefully see it. (ssb has a "follow" mechanism like on twitter, and clients replicate your direct friends and their friends) so there is a good chance you would see it - programmer communities are smaller than you'd think.

cjb commented 9 years ago

@dominictarr Thanks. The reason I didn't see doing that as a good idea is that I'm expecting this operation to be needed efficiently:

pull(
  ssb.createHistoryStream(repo.id),
  pull.collect(function (err, ary) {
    ...

cjb commented 9 years ago

I guess you could set up an account for a repo, keep the key private, and have some kind of set up where it auto-follows anyone who requests a follow, who can then post the issue on their own feed..?

pfrazee commented 9 years ago

@cjb ssb gossips the dataset rather than pulling it on-demand. Therefore you have a local cache of your friends' messages, and you can efficiently do a scan such as:

pull(
  ssb.messagesByType({ type: 'git-issue' }),
  pull.collect(function (err, ary) {
    ...

The real problem is the connectivity of personal networks. If you're not following every user that may post an issue (or one of their friends) you won't receive their issue/PR messages. That's not a problem for a team or close-knit community, but it's not as easily global as, say, github.
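
As a rough illustration of why a local cache makes repeated queries cheap, here is a sketch that indexes already-replicated messages by type once and then filters in memory. It does not call ssb at all: `localCache`, `indexByType` and `scan` are hypothetical names, and the message fields are made up for the example.

```javascript
// Hypothetical local cache: messages already replicated from followed feeds.
const localCache = [
  { author: '@alice', type: 'git-issue', repo: 'repo1', title: 'Crash on clone' },
  { author: '@bob', type: 'post', text: 'hello' },
  { author: '@carol', type: 'git-issue', repo: 'repo2', title: 'Typo in README' },
];

// Build an index from message type to messages once, so repeated queries stay cheap.
function indexByType(messages) {
  const index = new Map();
  for (const msg of messages) {
    if (!index.has(msg.type)) index.set(msg.type, []);
    index.get(msg.type).push(msg);
  }
  return index;
}

// Return cached messages of a type, optionally narrowed by an attribute filter.
function scan(index, type, predicate = () => true) {
  return (index.get(type) || []).filter(predicate);
}
```

For example, `scan(indexByType(localCache), 'git-issue', (m) => m.repo === 'repo1')` returns only the issue filed against `repo1`, without touching the network.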

pfrazee commented 9 years ago

> I guess you could set up an account for a repo, keep the key private, and have some kind of set up where it auto-follows anyone who requests a follow, who can then post the issue on their own feed..?

Yes, you'd just need a reliable way to contact the bot

cjb commented 9 years ago

> Yes, you'd just need a reliable way to contact the bot

Thanks. If a GitTorrent repository's mutable key consisted of e.g.:

{
  "repositories": {
    "repo1": {
      "sha1": "aaaa..",
      "issues": "(ssb address)",
      "pullRequests": "(ssb address)"
    }
  }
}

Would that be good enough? Does an SSB address contain reliable contact information/everything you'd need to request a follow from that address?
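
A client reading such a record might look up the tracker feeds for a repository roughly like this. This is only a sketch: `trackerAddresses` is a hypothetical helper, and the record values are the same placeholders as in the proposal, not real addresses.

```javascript
// Example record in the shape proposed above; the values are placeholders.
const mutableKey = {
  repositories: {
    repo1: {
      sha1: 'aaaa..',
      issues: '(ssb address)',
      pullRequests: '(ssb address)',
    },
  },
};

// Hypothetical lookup: return the feed addresses advertised for a repository,
// or null if the record does not list that repository.
function trackerAddresses(record, repoName) {
  const repos = record.repositories || {};
  if (!Object.prototype.hasOwnProperty.call(repos, repoName)) return null;
  const repo = repos[repoName];
  return { issues: repo.issues, pullRequests: repo.pullRequests };
}
```

`trackerAddresses(mutableKey, 'repo1')` returns the two placeholder addresses, and an unknown repository name returns `null`.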

pfrazee commented 9 years ago

> Would that be good enough?

Enough for you to follow the bot, but not the reverse. The bot would need to be following one of your followers to get a message from you (via ssb)
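
The reachability constraint can be sketched as a two-hop walk over the follow graph, assuming (as described earlier in the thread) that a client replicates the feeds it follows plus the feeds those follow. The graph, `replicatedFeeds` and `receivesFrom` are hypothetical and only model the idea; they are not ssb code.

```javascript
// Hypothetical follow graph: each key follows the feeds listed in its array.
const follows = {
  bot: ['maintainer'],
  maintainer: ['alice'],
  alice: ['bob'],
  bob: [],
};

// Feeds a peer replicates: those it follows directly plus their follows (two hops by default).
function replicatedFeeds(graph, peer, maxHops = 2) {
  const seen = new Set();
  let frontier = [peer];
  for (let hop = 0; hop < maxHops; hop++) {
    const next = [];
    for (const id of frontier) {
      for (const followed of graph[id] || []) {
        if (followed !== peer && !seen.has(followed)) {
          seen.add(followed);
          next.push(followed);
        }
      }
    }
    frontier = next;
  }
  return seen;
}

// True if messages from `author` would reach `peer` under the two-hop assumption.
function receivesFrom(graph, peer, author) {
  return replicatedFeeds(graph, peer).has(author);
}
```

Here `receivesFrom(follows, 'bot', 'alice')` is `true` because the bot follows the maintainer, who follows alice, while `receivesFrom(follows, 'bot', 'bob')` is `false` because bob is three hops away.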

splinterofchaos commented 9 years ago

If one needs to follow a user to see their issue reports and pull requests, how do new contributors get everyone to follow them? Would it be possible to put join requests into the network, or would we have to share this info via a mailing list or something?

alexanderkyte commented 9 years ago

One of the general problems with software development is keeping track of progress when people work asynchronously. People file bugs in trackers that aren't directly linked to the source control, or their documentation has nothing to validate that the arguments documented are even still arguments to the function after a refactor, or git blame only returns "fixed that file bug" as the commit message for a given commit with no idea which bug it is.

I'm against storing this in a format that isn't strongly associated with the repository. It's all good and well to find the most clever, lowest-bandwidth, most secure way to make PRs, but that won't help you unless it's easier for people to use it than for them not to use it.

I think that having this PR/bug information in the source control itself is the best way to go. Let's say someone googles some FFI library that has a horrific bug in it, and they wait around for a fix. They return much later and see that it was fixed months ago. When you're at commit T, you want to be able to know if issue Y still holds at that commit. This repository metadata is as much part of the source control as the comments are.

You can track this in a separate database (consistency is hard, guys) or you can put it right in the git blockchain.
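
To illustrate the "does issue Y still hold at commit T" question, here is a sketch that replays issue events stored alongside commits. It assumes a simple linear history and an invented event shape; `history` and `openIssuesAt` are hypothetical, and a real repository would need to handle branches and merges.

```javascript
// Hypothetical linear history, oldest first; each commit may carry issue events.
const history = [
  { id: 'c1', events: [{ issue: 'Y', action: 'open' }] },
  { id: 'c2', events: [] },
  { id: 'c3', events: [{ issue: 'Y', action: 'close' }] },
];

// Replay events up to and including the given commit; return the set of open issues.
function openIssuesAt(commits, commitId) {
  const open = new Set();
  for (const commit of commits) {
    for (const ev of commit.events) {
      if (ev.action === 'open') open.add(ev.issue);
      if (ev.action === 'close') open.delete(ev.issue);
    }
    if (commit.id === commitId) return open;
  }
  throw new Error('unknown commit: ' + commitId);
}
```

With this history, issue `Y` is open at `c2` and closed at `c3`, and the answer comes from the repository history alone.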

cjb commented 9 years ago

> I think that having this PR/bug information in the source control itself is the best way to go.

I agree that this can be very nice, but the naive implementation requires giving write access to your Git repo to anyone who wants to file a bug/open a PR. Can we think of a decentralized way to improve on that?

alexanderkyte commented 9 years ago

I believe that closing a bug is semantically very much like making a merge. Someone opens a bug, the client makes a PR that adds an issue commit. Consensus happens on the bug existing or not. You could automate this with a bot that always merges in issue commits.

cjb commented 9 years ago

@alexanderkyte I'm asking how "the client makes a PR" works in a decentralized way. What exactly is communicated to whom, via which protocol?

alexanderkyte commented 9 years ago

I mean that message could happen over multiple channels. All the master repo needs is a hash. Here's a pseudo-protocol.

FILER:

* Merge upstream/master.
* Make new branch at upstream/master commit, add commit that makes new issue object.
* Share branch on gittorrent
* Post hash to irc/xmpp channel, with mention of bugbot name or send an email to an imap account the bugbot polls. (Email is actually really nice as a messaging system for a federated distributed system.) Or send a peercoin transaction with some marginal amount and let the bugbot refund it when the PR is merged.

MAINTAINER:

* Either review the issue/PR manually or allow it to be merged automatically if it meets criteria.
* Post the new master hash to the blockchain.
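
The bugbot's intake step from the pseudo-protocol above could be sketched like this. Everything here is an assumption for illustration: the bot name, the idea that the hash is a 40-character hex string, and the helper names `parseSubmission` and `collectSubmissions`. Fetching the branch and merging are left out.

```javascript
// Hypothetical bot name; any channel (IRC, XMPP, email body) could deliver the text.
const BOT_NAME = 'bugbot';

// Extract a 40-character hex commit hash from a message addressed to the bot.
function parseSubmission(text) {
  if (!text.includes(BOT_NAME)) return null;
  const match = text.match(/\b[0-9a-f]{40}\b/i);
  return match ? match[0].toLowerCase() : null;
}

// Collect the distinct hashes the maintainer side should review or auto-merge.
function collectSubmissions(messages) {
  const queue = [];
  for (const text of messages) {
    const hash = parseSubmission(text);
    if (hash && !queue.includes(hash)) queue.push(hash);
  }
  return queue;
}
```

Messages that do not mention the bot, or that carry no well-formed hash, are ignored, and a hash posted twice is queued once.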

cjb commented 9 years ago

Thanks. Agreed, it is hard to beat email..

alexanderkyte commented 9 years ago

Yeah, one of the nice things about email is that it lets each person be as paranoid about owning the stack as they want. I can make a gmail alias for a project and give it a filter, or I can spin up a formally-verified imap server in my nuclear bunker and request that all communication happens over GPG. And all the filer needs to be able to do is to paste a hash into an email.

Also email is pretty good for talking about things. The linux kernel's mailing list archives are probably easier to search than some github bugs.

tabbyrobin commented 9 years ago

Are you guys familiar with/have you considered Matrix? I'd like to suggest it as an alternative to the email-layer as @alexanderkyte proposed:

> * Post hash to irc/xmpp channel, with mention of bugbot name or send an email to an imap account the bugbot polls.

I feel like email is really great for human communication, but not so much for machine-to-machine communication. (I'm seeing the end goal as getting a list of pending/merged PRs in the UI.) Of course, email could still be used alongside this as an additional opt-in, for human notification.

Basically: Rather than paste the hash into an email, send the hash to the maintainer's node (or to the whole network?) in some JSON via Matrix. Fully p2p, and doesn't rely on external networks.
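
As a sketch of what "the hash in some JSON" might look like, the snippet below builds and reads a small announcement payload. The event type and field names are invented for this example and are not defined by Matrix or GitTorrent. Sending the payload over an actual Matrix room is not shown.

```javascript
// Hypothetical event type and field names; nothing here is defined by Matrix or GitTorrent.
const EVENT_TYPE = 'org.example.gittorrent.pull_request';

// Serialize a PR announcement, rejecting anything that is not a 40-character hex hash.
function buildAnnouncement(repo, hash, title) {
  if (!/^[0-9a-f]{40}$/.test(hash)) throw new Error('expected a 40-character hex hash');
  return JSON.stringify({ type: EVENT_TYPE, content: { repo, hash, title } });
}

// Parse a payload and return its content, or null if it is some other kind of event.
function readAnnouncement(json) {
  const event = JSON.parse(json);
  return event.type === EVENT_TYPE ? event.content : null;
}
```

A UI could then list pending PRs by collecting the `content` objects that `readAnnouncement` returns.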

Quoting a few things from their homepage:

In fact, I think this would be a great infrastructure for the whole stack of community-oriented features: PMRs, new issues/comments on issues, watch/star/fork... (A new issue or comment can be viewed as a PMR on the issues-repo. Same for editing a wiki. User profile information can also be exchanged in this way.)

Nemo157 commented 8 years ago

You may want to take a look at https://github.com/google/git-appraise, which stores the pull requests as part of the repository. Again, though, I'm not sure how a workflow without push access to the repository would work with this.