imjasonh / ideas

A place for me to file issues against myself for things I want to build when I'm bored

Git server that generates repos on demand #77

Open imjasonh opened 5 years ago

imjasonh commented 5 years ago

Like #74 but with git

The dumb HTTP protocol is the simplest option.

To start, do something like serve a repo that always contains a single file at its root containing the current time, as if someone was always force-pushing this file just before you pulled.

Maybe do something interesting with history, like add a commit every time the repo is pulled so it looks like they're just (non-force-) pushing the current time right before you pull.

You could use this with `go get` to generate Go modules on demand, such as generated client libraries.

imjasonh commented 5 years ago

In the dumb protocol, the client asks for commits and their parents, and the server can respond with either pack files (bundles of objects) or loose-format commit objects with parent references. The client walks those references until it reaches a parent it already has. A generated-on-demand git repo could generate an endless stream of parent commits, each fetched individually, forever. That sounds fun!

imjasonh commented 5 years ago

When someone pulls:

  1. User requests GET info/refs; respond with tip commit SHA of each branch:
    {commit-sha}     refs/heads/master
  2. User requests GET HEAD to figure out which of those is the "head" it should pull; respond with "master":
    ref: refs/heads/master

    (this can be static if there's only one branch)

  3. User requests the tip commit of the "head" branch (master); respond with the object data served from GCS:
    GET objects/ca/82a6dff817ec66f44342007202690a93763949
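
The three requests above can be sketched as a small Go handler. This is only a sketch under assumptions: the `repo` type, its `lookup` method, and the in-memory `objects` map (standing in for GCS) are hypothetical names, and the placeholder tip in `main` is not a real commit.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
)

// repo holds everything a dumb-protocol client will ask for.
// Both fields are assumed to be filled in elsewhere (e.g. read from GCS).
type repo struct {
	tip     string            // hex SHA of the commit at refs/heads/master
	objects map[string][]byte // hex SHA -> zlib-compressed loose object
}

// lookup maps a request path (relative to the repo root) to a response body.
func (r *repo) lookup(path string) ([]byte, bool) {
	switch {
	case path == "info/refs":
		// Step 1: one line per branch, "{commit-sha}\t{ref name}".
		return []byte(fmt.Sprintf("%s\trefs/heads/master\n", r.tip)), true
	case path == "HEAD":
		// Step 2: static, since there is only one branch.
		return []byte("ref: refs/heads/master\n"), true
	case strings.HasPrefix(path, "objects/"):
		// Step 3: objects/ca/82a6... -> ca82a6...
		sha := strings.ReplaceAll(strings.TrimPrefix(path, "objects/"), "/", "")
		data, ok := r.objects[sha]
		return data, ok
	}
	return nil, false
}

// ServeHTTP answers with the looked-up body, or 404 for anything unknown.
func (r *repo) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	body, ok := r.lookup(strings.TrimPrefix(req.URL.Path, "/"))
	if !ok {
		http.NotFound(w, req)
		return
	}
	w.Write(body)
}

func main() {
	// Placeholder values; a real server would generate the tip and its objects.
	r := &repo{tip: strings.Repeat("0", 40), objects: map[string][]byte{}}
	log.Fatal(http.ListenAndServe(":8080", r))
}
```

Anything the handler doesn't know about (for example a pack index lookup) gets a 404, which should be enough for a repo made only of loose objects.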

To generate a commit SHA:

  1. write a blob object to GCS containing the zlib-compressed contents (e.g., the current time)
  2. write a tree object containing, effectively:
    100644 blob {blob-sha}      the-time.txt
  3. write a commit object containing:

    tree {tree-sha}
    author {name} <{email}> {time-in-seconds} {timezone offset}
    committer {^ditto}

    {commit message}


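* every object starts with a header like `tree {size}\0` (in general `{type} {size}\0`), followed by the object contents.
* every object is stored zlib-compressed, in an object named after the [SHA-1](https://golang.org/pkg/crypto/sha1/#example_New) of its uncompressed header plus contents (the first two hex characters become the directory, the remaining 38 the file name)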

-----

If we don't care about simulating history, we can just generate and serve a blob/tree/commit each time. If we want to generate a commit every time the repo is pulled, we need to store some pointer to the current tip so we can point to it as a parent of the next commit (e.g., a GCS object called `tip` or something):

    tree {tree-sha}
    parent {parent-sha}
    author {name} <{email}> {time-in-seconds} {timezone offset}
    committer {^ditto}

    {commit message}



...then update the tip to contain the new tip's commit SHA when we respond. It's possible two concurrent fetches would race and one would respond with a tip that gets overwritten by the other, but whatever, what do you want from me, this is a silly hack.
imjasonh commented 4 years ago

I did some of this: https://gist.github.com/ImJasonH/85f48260448ae445facfa6b0fcf42551

imjasonh commented 4 years ago

Another version of this is a single repo that always reports new commits each time it gets a pull, like an infinite repo history.

It could do this by proxying a "real" repo on GH: push a new commit to it just before responding to a pull, then forward the pull to the just-updated repo.