Git server that generates repos on demand

imjasonh commented 5 years ago

Like #74 but with git

The dumb HTTP protocol is simplest:

To start, do something like serve a repo that always contains a single file at its root containing the current time, as if someone was always force-pushing this file just before you pulled.

Maybe do something interesting with history, like add a commit every time the repo is pulled so it looks like they're just (non-force-) pushing the current time right before you pull.

You could use this with go get to generate go modules on-demand, like generated client libraries.

imjasonh commented 5 years ago

The dumb protocol asks for commits and parents and the server can respond with either pack files (bundles of objects) or loose-format commit objects with parent references, which are then walked until the client already has the parent. A generated-on-demand git repo could generate an endless stream of parent commits which are each individually fetched forever. That sounds fun!
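A rough sketch of that parent walk from the client's side, assuming every commit arrives as a loose object. `fetch` is a hypothetical callable standing in for `GET objects/{xx}/{rest}`, and `limit` is an invented safety valve so an endless generated history can't keep the loop running forever. A real client also fetches trees and blobs, which this ignores.

```python
import zlib
from typing import Callable


def parse_loose_object(compressed: bytes) -> tuple[str, bytes]:
    """Inflate a loose object and split its `{type} {size}\\0` header from the body."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type.decode(), body


def commit_parents(commit_body: bytes) -> list[str]:
    """Return the parent SHAs listed in a commit's header lines."""
    parents = []
    for line in commit_body.split(b"\n"):
        if not line:  # a blank line ends the headers; the message follows
            break
        key, _, value = line.partition(b" ")
        if key == b"parent":
            parents.append(value.decode())
    return parents


def walk(tip: str, fetch: Callable[[str], bytes], have: set[str],
         limit: int = 1000) -> list[str]:
    """Follow parent links from `tip`, stopping at commits already in `have`."""
    fetched, queue = [], [tip]
    while queue and len(fetched) < limit:
        sha = queue.pop()
        if sha in have:
            continue  # the client already has this commit; stop walking here
        have.add(sha)
        obj_type, body = parse_loose_object(fetch(sha))
        if obj_type != "commit":
            raise ValueError(f"{sha} is a {obj_type}, not a commit")
        fetched.append(sha)
        queue.extend(commit_parents(body))
    return fetched
```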

imjasonh commented 5 years ago

When someone pulls:

User requests GET info/refs; respond with tip commit SHA of each branch:
```
{commit-sha}     refs/heads/master
```
User requests GET HEAD to figure out which of those is the "head" it should pull; respond with "master":
```
ref: refs/heads/master
```
(this can be static if there's only one branch)
User requests the tip commit object of the "head" branch (here, master); respond with the object data served from GCS:
```
GET objects/ca/82a6dff817ec66f44342007202690a93763949
```

any request to objects/ can be served directly from GCS, assuming we've written that object before.
need to figure out if 302-redirecting to GCS is allowed, or if we need to proxy the object data
objects/ paths are requested with the first two chars of the object SHA separated for probably-silly Git reasons. 🤷‍♂
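The three requests above could be routed with something like this minimal sketch for a single-branch repo. The `objects` dict is a stand-in for the GCS bucket, and `respond` and `object_path` are made-up names. It assumes `info/refs` separates the SHA and ref name with a tab, and it proxies object bytes itself, sidestepping the open 302 question.

```python
def object_path(sha: str) -> str:
    """Map a 40-char hex SHA to its loose-object path (first two chars split off)."""
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError(f"not a SHA-1 hex digest: {sha!r}")
    return f"objects/{sha[:2]}/{sha[2:]}"


def respond(path: str, tip_sha: str, objects: dict[str, bytes]) -> tuple[int, bytes]:
    """Return (status, body) for a dumb-protocol GET against a one-branch repo."""
    path = path.lstrip("/")
    if path == "info/refs":
        # One line per ref: tip SHA, a tab, then the ref name.
        return 200, f"{tip_sha}\trefs/heads/master\n".encode()
    if path == "HEAD":
        # Static, since there is only one branch.
        return 200, b"ref: refs/heads/master\n"
    if path.startswith("objects/"):
        parts = path.split("/")
        if len(parts) == 3 and len(parts[1]) == 2:
            data = objects.get(parts[1] + parts[2])  # rejoin the split SHA
            if data is not None:
                return 200, data
    return 404, b""
```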

To generate a commit SHA:

write a blob object to GCS containing the zlib-compressed contents (e.g., the current time)
write a tree object containing, effectively:
```
100644 blob {blob-sha}      the-time.txt
```

write a commit object containing:


    tree {tree-sha}
    author {name} <{email}> {time-in-seconds} {timezone offset}
    committer {^ditto}

    {commit message}


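* every object starts with a header like `tree {size}\0` (the object type, then the byte length of the contents), followed by the object contents.
* every object is written to an object named after the [SHA-1](https://golang.org/pkg/crypto/sha1/#example_New) of the uncompressed header plus contents; the stored data is the zlib-compressed form of those same bytes.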
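Here's a sketch of the blob/tree/commit steps. It hashes the uncompressed header plus contents and keeps the zlib-compressed bytes as what would be written to storage. One thing the `100644 blob ...` line glosses over: the tree object actually stored is binary, with each entry being `{mode} {name}\0` followed by the 20 raw SHA bytes. The author identity, commit message, and function names are placeholders.

```python
import hashlib
import zlib


def encode_object(obj_type: str, content: bytes) -> tuple[str, bytes]:
    """Return (sha, loose_bytes) for one Git object.

    The SHA-1 covers the uncompressed `{type} {size}\\0{content}`; the
    zlib-compressed form of those same bytes is what gets stored and served.
    """
    raw = f"{obj_type} {len(content)}".encode() + b"\0" + content
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)


def tree_content(entries: list[tuple[str, str, str]]) -> bytes:
    """Encode (mode, name, sha) entries in Git's binary tree format.

    Git expects entries sorted by name; this plain sort is only right
    when no entry is a directory.
    """
    out = b""
    for mode, name, sha in sorted(entries, key=lambda e: e[1].encode()):
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(sha)
    return out


def commit_content(tree_sha: str, when: int, message: str = "Update the time") -> bytes:
    """Build a parentless commit body; name and email are placeholders."""
    who = f"Time Bot <bot@example.com> {when} +0000"
    return (f"tree {tree_sha}\nauthor {who}\ncommitter {who}\n\n{message}\n").encode()


def make_time_commit(now: int) -> tuple[str, dict[str, bytes]]:
    """Build the blob, tree, and commit for a repo holding only the-time.txt."""
    blob_sha, blob = encode_object("blob", f"{now}\n".encode())
    tree_sha, tree = encode_object(
        "tree", tree_content([("100644", "the-time.txt", blob_sha)]))
    commit_sha, commit = encode_object("commit", commit_content(tree_sha, now))
    return commit_sha, {blob_sha: blob, tree_sha: tree, commit_sha: commit}
```

As a sanity check, `encode_object("blob", b"hello\n")[0]` comes out as `ce013625030ba8dba906f756967f9e9ca394464a`, the same ID `git hash-object` reports for that content.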

-----

If we don't care about simulating history, we can just generate and serve a blob/tree/commit each time. If we want to generate a commit every time the repo is pulled, we need to store some pointer to the current tip so we can point to it as a parent of the next commit (e.g., a GCS object called `tip` or something):

    tree {tree-sha}
    parent {parent-sha}
    author {name} <{email}> {time-in-seconds} {timezone offset}
    committer {^ditto}

    {commit message}



...then update the tip to contain the new tip's commit SHA when we respond. It's possible two concurrent fetches would race and one would respond with a tip that gets overwritten by the other, but whatever, what do you want from me, this is a silly hack.
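A sketch of the tip-pointer idea, with an in-memory class standing in for the GCS bucket and its `tip` object. `TipStore`, `commit_on_pull`, and the author identity are all invented for illustration. The lock only serializes pulls inside one process; with several server instances you'd presumably want a conditional write on the tip object instead, which is not shown here.

```python
import hashlib
import threading
import zlib
from typing import Optional


class TipStore:
    """Loose objects plus a pointer to the current tip commit."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}
        self.tip: Optional[str] = None
        self._lock = threading.Lock()

    def commit_on_pull(self, tree_sha: str, now: int) -> str:
        """Write a commit whose parent is the current tip, then advance the tip."""
        with self._lock:  # avoids the lost-update race within this process
            lines = [f"tree {tree_sha}"]
            if self.tip is not None:
                lines.append(f"parent {self.tip}")
            who = f"Time Bot <bot@example.com> {now} +0000"  # placeholder identity
            lines += [f"author {who}", f"committer {who}", "", "Update the time", ""]
            content = "\n".join(lines).encode()
            raw = f"commit {len(content)}".encode() + b"\0" + content
            sha = hashlib.sha1(raw).hexdigest()
            self.objects[sha] = zlib.compress(raw)
            self.tip = sha
            return sha
```

Each pull would call `commit_on_pull` with a freshly written tree and then serve the returned SHA from `info/refs`.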

imjasonh commented 4 years ago

I did some of this: https://gist.github.com/ImJasonH/85f48260448ae445facfa6b0fcf42551

imjasonh commented 4 years ago

Another version of this is a single repo that always reports new commits each time it gets a pull, like an infinite repo history.

It could do this by proxying a "real" repo on GH: push a new commit to it just before responding to each pull, then forward the pull to the just-updated repo.

imjasonh / ideas

Git server that generates repos on demand #77