TheTechTrap / Gitopia

Permanent versioning for your code.
https://gitopia.org
MIT License

Git Remote helper for arweave remote #1

Open hTrap opened 4 years ago

hTrap commented 4 years ago

The current implementation uses the isomorphic git CLI, which would be difficult to maintain. Git remote helpers let git push source code to different storage remotes through a separate helper tool. More info here: https://git-scm.com/docs/gitremote-helpers

Some examples of git remote helpers:
https://github.com/axic/git-remote-mango
https://github.com/axic/mango
https://github.com/anishathalye/git-remote-dropbox
https://github.com/spwhitton/git-remote-gcrypt
https://github.com/aws/git-remote-codecommit

Ideally the solution should look like this, so that we can add support for other storage solutions in the same remote helper:

$ git remote add arweave dgit://arweave/address/reponame
$ git push arweave master
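
For a URL with the dgit:// scheme, git looks for an executable named git-remote-dgit on the PATH and runs it with the remote name and the URL as arguments. A minimal sketch of how such a helper might split the URL into a storage backend, an address and a repository name. The names parse_dgit_url and DgitRemote are hypothetical, and the URL layout is assumed from the example above:

```python
from typing import NamedTuple
from urllib.parse import urlparse


class DgitRemote(NamedTuple):
    backend: str  # storage backend, e.g. "arweave"
    address: str  # account address on that backend
    repo: str     # repository name


def parse_dgit_url(url: str) -> DgitRemote:
    """Split dgit://<backend>/<address>/<reponame> into its three parts."""
    parsed = urlparse(url)
    if parsed.scheme != "dgit":
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    # The path looks like "/<address>/<reponame>"; drop empty segments.
    parts = [p for p in parsed.path.split("/") if p]
    if not parsed.netloc or len(parts) != 2:
        raise ValueError(
            f"expected dgit://<backend>/<address>/<reponame>, got {url!r}"
        )
    return DgitRemote(parsed.netloc, parts[0], parts[1])
```

Keeping the backend as the first URL component is what would let one helper dispatch to other storage solutions later.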

hTrap commented 4 years ago

From the scm link we need to implement capabilities:
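
For context, git sends commands to the helper on stdin, one per line, and the helper answers the capabilities command with one capability per line followed by a blank line. A rough sketch of that handshake, assuming the helper only advertises push and fetch (an assumption for illustration; which capabilities were chosen is not stated here). The function name handle is hypothetical:

```python
from typing import List

# Assumption: only these two capabilities are advertised in this sketch.
CAPABILITIES = ("push", "fetch")


def handle(command: str) -> List[str]:
    """Return the lines the helper would write in response to one command."""
    if command == "capabilities":
        # One capability per line; the empty string stands for the
        # blank line that terminates the list.
        return list(CAPABILITIES) + [""]
    raise NotImplementedError(f"unhandled command: {command}")
```

A real helper would read lines from sys.stdin in a loop, write each returned line followed by a newline to sys.stdout, and flush after every response.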

faza commented 4 years ago

Working directly with git objects is easier to implement, but it is not efficient. For example, pushing the Linux repository to Arweave would require ~7614576 Arweave data transactions to push all the objects. This can be avoided by pushing a single compressed packfile instead.

That's why our current implementation uses git packfiles. On every push we compare the latest remote ref on Arweave with the local ref, find the list of objects that need to be pushed, generate a packfile from them, and then create an Arweave data transaction for it. A large packfile is not a problem, since Arweave supports data sizes up to 2^256 bytes. A git pull basically fetches all the packfiles from Arweave and checks out the latest ref. This is feasible for small to medium-sized repositories, but the overhead becomes too high for large repositories with a lot of pushes.
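
The push step described above could be sketched roughly like this. It is not the actual implementation: the names pack_revs and build_packfile are hypothetical, and it assumes the git CLI is available and that the remote ref has already been fetched from Arweave as an object ID (None on the first push). Creating the Arweave transaction is left out.

```python
import subprocess
from typing import Optional


def pack_revs(local_ref: str, remote_oid: Optional[str]) -> bytes:
    """Build the revision list fed to `git pack-objects --revs`.

    Includes everything reachable from local_ref and, if the remote
    already has a ref, excludes everything reachable from remote_oid.
    """
    lines = [local_ref]
    if remote_oid:
        lines.append("^" + remote_oid)
    return ("\n".join(lines) + "\n").encode()


def build_packfile(repo_dir: str, local_ref: str,
                   remote_oid: Optional[str]) -> bytes:
    """Return one packfile holding only the objects missing on the remote."""
    result = subprocess.run(
        ["git", "pack-objects", "--revs", "--stdout"],
        cwd=repo_dir,
        input=pack_revs(local_ref, remote_oid),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

The bytes returned by build_packfile would then become the data of a single transaction per push.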

We can improve on the current implementation by also pushing an index that maps objects to packfiles, so that we don't need to download every packfile on each pull; we only need to download the packfiles that the recursive object download actually depends on.
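
A toy sketch of the index idea, to make the question concrete. The function packfiles_for and the shape of the index (object ID to packfile transaction ID) are assumptions, and the children callback stands in for parsing a downloaded object to find the objects it references:

```python
from typing import Callable, Dict, Iterable, Set


def packfiles_for(tip: str,
                  index: Dict[str, str],
                  children: Callable[[str], Iterable[str]],
                  have: Set[str]) -> Set[str]:
    """Walk objects from tip and collect the packfiles that hold the
    ones missing locally, skipping anything already in `have`."""
    needed: Set[str] = set()
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        oid = stack.pop()
        if oid in seen or oid in have:
            continue  # already visited, or already present locally
        seen.add(oid)
        needed.add(index[oid])       # packfile that contains this object
        stack.extend(children(oid))  # recurse into referenced objects
    return needed
```

With two pushes stored as packfiles tx1 and tx2, a clone that already has everything from the first push would only need tx2, while a fresh clone would need both.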

Is this the best approach or can we do better?

jespern commented 4 years ago

The current approach is fine, I think. Definitely use 2.0-style transactions so you don't end up with tens of thousands of transactions; it will all fit into a single one. This is what my implementation does and it works fine.

xloem commented 2 years ago

Note to visitors: there's a git remote helper script at https://github.com/gitopia/git-remote-gitopia-mvp . It has a slightly different feature set and protocol.