Open hTrap opened 4 years ago
From the scm link we need to implement capabilities:
Working directly with git objects is easier to implement, but it is not efficient. For example, pushing the Linux repository to Arweave would require ~7614576 Arweave data transactions, one per object. Pushing a single compressed packfile avoids this.
That is why our current implementation uses git's packfiles. On every push we compare the latest remote ref on Arweave with the local ref, find the objects that need to be pushed, generate a packfile from them, and then create an Arweave data transaction for it. Packfile size is not an issue, since Arweave supports data sizes up to 2^256 bytes. A git pull basically fetches all the packfiles from Arweave and checks out the latest ref. This is feasible for small to medium-sized repositories, but the overhead becomes too high for large repositories with a lot of pushes.
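To make the push step concrete, here is a minimal sketch of the "find the objects that need to be pushed" part. It is not the actual implementation: the object graph is a toy dictionary, and the names `reachable` and `objects_to_push` are made up for illustration. The idea is simply "everything reachable from the local ref, minus everything reachable from the remote ref".

```python
from collections import deque


def reachable(graph, tips):
    """Return every object id reachable from the given tips (breadth-first)."""
    seen = set()
    queue = deque(t for t in tips if t is not None)
    while queue:
        oid = queue.popleft()
        if oid in seen:
            continue
        seen.add(oid)
        queue.extend(graph.get(oid, ()))
    return seen


def objects_to_push(graph, local_ref, remote_ref):
    """Objects reachable from local_ref that the remote does not have yet."""
    return reachable(graph, [local_ref]) - reachable(graph, [remote_ref])


# Toy object graph (assumption): commit -> parent + tree, tree -> blobs.
graph = {
    "c2": ["c1", "t2"],
    "c1": ["t1"],
    "t2": ["b1", "b2"],
    "t1": ["b1"],
}

# The remote is at c1 and the local branch is at c2.
missing = objects_to_push(graph, "c2", "c1")
print(sorted(missing))  # ['b2', 'c2', 't2']
```

With real git, the same set would normally come from something like `git rev-list --objects <local> ^<remote>` piped into `git pack-objects --stdout`, and the resulting bytes would become the data of a single transaction.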
We can improve on the current implementation by also pushing an index that maps objects to packfiles, so that a pull does not need to download every packfile; it only needs to download the packfiles that the recursive object download depends on.
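A rough sketch of how a pull could use such an index follows. Everything here is hypothetical: `index` maps object id to the transaction id of the packfile containing it, `fetch_pack` stands in for downloading and unpacking one packfile, and the walk assumes that if an object is already present locally, everything it references is present too.

```python
def pull(want, have, index, fetch_pack):
    """Download only the packfiles needed to reach `want`.

    want: object id of the ref being pulled
    have: set of object ids already present locally
    index: {object_id: packfile_tx_id}
    fetch_pack: callable(tx_id) -> {object_id: [child_object_ids]}
    Returns the list of packfile tx ids that were downloaded, in order.
    """
    objects = {}      # objects unpacked so far: id -> children
    downloaded = []
    visited = set()
    stack = [want]
    while stack:
        oid = stack.pop()
        if oid in have or oid in visited:
            continue
        visited.add(oid)
        if oid not in objects:
            tx = index[oid]               # which packfile holds this object
            objects.update(fetch_pack(tx))
            downloaded.append(tx)
        stack.extend(objects[oid])        # recurse into referenced objects
    return downloaded


# Toy remote (assumption): two pushes, one packfile each.
packs = {
    "tx1": {"c1": ["t1"], "t1": ["b1"], "b1": []},
    "tx2": {"c2": ["c1", "t2"], "t2": ["b1", "b2"], "b2": []},
}
index = {oid: tx for tx, objs in packs.items() for oid in objs}

# A clone that already has the first push only needs the second packfile.
print(pull("c2", {"c1", "t1", "b1"}, index, packs.__getitem__))  # ['tx2']
```

A fresh clone (empty `have`) would still end up downloading both packfiles, so the index mainly helps incremental pulls.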
Is this the best approach or can we do better?
The current approach is fine, I think. Definitely use 2.0-style transactions so you don't end up with tens of thousands of transactions; it will all fit into a single one. This is what my implementation does, and it works fine.
Note to visitors: there's a git remote helpers script at https://github.com/gitopia/git-remote-gitopia-mvp . It has a slightly different featureset and protocol.
The current implementation uses the isomorphic-git CLI, which would be difficult to maintain. Git remote helpers are used to push source code to different storage remotes through a separate helper tool. More info here: https://git-scm.com/docs/gitremote-helpers
Some examples of git remote helpers:
https://github.com/axic/git-remote-mango
https://github.com/axic/mango
https://github.com/anishathalye/git-remote-dropbox
https://github.com/spwhitton/git-remote-gcrypt
https://github.com/aws/git-remote-codecommit
Ideally the solution should look like this so that we can add support for other storage solutions in the same remote helper
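For illustration, here is a minimal sketch of what a remote helper with a pluggable storage backend could look like. The command handling follows the gitremote-helpers documentation (`capabilities`, `list`, and batched `push` / `fetch` commands terminated by a blank line), but the `Storage` interface and `InMemoryStorage` class are invented for this example and are not taken from any of the projects above. A real helper would read from `sys.stdin`, write to `sys.stdout`, and its backend would upload and download packfiles.

```python
import io
from abc import ABC, abstractmethod


class Storage(ABC):
    """Hypothetical backend interface; one subclass per storage solution."""

    @abstractmethod
    def list_refs(self):
        """Return {ref_name: sha} for the remote."""

    @abstractmethod
    def push(self, src, dst):
        """Upload what is needed and point remote ref dst at local ref src."""

    @abstractmethod
    def fetch(self, sha, name):
        """Download the objects needed for sha into the local repository."""


class InMemoryStorage(Storage):
    """Toy backend standing in for Arweave or any other store."""

    def __init__(self, local_refs):
        self.local_refs = local_refs  # pretend local repo: {ref_name: sha}
        self.remote_refs = {}

    def list_refs(self):
        return dict(self.remote_refs)

    def push(self, src, dst):
        self.remote_refs[dst] = self.local_refs[src]

    def fetch(self, sha, name):
        pass  # a real backend would download packfiles here


def read_batch(first, stdin):
    """Collect a batch of commands; git ends the batch with a blank line."""
    lines = [first]
    while True:
        line = stdin.readline().rstrip("\n")
        if not line:
            return lines
        lines.append(line)


def run_helper(storage, stdin, stdout):
    def say(line=""):
        stdout.write(line + "\n")
        stdout.flush()

    while True:
        cmd = stdin.readline().rstrip("\n")
        if not cmd:  # blank line or EOF: nothing more to do
            return
        if cmd == "capabilities":
            say("push")
            say("fetch")
            say()
        elif cmd == "list" or cmd.startswith("list "):
            for name, sha in storage.list_refs().items():
                say(f"{sha} {name}")
            say()
        elif cmd.startswith("push "):
            for line in read_batch(cmd, stdin):
                # Format: push [+]<src>:<dst>
                src, dst = line[len("push "):].lstrip("+").split(":", 1)
                try:
                    storage.push(src, dst)
                    say(f"ok {dst}")
                except Exception as exc:
                    say(f"error {dst} {exc}")
            say()
        elif cmd.startswith("fetch "):
            for line in read_batch(cmd, stdin):
                _, sha, name = line.split(" ", 2)
                storage.fetch(sha, name)
            say()
        else:
            raise SystemExit(f"unsupported command: {cmd}")


# Demo: feed the helper the commands git would send during a push.
commands = "capabilities\nlist\npush refs/heads/main:refs/heads/main\n\nlist\n\n"
out = io.StringIO()
run_helper(InMemoryStorage({"refs/heads/main": "abc123"}), io.StringIO(commands), out)
print(out.getvalue())
```

Supporting another storage solution would then mean adding another `Storage` subclass and selecting it from the remote URL, while the protocol loop stays the same.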