jedbrown / git-fat

Simple way to handle fat files without committing them to git, supports synchronization using rsync
BSD 2-Clause "Simplified" License

S3 backend and some Win32 cherry-picks from upstream #28

Open willkelleher opened 10 years ago

willkelleher commented 10 years ago

We're using git-fat at our company, and we found the two binary-mode commits necessary for the smudge filter to work on Windows platforms.

I also made a few changes that abstract the backend push/pull operations to allow for different implementations because we tend to store our files in S3.
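
As a rough illustration of what abstracting push/pull behind a backend could look like (this is not the code from this PR; the `Backend` and `DirectoryBackend` names and the method signatures are hypothetical), here is a minimal sketch. A plain directory stands in for the remote store; an rsync or S3 backend would implement the same two methods.

```python
import os
import shutil
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical interface: move objects between the local fat object
    directory and some remote store."""

    def __init__(self, objdir):
        self.objdir = objdir  # local directory holding fat objects, named by digest

    @abstractmethod
    def push(self, digests):
        """Upload the named objects; return the digests actually transferred."""

    @abstractmethod
    def pull(self, digests):
        """Download the named objects; return the digests actually transferred."""


class DirectoryBackend(Backend):
    """Stand-in remote that is just another directory (assumption for the sketch)."""

    def __init__(self, objdir, remote_dir):
        super().__init__(objdir)
        self.remote_dir = remote_dir

    def _copy(self, src_dir, dst_dir, digests):
        os.makedirs(dst_dir, exist_ok=True)
        copied = []
        for digest in digests:
            src = os.path.join(src_dir, digest)
            dst = os.path.join(dst_dir, digest)
            # Skip objects that are missing at the source or already present
            # at the destination.
            if os.path.exists(src) and not os.path.exists(dst):
                shutil.copyfile(src, dst)
                copied.append(digest)
        return copied

    def push(self, digests):
        return self._copy(self.objdir, self.remote_dir, digests)

    def pull(self, digests):
        return self._copy(self.remote_dir, self.objdir, digests)
```

With this shape, the calling code only needs to pick a `Backend` subclass from the configuration and call `push` or `pull`; it never needs to know which transport is in use.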

jedbrown commented 10 years ago

Sorry about the silence. I'm concerned about backward compatibility for .gitfat files. If compatibility must be broken, it should happen only once ever (I think named remotes are the way to go). The test suite currently fails:

$ ./test.sh 
+ git init fat-test
Reinitialized existing Git repository in /home/jed/src/git-fat/fat-test/.git/
+ cd fat-test
+ git fat init
Traceback (most recent call last):
  File "/home/jed/bin/git-fat", line 659, in <module>
    fat = GitFat()
  File "/home/jed/bin/git-fat", line 273, in __init__
    self.backend = self.get_backend(self.objdir)
  File "/home/jed/bin/git-fat", line 304, in get_backend
    raise RuntimeError('No supported backends specified in %s' % cfgpath)
RuntimeError: No supported backends specified in /home/jed/src/git-fat/fat-test/.gitfat

shakaran commented 9 years ago

Any progress on this? I am trying to use S3 with git-fat, but this is still unmerged and has conflicts. My only option at the moment is git-media.

judgeaxl commented 9 years ago

Have you considered just using s3cmd for the S3 version? It's almost equivalent to rsync, but for S3. I've got a first test in my fork. I initially started playing with boto too, but it felt too much like reinventing the wheel.

willkelleher commented 9 years ago

@jedbrown After seeing the HN discussion, I'll work on getting this PR updated. We've made some progress on our fork since this was created.

Regarding the config file format, do you have any specific preferences?

jedbrown commented 9 years ago

@willkelleher That's great news. For the config file, I really want named remotes in every way analogous to Git remotes. I think the syntax could be like

[remote "foo"]
  url = rsync://user@host/path/
[remote "bar"]
  url = s3://.......
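
A minimal sketch of how a file in this shape could be read, assuming Python's standard `configparser` is used (the thread does not say how git-fat would parse it). The `parse_remotes` and `scheme_of` helpers and the `s3://example-bucket/fat-objects` URL are hypothetical and only serve the illustration.

```python
import configparser
import re

# Matches Git-style section names such as: remote "foo"
REMOTE_SECTION = re.compile(r'^remote "(?P<name>[^"]+)"$')


def parse_remotes(text):
    """Return {remote name: url} for every [remote "name"] section with a url key."""
    # interpolation=None keeps '%' characters in URLs literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    remotes = {}
    for section in parser.sections():
        match = REMOTE_SECTION.match(section)
        if match and parser.has_option(section, "url"):
            remotes[match.group("name")] = parser.get(section, "url")
    return remotes


def scheme_of(url):
    """Return the URL scheme ('rsync', 's3', ...) or None for scp-like 'user@host:path'."""
    match = re.match(r"^([A-Za-z][A-Za-z0-9+.-]*)://", url)
    return match.group(1).lower() if match else None


# Hypothetical example file; the bucket name is made up for illustration.
EXAMPLE = """
[remote "foo"]
  url = rsync://user@host/path/
[remote "bar"]
  url = s3://example-bucket/fat-objects
"""

if __name__ == "__main__":
    for name, url in parse_remotes(EXAMPLE).items():
        print(name, scheme_of(url), url)
```

Dispatching on the scheme is one way a tool could choose a backend per remote, with scheme-less `user@host:path` URLs left free for another meaning.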

I don't know if I've talked with you about it, but I think it's worth making a Git backend for fat files (just tagged objects so they can be pushed and pulled independently), and I feel like `url = user@host:path/storer.git` should have that meaning (for consistency with normal Git remotes).

What do you think?

willkelleher commented 9 years ago

@jedbrown Do you mean you want to evolve git-fat into a custom Git object backend that would store the large file objects in some other backend but leave other files in the flat file object store?

I'm not very familiar with Git's support for custom backends, but it looks like libgit2 enables something relevant. How would you 'tag' the specific objects that you want to use custom storage vs. the traditional backend?

Let me know if I totally missed your point, but this sounds interesting.

jedbrown commented 9 years ago

@willkelleher I have noticed people using git-fat to store somewhat compressible files, such that the "dumb" object store takes a lot more space than a git packfile (and more network bandwidth). I also see an overhead associated with maintaining two access control lists (one for the Git repository and one for the fat object store). The mechanics can be done with Git's core command line tools or with libgit2. There is more discussion in Issue #1.

magec commented 8 years ago

Hi guys, I'm really interested in this feature. Is it usable as is now? I would like to migrate a current repo that uses git-fat to S3. Is this possible right now with this branch, without too much hacking?

gitfoxi commented 8 years ago

+1

zelonght commented 7 years ago

@magec You can try https://github.com/PersonifyInc/git-fat (a fork). We have used it for years and can push and pull files to S3 rather well. It also supports large files.