ku1ik / bitpocket

"DIY Dropbox" or "2-way directory (r)sync with proper deletion"
http://ku1ik.com/2011/07/18/bitpocket-as-a-dropbox-alternative.html
MIT License

Use internal (local) git repo to version selected files #15

Open ku1ik opened 12 years ago

ku1ik commented 12 years ago

There are some feature requests related to versioning synced files. I don't plan to implement native versioning, because the goal of bitpocket was and still is to be a simple sync tool built on rsync.

However, I can see one possibly workable solution based on git:

bitpocket could have an internal, local git repository (e.g. hidden in .bitpocket/git) with your ~/BitPocket dir as its working tree. All files could be ignored by default, and you would run bitpocket track somefile to add a file to git and start versioning it. Later you could either run bitpocket snapshot at any time to commit all changes to tracked files, or run bitpocket sync --snapshot to commit and sync in one step. This local git repository would be transferred with rsync to other machines like all other files, so all of the versions would be accessible on all your machines.
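To make the proposal a bit more concrete, here is a minimal sketch of the git commands that such track and snapshot subcommands might run. It is not bitpocket code: the function names are made up for illustration, the paths are the ones from the paragraph above, and the repository is assumed to have been initialized already. The sketch only builds the argument lists, so nothing is executed.

```python
from pathlib import Path


def git_base(sync_dir):
    """Base git invocation: repo hidden in .bitpocket/git, work tree = sync dir."""
    root = Path(sync_dir)
    git_dir = root / ".bitpocket" / "git"
    return ["git", f"--git-dir={git_dir}", f"--work-tree={root}"]


def track_cmd(sync_dir, relpath):
    """Hypothetical `bitpocket track FILE`.

    Uses a forced add, because the proposal ignores every file by default.
    """
    target = Path(sync_dir) / relpath
    return git_base(sync_dir) + ["add", "--force", "--", str(target)]


def snapshot_cmd(sync_dir, message="bitpocket snapshot"):
    """Hypothetical `bitpocket snapshot`.

    `commit --all` stages modified and deleted files that git already
    tracks, and leaves untracked files alone.
    """
    return git_base(sync_dir) + ["commit", "--all", "-m", message]


if __name__ == "__main__":
    sync_dir = "/home/me/BitPocket"  # example location only
    for cmd in (track_cmd(sync_dir, "notes.txt"), snapshot_cmd(sync_dir)):
        print(" ".join(cmd))
```

Each list could be passed to subprocess.run as-is. Keeping the git directory under .bitpocket means the sync dir itself never gets a visible .git folder.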

This way the repository can stay reasonably small, because only the files you choose are tracked (and you definitely don't want to track mp3 or avi files). Files not tracked by git would still be synced to other machines as they are now.

You could also adjust the .gitignore file to, e.g., have all *.txt files tracked automatically, without having to run bitpocket track manually for each file.

Thoughts? Suggestions?

mindctrl commented 12 years ago

Interesting idea. I like the ability to specify which files to track individually, and the option for a wildcard to track *.txt.

It would be neat to have the option to flip that and default to tracking everything, with the option to exclude wildcards such as *.mp3 from tracking, kinda like .bitpocket/exclude does now for the sync function.
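Both defaults discussed so far (ignore everything and opt in, or track everything and opt out) map onto ordinary gitignore patterns. The sketch below generates the pattern lines for either mode. The function name and the track_by_default flag are made up for illustration; where the lines would be stored (a .gitignore in the sync dir, or the repository's own info/exclude file) is left open.

```python
def ignore_lines(patterns, track_by_default=False):
    """Return gitignore-style lines for the two modes discussed in this thread.

    track_by_default=False: ignore everything, then re-include `patterns`.
    track_by_default=True:  track everything except `patterns`.
    """
    if track_by_default:
        # Opt-out mode: the patterns are plain ignore rules.
        return list(patterns)
    # Opt-in mode: "*" ignores everything; "!*/" re-includes directories so
    # that git still descends into them (a file cannot be re-included when
    # its parent directory is excluded); "!pattern" re-includes matches.
    return ["*", "!*/"] + ["!" + p for p in patterns]


if __name__ == "__main__":
    print("\n".join(ignore_lines(["*.txt"])))
    print("---")
    print("\n".join(ignore_lines(["*.mp3", "*.avi"], track_by_default=True)))
```

In opt-in mode an explicit forced add (as in a manual track command) would still work for individual files that match no pattern.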

When you said you could run 'bitpocket snapshot' to commit changes from tracked files, did you mean that you could make changes to revisions and they would be synced/merged back into master? I'm kinda confused about that.

dkvdm commented 12 years ago

The problem with using git is that it does a complete sync of the repository, automatically syncing all revisions as well. This means that your usable size can be 20MB but the size of the repo can be as large as 1GB due to all revisions.

ku1ik commented 12 years ago

@flxfxp Note that such disk usage (20MB -> 1GB) is likely to result only from storing binary files, which you normally should not store in git. Of course you may want to version your PNG files or something, but I wouldn't expect such dramatic disk usage (correct me if I'm wrong).

ku1ik commented 12 years ago

@mindctrl bitpocket snapshot would just do git add + git commit to master. Sync could do one of two things:

dkvdm commented 12 years ago

@sickill sorry but I'm going to correct you :) The main reason a lot of people use Dropbox is to store any kind of data. For some people this means documents, mp3s, applications, program code, PSDs, designs, etc. The key element of Dropbox is that it doesn't judge: it works for any type of file, and by using a cloud revision system it keeps the actual Dropbox size small, whilst still keeping all revisions.

kibiz0r commented 12 years ago

I agree with @flxfxp on this. To be a true Dropbox replacement, you must be able to version any kind of file without incurring a significant overhead.

The way I picture it is this:

ku1ik commented 12 years ago

@kibiz0r Yeah, this is something that can work. The files tracked by git would be excluded from the rsync transfer and sent to the main repo with a normal git pull/push combo. Initially I thought that the git repo could be rsync'ed, but that could lose commits by resetting HEAD when branches have diverged on both ends. pull/push would be much safer, though it could require human intervention in case of file conflicts.
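A minimal sketch of the "exclude tracked files from rsync" part, under some assumptions: the tracked paths come from the output of git ls-files -z, the internal repo lives in .bitpocket/git as proposed at the top of this issue, and the function names are invented for illustration. bitpocket's real rsync invocation has more options than shown here, and the git pull/push step is not covered.

```python
def split_ls_files(output):
    """Split the NUL-separated output of `git ls-files -z` into paths."""
    return [p for p in output.split("\0") if p]


def rsync_cmd(src, dest, tracked):
    """Build an rsync command that skips the internal repo and tracked files.

    A leading "/" anchors each exclude pattern at the root of the transfer.
    Caveat: paths containing rsync wildcard characters (*, ?, [) would need
    escaping, which this sketch does not attempt.
    """
    cmd = ["rsync", "-a", "--exclude=/.bitpocket/git"]
    cmd += [f"--exclude=/{path}" for path in tracked]
    # The trailing slash on the source means "copy the directory's contents".
    cmd += [src.rstrip("/") + "/", dest]
    return cmd


if __name__ == "__main__":
    tracked = split_ls_files("notes.txt\0docs/todo.md\0")
    print(" ".join(rsync_cmd("/home/me/BitPocket", "host:BitPocket/", tracked)))
```

Because excluded files are neither transferred nor deleted by rsync (unless --delete-excluded is given), the versioned files would then move between machines only through git.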

rennis250 commented 12 years ago

Would this be a good project to work from?

https://github.com/karalabe/gitbox/wiki

ku1ik commented 12 years ago

@rennis250 I'm not sure, but I think not. Thanks for the idea, though. Gitbox is built around Dropbox, while bitpocket is built just around rsync.

rudolf commented 12 years ago

For git-like versioning of large files have a look at https://github.com/bup/bup

torfason commented 11 years ago

torfason@b7f82cbe343cac5cec0612f8f30cc32630310e17 provides another, somewhat different approach to this, although the idea there (and in torfason@06f56e669db0116332df9c72babd1b3808bc03bc) is more about data safety than about versioning.

That approach stores history in local git repositories on each client, and they are never synced. However, this could be adapted to run git on the server instead, creating a single, authoritative git repository there. I still think the git repositories should not be synced (as in bidirectional syncing); however, one could imagine a command that rsyncs the server repository to the client so you can look at the history.

frank-dspeed commented 8 years ago

Hello friends. Syncing and versioning via git is already completely implemented in git annex v6: you can check in files, run git annex sync, and so on (google it). I have also forked this script and will add git annex support to it. The good news is that you can use git annex with this script to speed up syncs even on really big repositories with millions of files; that's the use case where we found your script.

git annex works well but takes too long for our sync case.

git annex will finally add versioning of the backups to bitpocket and will make bitpocket one of the fastest 2-way sync tools for really large datasets.