jedbrown / git-fat

Simple way to handle fat files without committing them to git, supports synchronization using rsync
BSD 2-Clause "Simplified" License

Support files other than binary #53

Closed danpolanco closed 9 years ago

danpolanco commented 9 years ago

Perhaps it's my own mistake, but I haven't been able to get git-fat to work on, for example, text files.

The reason I want to do this:

I work with large genomic sequence files. These files can be as big as ~3GB in my case. For now I'm going to just tar them and store them, but it would be nice if I didn't have to do that so that I could make my research very easy to reproduce.

My workflow would be as follows:

Git add source code ---> Git fat add sequence data ---> commit

Boom! At any point, someone can reproduce the point in history where I was working. For now it is going to be as follows:

Git add source code ---> tar compress data ---> git fat add data ---> repeat for new data

The extra step seems unnecessary from my perspective. I tried git-annex but it was a bit complicated for my needs. I like how simple git-fat is, especially since I want other people to be able to easily reproduce the things I do.

jedbrown commented 9 years ago

Do you have a unique suffix for your FASTA files? If so, just add that pattern to .gitattributes. There is nothing about git-fat that only works for binary files, though Git's native compression would provide some nontrivial compression and if you have many substantially similar FASTA files, you might find that plain Git is good enough.
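A minimal sketch of the suggestion above, for illustration only. It assumes the FASTA files end in `.fasta` or `.fa` (hypothetical suffixes, not stated in this thread) and that the attribute string follows the `filter=fat -crlf` form used in git-fat's README examples. The helper names `fat_attribute_lines` and `add_fat_patterns` are made up for this sketch and are not part of git-fat.

```python
from pathlib import Path

# Attribute string in the style of git-fat's README examples
# (e.g. "*.png filter=fat -crlf"); treat it as an assumption here.
FAT_ATTRS = "filter=fat -crlf"


def fat_attribute_lines(suffixes):
    """Build one .gitattributes line per suffix, e.g. '*.fasta filter=fat -crlf'."""
    return [f"*.{s.lstrip('.')} {FAT_ATTRS}" for s in suffixes]


def add_fat_patterns(repo_dir, suffixes):
    """Add missing git-fat patterns to <repo_dir>/.gitattributes.

    Returns the list of lines that were newly added (empty if none).
    """
    path = Path(repo_dir) / ".gitattributes"
    existing = path.read_text().splitlines() if path.exists() else []
    new_lines = [ln for ln in fat_attribute_lines(suffixes) if ln not in existing]
    if new_lines:
        # Rewrite the file so every entry ends up on its own line.
        path.write_text("\n".join(existing + new_lines) + "\n")
    return new_lines


if __name__ == "__main__":
    # Hypothetical suffixes; replace with whatever the sequence files use.
    for line in add_fat_patterns(".", [".fasta", ".fa"]):
        print("added:", line)
```

With those patterns in place, files matching them would go through the fat filter on an ordinary `git add`, with no tar step, provided git-fat has been initialized in the repository.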

danpolanco commented 9 years ago

Ok. I tried it before and I couldn't get it to work. Then I tried it with a compressed file and it worked. But! I had been messing with both git-annex and git-media, so maybe I didn't clean everything out correctly. I'll try again.

jedbrown commented 9 years ago

I'm closing this issue for now. If you have problems, please reopen and include a reproducible test case. Thanks.

danpolanco commented 9 years ago

I doubt I will. I think I figured it out. Thanks for your help.

danpolanco commented 9 years ago

It works! Brilliant. It was my own mistake. Thanks again. This works brilliantly for reproducible research.

Have a great day :grin: