sctchoi / gail

Automatically exported from code.google.com/p/gail

Git and binary files #1

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Problem: The current size of GAIL version 1 is about 70MB. 

Thanks to Jagadeeswaran for pointing out that Git handles binary files poorly.  
In our workflow we need to store binary files in a common location with our 
text files.  These binaries may be .pdf files of papers, .eps files used to 
create our papers and presentations, or .mat files of output.  So we need to 
find a solution.

After some searching online, I came across these tools:

    git-annex http://git-annex.branchable.com/
    git-fat https://github.com/jedbrown/git-fat
    git-bigfiles http://caca.zoy.org/wiki/git-bigfiles

Also, there is some discussion of using git filter-branch to really remove a 
file from the repository history, since ignoring a file does not remove it:
http://gitready.com/beginner/2009/03/06/ignoring-doesnt-remove-a-file.html
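
As a rough illustration of the filter-branch approach, the Python sketch below 
builds and runs the command that drops one path from every commit. The function 
names and the example path are hypothetical, and the sketch assumes git is 
installed and that it is run on a backup clone, since it rewrites history.

```python
import shlex
import subprocess


def filter_branch_command(path):
    """Build the git filter-branch call that drops `path` from every commit."""
    # --index-filter rewrites each commit's index without checking out files;
    # --ignore-unmatch keeps the filter from failing on commits lacking the file.
    index_filter = "git rm --cached --ignore-unmatch " + shlex.quote(path)
    return ["git", "filter-branch", "--index-filter", index_filter, "--", "--all"]


def purge_file(repo_dir, path):
    """Run the rewrite inside `repo_dir` (assumption: a disposable clone)."""
    subprocess.run(filter_branch_command(path), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Hypothetical path, printed only; nothing is rewritten here.
    print(filter_branch_command("output/results.mat"))
```

After such a rewrite, every other clone still holds the old history, so the 
rewritten branches have to be force pushed and re-cloned by collaborators.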

We need someone to work out a pleasant solution for us that is easy to continue 
using. 

Original issue reported on code.google.com by sou.chen...@gmail.com on 19 Sep 2013 at 6:01

GoogleCodeExporter commented 9 years ago
Xuan is repacking the GAIL_Dev repository to remove big binary files from the 
history. The GAIL_Dev repository is frozen between 9/26 and 9/27.

Original comment by sou.chen...@gmail.com on 27 Sep 2013 at 4:49

GoogleCodeExporter commented 9 years ago
Suggested steps:

(Step 1) Clone a local GAIL_Dev repository as a backup in case Step 2 fails. 
Note the CPU and/or clock time of cloning and the repository size.

(Step 2) Repack the Git repository.

(Step 3) Clone a fresh downsized copy of GAIL_Dev to a local machine and note 
the CPU and/or clock time of cloning and the new size.

(Step 4) Check that all unit and doc tests pass in the local repository.

(Step 5) Update this ticket with the detailed steps for Step 1 and the 
observations for Steps 2 and 3. Change the "Status" from "Accepted" to "Fixed" 
if the solution is considered satisfactory. 
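
The measurements asked for in Steps 1 and 3 could be recorded with something 
like the Python sketch below. It is only a sketch: the function names are 
hypothetical, it records wall-clock time rather than CPU time, and it assumes 
git is on the PATH and that the destination directory does not exist yet.

```python
import os
import subprocess
import time


def directory_size_bytes(root):
    """Sum the sizes of all regular files under `root`, including .git."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):  # skip symlinks to avoid double counting
                total += os.path.getsize(full)
    return total


def timed_clone(url, dest):
    """Clone `url` into `dest`; return (wall-clock seconds, size in bytes)."""
    start = time.perf_counter()
    subprocess.run(["git", "clone", url, dest], check=True)
    elapsed = time.perf_counter() - start
    return elapsed, directory_size_bytes(dest)
```

Running timed_clone once before the repack and once after gives the two pairs 
of numbers to report in the ticket.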

Original comment by sou.chen...@gmail.com on 27 Sep 2013 at 5:03

GoogleCodeExporter commented 9 years ago
(Step 1) The backup local repository was cloned at 15:11:36, 9/26/2013. The 
size was 278MB.

(Step 2) I followed the procedure in

http://git-scm.com/book/en/Git-Internals-Maintenance-and-Data-Recovery#Removing-Objects

to remove some big .mat files from the history. However, I did a rebase, which 
created many duplicate commits; I think I should have done a force push 
instead. The duplicate commits made the commit tree really messy, and I was not 
able to remove them. So for now I have removed the entire history of the 
repository and pushed the backup repository.

(Step 3) As of 21:19, 9/27/13, the repository size was 59MB.

(Step 4) Check.
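
For reference, the first part of the procedure linked in Step 2 is finding 
which objects in the pack are largest. A minimal Python sketch of that lookup 
is below; the function names are hypothetical, and it assumes the usual 
`git verify-pack -v` layout in which the third column is the object size.

```python
import subprocess

OBJECT_TYPES = {"commit", "tree", "blob", "tag"}


def largest_objects(verify_pack_output, count=3):
    """Return (size_bytes, sha) pairs for the `count` largest packed objects."""
    entries = []
    for line in verify_pack_output.splitlines():
        fields = line.split()
        # Object lines: <sha> <type> <size> <size-in-pack> <offset> [...]
        # Summary lines such as "non delta: ..." fail the type check below.
        if len(fields) >= 5 and fields[1] in OBJECT_TYPES:
            entries.append((int(fields[2]), fields[0]))
    entries.sort(reverse=True)
    return entries[:count]


def verify_pack(idx_path):
    """Run `git verify-pack -v` on one pack index file and return its output."""
    result = subprocess.run(
        ["git", "verify-pack", "-v", idx_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

The hashes returned can then be matched against the output of 
`git rev-list --objects --all` to recover the file names.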

If we want the history back, I think Lan can do a force push.

Xuan

Original comment by xuanjz...@gmail.com on 28 Sep 2013 at 2:57

GoogleCodeExporter commented 9 years ago
Thanks, Xuan.

Original comment by hickern...@iit.edu on 30 Oct 2013 at 9:21

sctchoi commented 9 years ago

This was done.