DavidBrainard / RenderToolbox3

Matlab toolbox for managing graphics rendering for psychophysics
MIT License

How can we make the RenderToolbox3 repository smaller? #21

benjamin-heasly closed this issue 11 years ago

benjamin-heasly commented 11 years ago

Including example scene outputs, the RenderToolbox3 repository is quite large.

As of 9542e9ef280025a6b296f127877531f61f006cd6, my RenderToolbox3/.git folder is 2.4GB! This seems excessive.

Most of the space is taken up in ExampleScenes/, which contains lots of multi-spectral output data. These are useful regression checks, but there's no avoiding the fact that they're quite large.
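One way to check where the space goes is to total blob sizes per top-level folder across the whole history. The following is a minimal sketch, not something taken from this thread: the function name is made up for illustration, and it assumes the text fed to it comes from `git rev-list --objects --all` piped into `git cat-file --batch-check` with the format string shown in the docstring. The sizes are uncompressed object sizes, so they overstate what the packed .git folder holds on disk, but they do show which folders dominate.

```python
from collections import defaultdict


def blob_bytes_by_top_dir(batch_check_output):
    """Sum uncompressed blob sizes per top-level directory.

    Expects lines of the form '<type> <sha> <size> <path>', as printed by:
      git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'
    """
    totals = defaultdict(int)
    for line in batch_check_output.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue  # skip commits, trees, and malformed lines
        _, _, size, path = parts
        # Files at the repository root are grouped under "."
        top = path.split("/", 1)[0] if "/" in path else "."
        totals[top] += int(size)
    return dict(totals)
```

Sorting the returned dictionary by value, largest first, gives a quick ranking of folders by how much data they have contributed over all revisions.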

We could find another place for these.

But how would we shrink the repository itself, now that all these data files are part of the revision history?

benjamin-heasly commented 11 years ago

This Stack Overflow discussion has some options: http://stackoverflow.com/questions/250238/collapsing-a-git-repositorys-history

benjamin-heasly commented 11 years ago

Luke Palmer's blog also has some options: http://lukepalmer.wordpress.com/2009/01/23/how-to-shrink-a-git-repository/

benjamin-heasly commented 11 years ago

Also Steve Lorek's blog has an option: http://stevelorek.com/how-to-shrink-a-git-repository.html

benjamin-heasly commented 11 years ago

It seems like there are two ways to go: git-rebase or git-filter-branch.

I think the main difference is that rebase is about the commit structure, whereas filter-branch can filter arbitrary files. For this issue, the commits are well contained, and the files are clearly labeled, so either way should be fine.

Both ways mess with the repository history, so users will have to re-clone from GitHub following the operation.

I'm not sure which way to go. I suppose we could try both, locally.

benjamin-heasly commented 11 years ago

I tried the git-filter-branch approach, following Steve Lorek's post (http://stevelorek.com/how-to-shrink-a-git-repository.html). The commands succeeded and the data files were removed from HEAD, but the repository shrank by only a few hundred megabytes. Not enough.

I also tried cloning the repository after re-writing its history, as suggested on the git-filter-branch man page (http://gitmanual.org/git-filter-branch.html). This had no effect.

I don't understand what I missed in this process. I'm going to try the git-rebase approach.
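For reference, the git-filter-branch documentation includes a checklist for shrinking a repository: rewrite every ref (not just the current branch), delete the backup refs that filter-branch leaves under refs/original/, expire the reflogs, and only then garbage collect. It also notes that a clone made from a plain local path hard-links the old objects, so a `file://` URL is needed for a clone to come out smaller. Whether any of these explains the result above is not established in this thread. The following is a minimal sketch of that checklist, with hypothetical function names and an illustrative path argument:

```python
import shlex
import subprocess


def filter_branch_args(path_to_drop):
    """Build git arguments that drop one path from every commit on every ref."""
    # The index filter is evaluated by a shell, so the path is quoted.
    index_filter = "git rm -r --cached --ignore-unmatch " + shlex.quote(path_to_drop)
    return [
        "filter-branch",
        "--index-filter", index_filter,
        "--prune-empty",
        "--tag-name-filter", "cat",  # rewrite tags so they stop pinning old commits
        "--", "--all",               # all branches and tags, not only HEAD
    ]


def shrink(repo_dir, path_to_drop):
    """Rewrite history, then remove everything that still references old objects."""

    def git(*args):
        done = subprocess.run(["git", *args], cwd=repo_dir, check=True,
                              capture_output=True, text=True)
        return done.stdout

    git(*filter_branch_args(path_to_drop))
    # filter-branch keeps backups of the old refs under refs/original/
    for ref in git("for-each-ref", "--format=%(refname)", "refs/original/").split():
        git("update-ref", "-d", ref)
    git("reflog", "expire", "--expire=now", "--all")
    git("gc", "--prune=now")
```

A call such as `shrink("/path/to/clone", "ExampleScenes")` is only an illustration and should be tried on a throwaway copy, since it rewrites history irreversibly.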

benjamin-heasly commented 11 years ago

I tried the git-rebase approach, but I was unable to remove the large commits without causing merge errors.

I re-tried the git-filter-branch approach with a list of all "Output" files that had ever been part of any revision. This took 12 hours and still did not shrink the repository (even with cloning or garbage collecting).
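The thread does not say how the list of "Output" files was gathered. One possible way, sketched here under that caveat, is to ask git for every path ever added on any ref and keep the ones whose name contains the marker. The function name and the `marker` parameter are hypothetical, and the input is assumed to be the text printed by `git log --all --diff-filter=A --name-only --pretty=format:`.

```python
def paths_ever_added(name_only_log, marker="Output"):
    """Return sorted unique paths containing `marker`.

    Expects the text printed by:
      git log --all --diff-filter=A --name-only --pretty=format:
    which is one path per line, with blank lines between commits.
    """
    return sorted({
        line.strip()
        for line in name_only_log.splitlines()
        if marker in line
    })
```

Paths with unusual characters may appear quoted in git's output, so a list built this way is worth checking by eye before passing it to a history rewrite.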

So I made a new repository that contains only a few key commits:

This loses a lot of history, which is sad. But the repository is no longer bloated with huge output files.