ibm-js / delite

HTML Custom Element / Widget infrastructure
http://ibm-js.github.io/delite/

cleanup git history #321

Closed wkeese closed 9 years ago

wkeese commented 9 years ago

Task and record of cleanup of unwanted files from delite history. Deliteful can do something similar.

I looked at http://rtyley.github.io/bfg-repo-cleaner, but it matches only file names, not paths, so it didn't seem appropriate. Instead I'm following http://git-scm.com/docs/git-filter-branch.

(1) Get clean copy of repository:

$ git clone git@github.com:ibm-js/delite.git
$ cd delite

(2) Also fetch my branches, in particular jQuery, and check each one out to create a local branch:

$ git remote add bill /ws/delite
$ git fetch bill
$ for branch in $(cd /ws/delite; git branch |sed 's/\*//')
> do
> git checkout $branch
> done

(3) Then clear out most of the unwanted directories (from history):

$ time git filter-branch -f --tree-filter "git rm -rf --ignore-unmatch -q Button Rule _cometd _editor _sql _tree analytics atom av bench bidi calc charting collections color cometd crypto css3 data date demos dijit/tests/_BidiSupport dijit/tests/i18n dijit/tests/tree/treeTestRoot dijit/themes/a11y dijit/themes/claro dijit/tree dnd drawing dtl editor editorPlugins embed encoding flash form/manager form/nls form/resources form/templates form/uploader fx gantt gauges geo gesture gfx gfx3d grid help highlight html icons image io json jsonPath lang layout math mdnd mobile/app mobile/bidi mobile/build mobile/dh mobile/tests mobile/themes mvc off presentation rails resources robot rpc secure sketch socket sql storage store string templates/buttons testing tests/_BidiSupport tests/_editor tests/data tests/editor tests/gfx tests/gvt tests/i18n tests/images tests/layout tests/tree tests/wire themes/a11y themes/blackie themes/claro themes/lucid themes/nihilo themes/noir themes/soria themes/tundra timing tree treemap uuid validate widget/nls wire wireml xml xmpp" --prune-empty --tag-name-filter cat -- --all $(cd /ws/delite; git branch |sed 's/\*//')
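The git-filter-branch documentation notes that --index-filter is much faster than --tree-filter, because it rewrites the index without checking out each commit's tree. As a sketch only (this is not what was run for this cleanup), the same kind of removal could be written against the index. The helper name `purge_paths` is hypothetical, and the sketch assumes path names without spaces or shell metacharacters.

```shell
# Hypothetical helper: remove the given paths from every commit on all refs.
# Assumes path names contain no whitespace or shell metacharacters.
purge_paths() {
    # --cached touches only the index, so no tree is checked out per commit;
    # --ignore-unmatch keeps commits that lack a path from failing the filter.
    git filter-branch -f \
        --index-filter "git rm -r --cached --ignore-unmatch -q $*" \
        --prune-empty --tag-name-filter cat -- --all
}

# Example call with a few of the directories from step 3:
#   purge_paths Button Rule _cometd
```

The remaining options (--prune-empty, --tag-name-filter cat, -- --all) are the same ones used in step 3.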

(4) Get the list of files that still appear in the history but no longer exist in the current tree:

find . -type f |fgrep -v .git |sort > currentFiles

rm -f historicalFiles
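for hash in $(git rev-list master)
do
    echo "$hash"
    git checkout -q "$hash"
    find . -type f |fgrep -v .git >> historicalFiles
done
# the loop leaves HEAD detached at the oldest commit, so return to master
git checkout -q master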

sort -u historicalFiles > uniqFiles

diff --new-line-format="" --unchanged-line-format=""  uniqFiles currentFiles > oldFiles

(takes 20 minutes)
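Most of that time goes into checking out every commit. As a hedged alternative (not the commands used for this cleanup), git ls-tree can list the files of each commit without touching the working tree, and comm can compute the set difference without relying on the GNU-specific diff options. The function names `list_history_files` and `only_in_first` are hypothetical. Note that git prints paths relative to the repository root, without the leading "./" that find produces.

```shell
# Hypothetical helpers; not the commands used for this cleanup.

# List every path that ever existed on a branch, without checking anything out.
# Paths are printed relative to the repository root (no leading "./").
list_history_files() {
    git rev-list "$1" | while read -r hash
    do
        git ls-tree -r --name-only "$hash"
    done | LC_ALL=C sort -u
}

# Print the lines found in file $1 but not in file $2.
# comm requires both inputs to be sorted in the same collation order.
only_in_first() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$tmp_a"
    LC_ALL=C sort -u "$2" > "$tmp_b"
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# Example:
#   list_history_files master > uniqFiles
#   git ls-tree -r --name-only master > currentFiles
#   only_in_first uniqFiles currentFiles > oldFiles
```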

(5) Edit oldFiles to remove names of files we want to keep. You can skip this step if you don't care about the history of files before they were moved to their current location.

(6) Remove remaining files from git history:

$ time git filter-branch -f --tree-filter "git rm -rf --ignore-unmatch -q $(tr '\n' ' ' < oldFiles)" --prune-empty --tag-name-filter cat -- --all $(cd /ws/delite; git branch |sed 's/\*//')

(6.5) You can check the effect by running git rev-list master |wc -l or git rev-list 0.3.0 |wc -l. The count should have dropped from about 15,000 commits to about 2,000.
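The same check can be wrapped in a couple of small functions. This is a minimal sketch with hypothetical function names: git rev-list --count prints the number of commits directly, and git log limited to a path shows whether a removed path is still reachable from a given ref.

```shell
# Hypothetical helpers for checking the result of the rewrite.

# Number of commits reachable from a ref (branch, tag, or hash).
count_commits() {
    git rev-list --count "$1"
}

# Succeeds if any commit reachable from ref $1 still touches path $2.
path_in_history() {
    [ -n "$(git log --oneline "$1" -- "$2")" ]
}

# Example:
#   count_commits master
#   path_in_history master dijit/themes/claro && echo "still present"
```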

(6.6) Optionally, make a clean copy of the repository to shed unused objects. The path in the git clone command is the path to the repository cloned in step 1:

$ cd ..; mkdir clean2; cd clean2
$ git clone file:///ws/clean/delite

(7) Push to ibm-js:

$ git push --force
$ git push --tags --force

cjolif commented 9 years ago

cc: @pruzand (for deliteful, when you have time)

wkeese commented 9 years ago

OK, I pushed the rewritten history.