VerbalExpressions / FSharpVerbalExpressions

Composable F# regular expressions made easy
http://verbalexpressions.github.io/FSharpVerbalExpressions/
The Unlicense

Repo size is massive (82 MB); needs object pruning #7

Open jeroldhaas opened 7 years ago

jeroldhaas commented 7 years ago
$ git clone https://github.com/VerbalExpressions/FSharpVerbalExpressions.git
Cloning into 'FSharpVerbalExpressions'...
remote: Counting objects: 2498, done.
remote: Total 2498 (delta 0), reused 0 (delta 0), pack-reused 2498
Receiving objects: 100% (2498/2498), 82.36 MiB | 108.00 KiB/s, done.
Resolving deltas: 100% (1124/1124), done.
Checking out files: 100% (68/68), done.

Suspect files are in .git/objects/pack.
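
To see which objects are inflating the pack, one option is to list the largest blobs reachable from any ref. This is only a sketch and was not run in this thread: `largest_blobs` is a hypothetical helper name, and it assumes a POSIX shell with git, awk, sort and head available.

```shell
# Hypothetical helper: list the N largest blobs reachable from any ref.
# Output columns: size in bytes, object id, path (largest first).
largest_blobs() {
  count="${1:-10}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
    awk '$1 == "blob" { print substr($0, 6) }' |  # keep blobs, drop the "blob " prefix
    sort -k1,1 -n -r |
    head -n "$count"
}
```

Run inside the clone, for example as `largest_blobs 20`. The paths in the last column should show whether build output such as /bin files was committed at some point.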

Running the script that removes unreferenced objects reduced file size to 3.0M, as seen below.

NOTE: see caveats in link above if these objects are desired to be kept in the repo's history (I'm assuming /bin files weren't ignored?)

$ ./git-gc-all-ferocious
Counting objects: 345, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (145/145), done.
Writing objects: 100% (345/345), done.
Total 345 (delta 175), reused 337 (delta 173)
$ du -h
320K    ./.git/hooks
64K     ./.git/info
0       ./.git/refs/heads
0       ./.git/refs/tags
0       ./.git/refs
192K    ./.git/objects/pack
32K     ./.git/objects/info
224K    ./.git/objects
768K    ./.git
96K     ./.paket
256K    ./docs/content
64K     ./docs/files/img
64K     ./docs/files
32K     ./docs/tools/templates
64K     ./docs/tools
384K    ./docs
32K     ./lib
320K    ./src/FsVerbalExpressions
320K    ./src
256K    ./tests/Email.Tests
96K     ./tests/FsRegEx.Tests
576K    ./tests/FsVerbalExpressions.Tests
928K    ./tests
3.0M    .

jeroldhaas commented 7 years ago

Also, if you're okay with this, I've already run the script on my local repo. Once I'm acquainted well enough with the library to work on the test items for #1, the changes can likely be submitted as a PR.

jeroldhaas commented 7 years ago

Also also, there may be better methods for reducing the packed objects than this one. I noticed there were a number of solutions at that link, and another one also looked good: "git-repack -a followed by git-prune-packed", for example. I tried the topmost.
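
For reference, the "git-repack -a followed by git-prune-packed" alternative could be sketched roughly as below. The function name `repack_and_report` is hypothetical, and the `-d` flag is an addition not in the quoted recipe: without it the old packs stay on disk next to the new one. Note that this only drops objects no ref can reach; anything still referenced by a branch stays in the pack.

```shell
# Hypothetical helper: repack into a single pack and print the packed size
# before and after.
repack_and_report() {
  echo "before: $(git count-objects -vH | grep '^size-pack')"
  # -a: put all reachable objects into one pack
  # -d: delete packs made redundant by the new one (this also prunes packed loose objects)
  git repack -a -d -q
  # Kept to mirror the quoted recipe; effectively a no-op after `repack -d`.
  git prune-packed -q
  echo "after:  $(git count-objects -vH | grep '^size-pack')"
}
```

Calling `repack_and_report` inside a clone prints the two `size-pack` lines, which makes it easy to tell whether the repack helped at all.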

jackfoxy commented 7 years ago

@jeroldhaas I foolishly had not marked my own project for watching, so I just now saw this. My apologies and thanks for contributing this issue.

Going forward I have deprecated this project in favor of https://jackfoxy.github.io/FsRegEx/index.html

I've already added a message about that to the nuget description of this project, and perhaps I should add an issue to that effect here.

I ran the script you suggested locally on master, and the messages appear to indicate it did compress the repo, although the disk space reported by the file system remains the same.

git status reports "nothing to commit, working directory clean". I suppose I would have to do a git push --force to get this up to the GitHub repo?

jeroldhaas commented 6 years ago

I found this (more comprehensive) article on compressing/culling repo object information: http://stevelorek.com/how-to-shrink-a-git-repository.html .

HTH

edit: will make note to try FsRegEx when time permits

taralx commented 5 years ago

I stumbled on this, and it amused me enough to comment. :smile:

You can clone with --single-branch or --depth to avoid it, but the big objects are in the history of the gh-pages branch at e656761946d7d04380f95a6de286925278335133.
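
The two clone options mentioned above could look roughly like this. The function names are hypothetical, and `master` is assumed to be the branch of interest because it is the one named earlier in the thread.

```shell
# Clone a single branch, so history reachable only from other branches
# (here, gh-pages) is never downloaded.
clone_one_branch() {   # usage: clone_one_branch <url> <branch> <dir>
  git clone --quiet --single-branch --branch "$2" "$1" "$3"
}

# Clone only the most recent commit of the default branch.
# --depth implies --single-branch unless --no-single-branch is given.
clone_shallow() {      # usage: clone_shallow <url> <dir>
  git clone --quiet --depth 1 "$1" "$2"
}

# Example with the URL from this issue:
#   clone_one_branch https://github.com/VerbalExpressions/FSharpVerbalExpressions.git master fsve
```

Either variant is a workaround on the consumer side: the large objects stay in the repository and any plain `git clone` still fetches them.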

If you care, you can remove it via force push:

git checkout gh-pages
git replace --graft 10bc7bff9c54a55d928f780ef54a4261d16ceaad
git filter-branch
git push --force-with-lease origin gh-pages:gh-pages
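
After a rewrite like the one above, the local clone usually does not get smaller right away, because the old commits are still referenced from `refs/original/` (the backup written by `git filter-branch`), from the replace ref, and from the reflog. A minimal cleanup sketch, following the shrinking checklist in the `git filter-branch` documentation; the function name `drop_rewrite_leftovers` is hypothetical, and deleting everything under `refs/replace/` assumes no other replacements are in use:

```shell
# Hypothetical helper: remove the references that keep pre-rewrite history
# alive, then garbage-collect. Run only once the rewritten branch looks right,
# since the old history cannot be recovered locally afterwards.
drop_rewrite_leftovers() {
  # Delete filter-branch backups and the graft's replace ref.
  git for-each-ref --format='%(refname)' refs/original/ refs/replace/ |
    while read -r ref; do
      git update-ref -d "$ref"
    done
  # Expire all reflog entries so they no longer reference old commits.
  git reflog expire --expire=now --all
  # Repack and prune everything that is now unreachable.
  git gc --quiet --prune=now
}
```

This only affects the local clone. Other clones keep the old objects until they are re-cloned or cleaned up the same way.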