rtyley / bfg-repo-cleaner

Removes large or troublesome blobs like git-filter-branch does, but faster. And written in Scala
https://rtyley.github.io/bfg-repo-cleaner/
GNU General Public License v3.0

After running the bfg my repo is larger #309

Open woss opened 5 years ago

woss commented 5 years ago

I had a few files bigger than 70MB that I wanted to remove, so I ran bfg --strip-blobs-bigger-than 10M myrepo. I then ran into a massive number of problems, such as merge conflicts and over 300 pushes/pulls per branch. After sorting that out, I saw the repo grow by 10%ish, from 145MB to 190MB.

Any idea what happened and how to fix this?

javabrett commented 5 years ago

Please post all commands you ran, starting from when you cloned the remote repo.

woss commented 5 years ago

I installed it via scoop (it's Homebrew for Windows). BFG definition on scoop repo

cd myrepo
scoop install bfg
bfg --strip-blobs-bigger-than 10M .

javabrett commented 5 years ago

This doesn't show how you cloned your repo (or whether it is local only), whether you ran git gc afterwards, or any follow-up commands.

andrevenancio commented 4 years ago

I have the same issue.

I've followed the instructions on your homepage

1) git clone --mirror git@bitbucket.org:COMPANY/REPONAME.git

2) java -jar bfg-1.13.0.jar --strip-blobs-bigger-than 100M REPONAME.git

3) cd REPONAME.git

4) git reflog expire --expire=now --all && git gc --prune=now --aggressive

5) git push

After this step my repo went from 1.35GB to 2.27GB on Bitbucket.

In my original local folder (not the mirror copy) I now have around 4k commits to push and to pull.

Not sure what to do now.

How can we reverse this? https://stackoverflow.com/questions/58412173/bitbucket-reduce-repo-size-with-bfg

aakoch commented 4 years ago

@woss Are you sure you cloned the repo the correct way?

@andrevenancio AFAIK, Bitbucket only reports an approximate size. What is the size of the REPONAME.git directory before and after?
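To compare the local size before and after, something like the following could be used. This is a minimal sketch: repo_size_kib is a hypothetical helper name, and it relies only on git count-objects -v, which reports loose objects ("size") and packfiles ("size-pack") in KiB.

```shell
#!/bin/sh
# Hypothetical helper: print the size of a repository's object store in KiB.
# Sums the "size" (loose objects) and "size-pack" (packfiles) fields
# reported by `git count-objects -v`. Works for bare/mirror clones too.
repo_size_kib() {
    git -C "$1" count-objects -v |
        awk '$1 == "size:" || $1 == "size-pack:" { total += $2 } END { print total + 0 }'
}
```

Running repo_size_kib REPONAME.git once right after the mirror clone and again after the git gc step gives two numbers that do not depend on what the hosting service reports.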

I'd run a script before and after to find large files, such as:

git rev-list --objects --all \
   | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
   | sed -n 's/^blob //p' \
   | sort --numeric-sort --key=2 \
   | cut -c 1-12,41- \
   | $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest | tail

I'm not a Git expert but I don't think you can reverse it. The website says you should create a backup of your repo. Did you?

scottmudge commented 2 years ago

Encountered this same problem, and now I'm locked out of the bitbucket repo. Went from 2.9 GB to 5.96 GB.

Followed the instructions verbatim from the BFG website.

woss commented 2 years ago

It seems that it does increase the size.

Also @aakoch I apologize for not answering. I realized that even though I followed the steps, my repo always became bigger, so I decided to ditch this solution and not care for the time being. It has been almost 2 years and I cannot recall which repo it was.

scottmudge commented 2 years ago

Commenting to add that after contacting BitBucket, they manually ran garbage collection and the size went back down. I think it is a bug on BitBucket's side, as they use a custom GC process that is only manually triggered by their staff.

woss commented 2 years ago

@scottmudge when you say garbage collection do you mean git gc?

scottmudge commented 2 years ago

No, server-side BitBucket has their own proprietary gc process, separate from git gc. Which is why users cannot trigger it -- they need to open a support ticket and have BB staff do it.

Pretty annoying and, in my opinion, completely unnecessary. But obviously they made that choice for a reason.
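Until the server-side cleanup runs, one way to estimate what the repo should weigh is to measure a fresh mirror clone, since a clone over the network normally transfers only objects reachable from refs. The sketch below assumes that behaviour; fresh_clone_size_kib is a hypothetical helper name, and the number it prints is the size of the local clone directory, not whatever the hosting service displays.

```shell
#!/bin/sh
# Hypothetical helper: mirror-clone a remote into a temporary directory,
# print the clone's on-disk size in KiB, then delete the temporary copy.
fresh_clone_size_kib() {
    tmp=$(mktemp -d) || return 1
    if git clone --quiet --mirror "$1" "$tmp/check.git"; then
        du -sk "$tmp/check.git" | cut -f1
        status=0
    else
        status=1
    fi
    rm -rf "$tmp"
    return "$status"
}
```

For example, fresh_clone_size_kib git@bitbucket.org:COMPANY/REPONAME.git. If that figure is much smaller than the size shown by the host, the difference is probably old objects still waiting for server-side garbage collection.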

raffian commented 1 year ago

Had the same issue with GitLab 15.0.5 (on-prem); just use Housekeeping to clean up remotes.

Repo size: 249Mb

  1. git clone --mirror <myrepourl>
  2. java -jar bfg-1.14.0.jar --strip-blobs-bigger-than 1M myrepo.git
  3. cd myrepo.git
  4. git reflog expire --expire=now --all && git gc --prune=now --aggressive
  5. git push -f

Repo size: 349Mb (OH NO!)

Gitlab -> Settings -> Advanced -> Housekeeping: Run Housekeeping

A few moments later...

Repo size: 161Mb 😀
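The same housekeeping run can probably be scripted instead of clicked. The sketch below assumes the GitLab REST API endpoint POST /projects/:id/housekeeping is available on your instance and that GITLAB_TOKEN holds a personal access token with sufficient rights; the host, project ID, and both function names are placeholders, so check your GitLab version's API docs before relying on it.

```shell
#!/bin/sh
# Build the housekeeping endpoint URL for a GitLab base URL and project ID.
# Assumption: the instance exposes the v4 REST API under /api/v4.
housekeeping_url() {
    printf '%s/api/v4/projects/%s/housekeeping\n' "$1" "$2"
}

# Trigger housekeeping for the given project.
# Assumption: GITLAB_TOKEN is set in the environment.
run_housekeeping() {
    curl --fail --silent --show-error --request POST \
        --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
        "$(housekeeping_url "$1" "$2")"
}
```

Usage would look like run_housekeeping https://gitlab.example.com 42, with both arguments replaced by your own values.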