git-lfs / git-lfs

Git extension for versioning large files
https://git-lfs.com

LFS Migration not tracking everything I set up #3606

Open juniordiscart opened 5 years ago

juniordiscart commented 5 years ago

Hi

We used to host our fairly large Unity project on Unfuddle, but commits became very slow as the repo grew, because Unfuddle doesn't support LFS. I migrated our project to BitBucket, which does support LFS, and this went well initially: I took a snapshot of the project, set up Git from scratch, and tracked the files using Git LFS. About 50GB was stored in LFS and the repo size was about 70MB. Nice!

Somewhere down the line, though, the repo size has grown back to 1.88GB, and we are nearing the 2GB repo size limit, after which the repo goes into read-only mode. I can't figure out which files are causing the repo to grow. If I run git lfs migrate info --everything I get the following output:

migrate: Sorting commits: ..., done                                                                                                                                                                                                    
migrate: Examining commits: 100% (49/49), done                                                                                                                                                                                         
*.xml   42 MB       758/758 files(s)    100%
*.cs    19 MB     2666/2667 files(s)    100%
*.meta  17 MB   20630/20630 files(s)    100%
*.mat   6.5 MB    2563/2563 files(s)    100%
*.txt   4.2 MB        70/70 files(s)    100%

Adding these up gives about the size I expect, but if I run git count-objects -vH I get:

count: 0
size: 0 bytes
in-pack: 42655
packs: 1
size-pack: 1.32 GiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes
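
As a quick sanity check on the two reports above, the following sketch adds up the sizes listed by git lfs migrate info and compares the total with the size-pack value from git count-objects. It is only an illustration: parse_migrate_info is a hypothetical helper, not part of Git LFS, and it assumes every row is reported in MB, as in the output shown. The listing shows only five extensions, so it may not be a complete inventory of the repository.

```python
import re

# Output of `git lfs migrate info --everything`, copied from the post above.
MIGRATE_INFO = """\
*.xml   42 MB       758/758 files(s)    100%
*.cs    19 MB     2666/2667 files(s)    100%
*.meta  17 MB   20630/20630 files(s)    100%
*.mat   6.5 MB    2563/2563 files(s)    100%
*.txt   4.2 MB        70/70 files(s)    100%
"""

# One row: pattern, size in MB, matched/total files, percentage.
ROW = re.compile(r"^(\S+)\s+([\d.]+)\s+MB\s+(\d+)/(\d+)\s+files\(s\)\s+(\d+)%$")


def parse_migrate_info(text):
    """Return {pattern: size_in_MB} for each row (hypothetical helper)."""
    sizes = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            sizes[match.group(1)] = float(match.group(2))
    return sizes


sizes = parse_migrate_info(MIGRATE_INFO)
total_mb = sum(sizes.values())

# size-pack from `git count-objects -vH`, converted from GiB to MB.
# Assumption: 1 GiB = 1024**3 bytes and 1 MB = 10**6 bytes.
pack_mb = 1.32 * 1024**3 / 10**6

print(f"listed by migrate info: {total_mb:.1f} MB")
print(f"pack on disk:           {pack_mb:.0f} MB")
print(f"not accounted for:      {pack_mb - total_mb:.0f} MB")
```

The listed file types add up to a little under 90 MB, more than ten times smaller than the pack, so most of the pack has to consist of objects that do not appear in this listing.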

I imagine BitBucket still has some garbage collection to do before its reported size drops to 1.32GB. If I run a script to find the largest files in the pack (found here: https://stackoverflow.com/questions/10622179/how-to-find-identify-large-commits-in-git-history) I get this output:

All sizes are in kB's. The pack column is the size of the object, compressed, inside the pack file.
size    pack    SHA                                       location
318990  132148  f65ff8f593b5c8b028a46acb4541b41491e10129  Liftoff/Assets/Scenes/Environments/AutumnFields_Night/AutumnFields_Night.unity
233898  102099  fcd342c513ca7b2ef040d4b0416a00ad1c4b162b  Liftoff/Assets/Scenes/Environments/StrawBale_Night/StrawBale_Night.unity
230955  76027   17034a88044f315c190ca71a31010d6b647ec0c9  Liftoff/Assets/Scenes/Environments/PineValley/PineValley.unity
210622  176107  327f5a731956971c05fa673843dd99eff4b07d57  Liftoff/Assets/Scenes/Environments/Hall26_Night/Hall26_Night/LightingData.asset
193198  62087   135f5c0f6224cc78efa513a8c1154eecaceef52d  Liftoff/Assets/Scenes/Environments/DubaiLegends/DubaiLegends.unity
192828  94255   4f926d047373ff29c3501f5b3e2eb28d084adc45  Liftoff/Assets/Scenes/Environments/TheGreen_Night/TheGreen_Night.unity
190244  93765   0b0b7ac4c1dcf256ee67787d1286f357180795fb  Liftoff/Assets/Scenes/Environments/BardwellsYard_Night/BardwellsYard_Night.unity
165631  130822  6d3fb2d44c70795a156370f1ba9e572e19253bb7  Liftoff/Assets/Scenes/Environments/DubaiLegends/DubaiLegends/LightingData.asset
161133  137683  c5d4aa4c8f02225544f2cbbc8fb079d332c4fdbf  Liftoff/Assets/Scenes/Environments/MinusTwo_Night/MinusTwo_Night/LightingData.asset
116358  49797   4b4bdc743b12129c88c4af0ebcdc676bbcf45dbf  Liftoff/Assets/Scenes/Environments/Hannover_Night/Hannover_Night.unity

If I check the size of this largest file, I know it isn't the largest file in the project. The largest file in the project is being tracked by LFS (it's a .tiff file over 500MB).
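
To put the listing above in proportion, the sketch below sums the "pack" column of the ten rows and relates it to the size-pack value reported by git count-objects. The numbers are copied from the post. The unit conversion is an assumption: it treats the script's "kB" as 1024 bytes.

```python
# "pack" column (compressed size in kB) of the ten rows listed above.
PACK_KB = [132148, 102099, 76027, 176107, 62087,
           94255, 93765, 130822, 137683, 49797]

# size-pack from `git count-objects -vH` (1.32 GiB) in the same unit.
# Assumption: the script's "kB" means 1024 bytes.
PACK_TOTAL_KB = 1.32 * 1024 * 1024

top10_kb = sum(PACK_KB)
share = top10_kb / PACK_TOTAL_KB

print(f"ten largest blobs: {top10_kb} kB ({top10_kb / 1024**2:.2f} GiB)")
print(f"share of the pack: {share:.0%}")
```

Under that assumption, these ten blobs alone take up roughly three quarters of the pack, which points at a handful of .unity and LightingData.asset blobs rather than at a large number of small files.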

My .gitattributes is as follows (as you can see, .asset and .unity are tracked):

*.sh eol=lf

## Unity ##
*.cs diff=csharp text
*.cginc text
*.shader text
*.mat merge=unityyamlmerge eol=lf
*.anim merge=unityyamlmerge eol=lf
*.physicsMaterial2D merge=unityyamlmerge eol=lf
*.physicsMaterial merge=unityyamlmerge eol=lf
*.meta merge=unityyamlmerge eol=lf
*.controller merge=unityyamlmerge eol=lf
*.unity filter=lfs diff=lfs merge=lfs eol=lf -text
*.prefab filter=lfs diff=lfs merge=lfs eol=lf -text
*.asset filter=lfs diff=lfs merge=lfs -text
## git-lfs ##
# 3D models
*.3dm filter=lfs diff=lfs merge=lfs -text
*.3ds filter=lfs diff=lfs merge=lfs -text
*.blend filter=lfs diff=lfs merge=lfs -text
*.c4d filter=lfs diff=lfs merge=lfs -text
*.collada filter=lfs diff=lfs merge=lfs -text
*.dae filter=lfs diff=lfs merge=lfs -text
*.dxf filter=lfs diff=lfs merge=lfs -text
*.fbx filter=lfs diff=lfs merge=lfs -text
*.FBX filter=lfs diff=lfs merge=lfs -text
*.jas filter=lfs diff=lfs merge=lfs -text
*.lws filter=lfs diff=lfs merge=lfs -text
*.lxo filter=lfs diff=lfs merge=lfs -text
*.ma filter=lfs diff=lfs merge=lfs -text
*.max filter=lfs diff=lfs merge=lfs -text
*.mb filter=lfs diff=lfs merge=lfs -text
*.obj filter=lfs diff=lfs merge=lfs -text
*.ply filter=lfs diff=lfs merge=lfs -text
*.skp filter=lfs diff=lfs merge=lfs -text
*.stl filter=lfs diff=lfs merge=lfs -text
*.ztl filter=lfs diff=lfs merge=lfs -text
# Audio
*.aif filter=lfs diff=lfs merge=lfs -text
*.aiff filter=lfs diff=lfs merge=lfs -text
*.it filter=lfs diff=lfs merge=lfs -text
*.mod filter=lfs diff=lfs merge=lfs -text
*.mp3 filter=lfs diff=lfs merge=lfs -text
*.ogg filter=lfs diff=lfs merge=lfs -text
*.s3m filter=lfs diff=lfs merge=lfs -text
*.wav filter=lfs diff=lfs merge=lfs -text
*.xm filter=lfs diff=lfs merge=lfs -text
# Fonts
*.otf filter=lfs diff=lfs merge=lfs -text
*.ttf filter=lfs diff=lfs merge=lfs -text
# Images
*.bmp filter=lfs diff=lfs merge=lfs -text
*.exr filter=lfs diff=lfs merge=lfs -text
*.EXR filter=lfs diff=lfs merge=lfs -text
*.gif filter=lfs diff=lfs merge=lfs -text
*.hdr filter=lfs diff=lfs merge=lfs -text
*.iff filter=lfs diff=lfs merge=lfs -text
*.jpeg filter=lfs diff=lfs merge=lfs -text
*.JPEG filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.JPG filter=lfs diff=lfs merge=lfs -text
*.pict filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.PNG filter=lfs diff=lfs merge=lfs -text
*.psd filter=lfs diff=lfs merge=lfs -text
*.PSD filter=lfs diff=lfs merge=lfs -text
*.tga filter=lfs diff=lfs merge=lfs -text
*.TGA filter=lfs diff=lfs merge=lfs -text
*.tif filter=lfs diff=lfs merge=lfs -text
*.tiff filter=lfs diff=lfs merge=lfs -text
*.cubemap filter=lfs diff=lfs merge=lfs -text
# Etc...
*.a filter=lfs diff=lfs merge=lfs -text
*.pdf filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.dll filter=lfs diff=lfs merge=lfs -text
*.unitypackage filter=lfs diff=lfs merge=lfs -text
*.rns filter=lfs diff=lfs merge=lfs -text
*.reason filter=lfs diff=lfs merge=lfs -text
*.so filter=lfs diff=lfs merge=lfs -text
*.vwl filter=lfs diff=lfs merge=lfs -text
*.dylib filter=lfs diff=lfs merge=lfs -text
*.exe filter=lfs diff=lfs merge=lfs -text
*.spm filter=lfs diff=lfs merge=lfs -text
*.unity filter=lfs diff=lfs merge=lfs -text
*.prefab filter=lfs diff=lfs merge=lfs -text

You may notice that .unity and .prefab are present twice in the file. The last two lines were added after I ran the migrate import command.
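
A small sketch of how such a file could be checked for LFS-tracked patterns and duplicate entries. It works on a short excerpt of the .gitattributes above, and parse_attributes is a hypothetical helper that only splits lines on whitespace; it does not reproduce Git's full pattern matching. For a real path, git check-attr filter -- <path> reports what Git itself resolves.

```python
from collections import Counter

# Excerpt of the .gitattributes shown above (a few lines only, for brevity).
GITATTRIBUTES = """\
*.mat merge=unityyamlmerge eol=lf
*.unity filter=lfs diff=lfs merge=lfs eol=lf -text
*.prefab filter=lfs diff=lfs merge=lfs eol=lf -text
*.asset filter=lfs diff=lfs merge=lfs -text
*.tiff filter=lfs diff=lfs merge=lfs -text
*.unity filter=lfs diff=lfs merge=lfs -text
*.prefab filter=lfs diff=lfs merge=lfs -text
"""


def parse_attributes(text):
    """Yield (pattern, [attributes]) for each non-blank, non-comment line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pattern, *attrs = line.split()
        yield pattern, attrs


entries = list(parse_attributes(GITATTRIBUTES))

# Patterns routed through the LFS filter.
lfs_patterns = sorted({p for p, attrs in entries if "filter=lfs" in attrs})

# Patterns that occur on more than one line.
counts = Counter(p for p, _ in entries)
duplicates = sorted(p for p, n in counts.items() if n > 1)

print("LFS-tracked patterns:", lfs_patterns)
print("listed more than once:", duplicates)
```

When several lines match the same path, Git lets a later line override an earlier one attribute by attribute, so the duplicated *.unity and *.prefab lines set the same LFS attributes again and leave the earlier eol=lf in effect.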

I hope someone can help me determine what's wrong and how to get my project back on track. Git LFS version used: git-lfs/2.7.1 (GitHub; darwin amd64; go 1.12)

Thanks in advance!

bk2204 commented 5 years ago

Hey, thanks for writing in.

I think you've already identified the problem, which is that you have some very large Git blobs. I expect what happened is that somebody committed large blobs without using Git LFS, and as a consequence they got written into the repo as Git objects instead of LFS objects. This can happen if a user doesn't have Git LFS set up properly on their system when they commit (e.g., they haven't run git lfs install or an equivalent).

You can, of course, rewrite history with git lfs migrate import --everything --fixup, which will rewrite each commit to use the .gitattributes file in that commit to determine which files should be written as LFS objects. This will likely shrink your repository back down to a normal size, but of course it has the downside that all the Git object IDs will change.
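
To illustrate the difference between a blob stored as an LFS object and one written directly into the repo: with LFS, Git only stores a small text pointer (a version line, an oid sha256 line and a size line), while the real content lives in LFS storage. The sketch below is a heuristic check for that pointer format. looks_like_lfs_pointer is a hypothetical helper, and the sample data is made up for the example; blob contents for a real object could be obtained with git cat-file -p <sha>.

```python
import re

POINTER_VERSION = b"version https://git-lfs.github.com/spec/v1\n"
OID_LINE = re.compile(rb"^oid sha256:[0-9a-f]{64}$", re.MULTILINE)
SIZE_LINE = re.compile(rb"^size \d+$", re.MULTILINE)


def looks_like_lfs_pointer(blob: bytes) -> bool:
    """Heuristic check (hypothetical helper): is this blob an LFS pointer?"""
    return (
        len(blob) < 1024                       # pointer files are tiny
        and blob.startswith(POINTER_VERSION)   # version line comes first
        and OID_LINE.search(blob) is not None
        and SIZE_LINE.search(blob) is not None
    )


# A pointer with a placeholder hash (not a real object).
pointer = (
    POINTER_VERSION
    + b"oid sha256:" + b"0" * 64 + b"\n"
    + b"size 12345\n"
)

# Stand-in for a scene file committed without LFS: real content, not a pointer.
raw_scene = b"%YAML 1.1\n%TAG !u! tag:unity3d.com,2011:\n" + b"x" * 5000

print(looks_like_lfs_pointer(pointer))
print(looks_like_lfs_pointer(raw_scene))
```

A blob of several hundred megabytes at a path that .gitattributes marks as filter=lfs cannot be a pointer, so it was committed without the LFS filter being applied.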

juniordiscart commented 5 years ago

@bk2204 Thanks for your answer. I hadn't run the --everything --fixup combination before. However, I had tried variants of it, including listing all of the branch names with the --include-ref= option.

git lfs migrate import --everything --fixup
migrate: override changes in your working copy? [Y/n] 
migrate: override changes in your working copy? [Y/n] Y
migrate: changes in your working copy will be overridden ...
migrate: Sorting commits: ..., done                                             
migrate: Rewriting commits: 100% (50/50), done                                  
  branch1           3eaef984d212ad8922d76f0b7459b107e735e8c7 -> 3eaef984d212ad8922d76f0b7459b107e735e8c7
  branch2                   9cb2cddc56ad246143db150d8f7be4d9cf4cccc6 -> 9cb2cddc56ad246143db150d8f7be4d9cf4cccc6
  branch3                   fd102c23eb6e3583f209e57df231aa431f602621 -> fd102c23eb6e3583f209e57df231aa431f602621
  branch4   b8f7d0d1ae921df95888de90b5d1162da6fd73bf -> b8f7d0d1ae921df95888de90b5d1162da6fd73bf
  master                    de27c4f054b0dff8e1fa029cde207311ed4c84b6 -> de27c4f054b0dff8e1fa029cde207311ed4c84b6
migrate: Updating refs: ..., done                                               
migrate: checkout: ..., done

As you can see, the run didn't change anything in the repo (I assume the IDs on either side of the -> would differ if something had been rewritten).
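
The comparison can be made explicit with a short sketch that reads the ref lines from the output above and lists the refs whose commit ID changed. parse_ref_updates is a hypothetical helper written for this example. Because a commit ID is a hash over the commit's content, identical IDs on both sides mean the commit was left untouched.

```python
import re

# Ref lines from the `git lfs migrate import --everything --fixup` run above.
MIGRATE_OUTPUT = """\
  branch1           3eaef984d212ad8922d76f0b7459b107e735e8c7 -> 3eaef984d212ad8922d76f0b7459b107e735e8c7
  branch2                   9cb2cddc56ad246143db150d8f7be4d9cf4cccc6 -> 9cb2cddc56ad246143db150d8f7be4d9cf4cccc6
  branch3                   fd102c23eb6e3583f209e57df231aa431f602621 -> fd102c23eb6e3583f209e57df231aa431f602621
  branch4   b8f7d0d1ae921df95888de90b5d1162da6fd73bf -> b8f7d0d1ae921df95888de90b5d1162da6fd73bf
  master                    de27c4f054b0dff8e1fa029cde207311ed4c84b6 -> de27c4f054b0dff8e1fa029cde207311ed4c84b6
"""

REF_LINE = re.compile(r"^\s*(\S+)\s+([0-9a-f]{40})\s+->\s+([0-9a-f]{40})\s*$")


def parse_ref_updates(text):
    """Return a list of (ref, old_id, new_id) tuples (hypothetical helper)."""
    updates = []
    for line in text.splitlines():
        match = REF_LINE.match(line)
        if match:
            updates.append(match.groups())
    return updates


updates = parse_ref_updates(MIGRATE_OUTPUT)
changed = [ref for ref, old, new in updates if old != new]

print(f"{len(updates)} refs examined, {len(changed)} rewritten: {changed}")
```

For the run above, none of the five refs was rewritten.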

Do you perhaps know of a command to find which commits the blobs in a pack belong to?
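
One possible approach, sketched here under assumptions rather than as a confirmed answer from the thread: recent Git versions accept git log --all --find-object=<sha>, which lists commits that add or remove a given object, and git rev-list --objects --all prints every object reachable from any ref together with its path. The helper names below are hypothetical, and the sample listing is assembled from IDs and a path that appear earlier in this thread purely for illustration. Nothing is executed against a repository here; run_git is defined but not called.

```python
import subprocess


def find_object_command(blob_sha):
    """Build a `git log` call listing commits that add or remove the blob."""
    return ["git", "log", "--all", "--oneline", f"--find-object={blob_sha}"]


def reachable_paths(rev_list_output, blob_sha):
    """Return the paths under which blob_sha appears in the output of
    `git rev-list --objects --all` (lines of the form "<sha> <path>")."""
    paths = []
    for line in rev_list_output.splitlines():
        sha, _, path = line.partition(" ")
        if sha == blob_sha and path:
            paths.append(path)
    return paths


def run_git(cmd):
    """Run a git command in the current repository and return its stdout."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


# Largest blob from the listing earlier in the thread.
BLOB = "f65ff8f593b5c8b028a46acb4541b41491e10129"

# Illustrative sample of rev-list output: a commit line has no path,
# a blob line has one.
SAMPLE_LISTING = (
    "de27c4f054b0dff8e1fa029cde207311ed4c84b6\n"
    f"{BLOB} Liftoff/Assets/Scenes/Environments/AutumnFields_Night/AutumnFields_Night.unity\n"
)

print(" ".join(find_object_command(BLOB)))
print(reachable_paths(SAMPLE_LISTING, BLOB))
```

In a real repository, the listing would come from run_git(["git", "rev-list", "--objects", "--all"]). If a blob that is in the pack does not show up there at all, it is not reachable from any ref, which would suggest leftover objects that garbage collection has not removed yet.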