TeleMidia / ginga

A Ginga iTV middleware implementation by TeleMídia/PUC-Rio
http://ginga.org.br
GNU General Public License v2.0

Repository size is huge due to versioned video files #150

Closed: JPTIZ closed this issue 2 years ago

JPTIZ commented 5 years ago

Cloning took an hour here because of the roughly 100MB of video files in the tests-ncl/samples directory (for some reason git downloads them at a very low rate: these files were coming in at ~40KiB/s, while the other files went at the full ~3 or 4MiB/s).

Since these are large binary assets that are not essential for building Ginga, I recommend storing them externally (at a trusted, reliable URL, possibly in the PUC-Rio domain) and leaving instructions to download them separately for anyone who wants to run these tests.

alanlivio commented 5 years ago

Hello João Paulo. Thank you for your feedback.

Even after I store them externally, I believe these files will still be in the git history. So, once they are stored externally, how can I prevent future clones from downloading them?

Best regards.

-- Alan L.V. Guedes, TeleMídia/PUC-Rio
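
One partial answer to the question above, independent of any history rewrite: a shallow clone fetches only the objects needed for the latest commit, so blobs that exist only in older commits are not transferred. The sketch below demonstrates this on a throwaway repository; the `src`, `full` and `shallow` directories and the file contents are made up for the demonstration, and the `file://` URL is used because git ignores `--depth` for plain local paths.

```shell
# Throwaway "upstream" repository with a large file that is later removed.
src=$(mktemp -d)
git -C "$src" init -q
git -C "$src" config user.name demo
git -C "$src" config user.email demo@example.com
echo "demo" > "$src/README"
head -c 200000 /dev/urandom > "$src/bunny.ogg"
git -C "$src" add README bunny.ogg
git -C "$src" commit -q -m "add large sample"
blob=$(git -C "$src" rev-parse HEAD:bunny.ogg)
git -C "$src" rm -q bunny.ogg
git -C "$src" commit -q -m "move sample to external storage"

# A full clone still downloads the old blob; a shallow clone does not.
full=$(mktemp -d)
shallow=$(mktemp -d)
git clone -q "file://$src" "$full/repo"
git clone -q --depth 1 "file://$src" "$shallow/repo"

for clone in "$full/repo" "$shallow/repo"; do
  if git -C "$clone" cat-file -e "$blob" 2>/dev/null; then
    echo "$clone: old blob was downloaded"
  else
    echo "$clone: old blob is absent"
  fi
done
```

This only helps people who remember to pass `--depth 1`; a default clone keeps downloading the full history until the history itself is rewritten.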

JPTIZ commented 5 years ago

I just took a look at Git LFS (Git Large File Storage). It was created by GitHub, so it has native support there (with a 1GB bandwidth and storage limit for free accounts).

It has some migration features that also clean the git history for you. I think you can follow this tutorial, which has sections on migrating pre-existing repositories. Further documentation of git lfs migrate can be found in this doc from the official git-lfs repo.
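
As a rough illustration of that migration, the sketch below runs `git lfs migrate` on a throwaway repository. It is not the exact procedure from the tutorial: the scratch repository, the `demo` variable and the `*.ogg` pattern are assumptions chosen for the example, and the LFS steps are skipped when git-lfs is not installed.

```shell
# Throwaway repository standing in for a real clone.
demo=$(mktemp -d)
git -C "$demo" init -q
git -C "$demo" config user.name demo
git -C "$demo" config user.email demo@example.com
mkdir -p "$demo/tests-ncl/samples"
head -c 100000 /dev/urandom > "$demo/tests-ncl/samples/bunny.ogg"
git -C "$demo" add .
git -C "$demo" commit -q -m "add sample"

if git lfs version >/dev/null 2>&1; then
  # Enable the LFS filters for this repository only.
  git -C "$demo" lfs install --local
  # Report which file types take the most space across all refs.
  git -C "$demo" lfs migrate info --everything
  # Rewrite history so that matching files become LFS pointers.
  git -C "$demo" lfs migrate import --everything --include="*.ogg"
else
  echo "git-lfs is not installed; skipping the migration steps"
fi
```

Because `migrate import` rewrites commits, on a shared repository the rewritten branches would have to be force-pushed and existing clones replaced.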

(Also, I just realized that these aren't all video files, but actually one 88MB bunny.ogg plus a couple of other ~4MB files - one being a video:)

ginga/tests-ncl/samples $ durt -s *
        5 B  msg-stop.txt
        6 B  msg-abort.txt
        6 B  msg-pause.txt
        6 B  msg-start.txt
        7 B  msg-resume.txt
      108 B  page.html
      446 B  text.txt
      552 B  area.lua
      819 B  randomprop.lua
    1.18 kB  fps.lua
    3.20 kB  vector.svg
   31.87 kB  felis.jpg
   64.00 kB  gnu.png
  240.73 kB  small.mp4
  511.64 kB  clock.ogv
  633.14 kB  night.avi
  947.52 kB  bunny.mp3
    4.07 MB  animGar.mp4
    4.39 MB  arcade.mp3
   89.79 MB  bunny.ogg
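
The listing above covers only the files currently in the working tree. Since clone size depends on everything reachable in history, it can also help to list the largest blobs across all commits. A minimal sketch, run here against a throwaway repository (the `demo` directory and its files are made up for the example; in practice the pipeline would be run inside an existing clone):

```shell
# Throwaway repository: one large file that is later deleted, one small file.
demo=$(mktemp -d)
git -C "$demo" init -q
git -C "$demo" config user.name demo
git -C "$demo" config user.email demo@example.com
head -c 200000 /dev/urandom > "$demo/big.bin"
echo "small" > "$demo/small.txt"
git -C "$demo" add .
git -C "$demo" commit -q -m "add files"
git -C "$demo" rm -q big.bin
git -C "$demo" commit -q -m "remove big file from the working tree"

# Every blob reachable from any ref, as "<size in bytes> <path>", largest first.
# Paths containing spaces are truncated by awk in this simple version.
largest=$(git -C "$demo" rev-list --objects --all |
  git -C "$demo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk '$1 == "blob" { print $2, $3 }' |
  sort -rn | head -n 5)
echo "$largest"
```

Here `big.bin` still tops the list even though it was deleted in the second commit, which is why removing files from the working tree alone does not shrink a clone.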

Another approach to cleaning the commit history (a less interesting one, in my opinion) is git's filter-branch command, but you won't need it if you use the git lfs migration tool.
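
For reference, the filter-branch approach could look roughly like the sketch below, shown on a throwaway repository. The `demo` directory is hypothetical; the path `tests-ncl/samples/bunny.ogg` is taken from the listing above. Recent Git versions print a warning recommending other tools when filter-branch is run, which the environment variable silences.

```shell
# Throwaway repository with a large sample committed on top of a README.
demo=$(mktemp -d)
git -C "$demo" init -q
git -C "$demo" config user.name demo
git -C "$demo" config user.email demo@example.com
echo "demo" > "$demo/README"
git -C "$demo" add README
git -C "$demo" commit -q -m "base"
mkdir -p "$demo/tests-ncl/samples"
head -c 100000 /dev/urandom > "$demo/tests-ncl/samples/bunny.ogg"
git -C "$demo" add tests-ncl
git -C "$demo" commit -q -m "add large sample"

# Drop the file from every commit on every ref; commits left empty are pruned.
FILTER_BRANCH_SQUELCH_WARNING=1 git -C "$demo" filter-branch --prune-empty \
  --index-filter 'git rm -q --cached --ignore-unmatch tests-ncl/samples/bunny.ogg' \
  -- --all

# filter-branch keeps backups under refs/original; delete them so the old
# commits become unreachable.
git -C "$demo" for-each-ref --format='%(refname)' refs/original |
  while read -r ref; do git -C "$demo" update-ref -d "$ref"; done

git -C "$demo" log --all --oneline
```

The old objects still occupy disk space until the reflog is expired and the repository is garbage-collected.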

Ronkiro commented 2 years ago

Hello, there's actually a tool to solve this problem

https://rtyley.github.io/bfg-repo-cleaner/

alanlivio commented 2 years ago

Hi @Ronkiro. I really appreciate your suggestion, because the bfg tool is very fast. I ran the following commands with the bfg tool and reduced the .git folder from 277M to 138M. The deleted files are listed in the attached logs for the record.

java -jar ~/Downloads/bfg-1.14.0.jar -D "*.{dll,pdb,*.lib,*.dylib,*.so,*.o,*.obj,*.gch,*.pch,*.la,*.a,*.lai,*.exe,*.out,*.class,*.log, *.tlog,*.obj, *.exp}" -fi "{*.mp4, *.mp3, *.png, *.cpp, *.h, *.ico, *.gif, *.css}" --strip-blobs-bigger-than 1M ginga
cd ginga
git reflog expire --expire=now --all && git gc --prune=now --aggressive
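
The last command matters because a history rewrite only makes the old objects unreachable; they stay on disk while the reflog still refers to them. The sketch below shows the effect on a throwaway repository (the `demo` directory, the `tmp` branch and the file names are made up for the example; it does not run bfg itself).

```shell
# Throwaway repository: a large blob committed on a branch that is then deleted.
demo=$(mktemp -d)
git -C "$demo" init -q
git -C "$demo" config user.name demo
git -C "$demo" config user.email demo@example.com
echo "keep" > "$demo/keep.txt"
git -C "$demo" add keep.txt
git -C "$demo" commit -q -m "base"
git -C "$demo" checkout -q -b tmp
head -c 200000 /dev/urandom > "$demo/big.bin"
git -C "$demo" add big.bin
git -C "$demo" commit -q -m "add big file"
blob=$(git -C "$demo" rev-parse HEAD:big.bin)
git -C "$demo" checkout -q -
git -C "$demo" branch -q -D tmp

# No branch contains the blob any more, but the reflog keeps it alive.
if git -C "$demo" cat-file -e "$blob" 2>/dev/null; then before=present; else before=gone; fi

# Same cleanup as above: drop reflog entries, then prune unreachable objects.
git -C "$demo" reflog expire --expire=now --all
git -C "$demo" gc -q --prune=now --aggressive

if git -C "$demo" cat-file -e "$blob" 2>/dev/null; then after=present; else after=gone; fi
echo "before: $before, after: $after"
git -C "$demo" count-objects -vH
```

Running `git count-objects -vH` before and after the cleanup is one way to check a reduction such as the 277M to 138M reported above.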

changed-files.txt deleted-files.txt

Ronkiro commented 2 years ago

Nice!! Glad to see I could help!