luxonis / depthai

DepthAI Python API utilities, examples, and tutorials.
https://docs.luxonis.com
MIT License

Cloned repo is large (433MB) #36

Closed. itsderek23 closed this issue 4 years ago

itsderek23 commented 4 years ago

.git/objects/pack (295 MB)

Most of the size comes from a single packfile in .git/objects/pack (295 MB):

-r--r--r--    1 dlite  staff   295M Feb  6 06:19 pack-0748331a0ae468ca5a9b5bee41418f549ba5da00.pack

The ten largest objects in that pack, sorted by uncompressed size (third column, in bytes):

$ git verify-pack -v pack-0748331a0ae468ca5a9b5bee41418f549ba5da00.pack \
    | sort -k 3 -n \
    | tail -10
e15384968d47a2e872030a79ea0cb7dced9f55e6 blob   10699686 1693489 176982870
c25229a9a5baacc4d329b19607f57f68d4f5e1e7 blob   11669167 11239986 156585534
34b3a22cea76771375c88d2759aa397b19e41d72 blob   12001915 1873891 173316462
c228dda8b728160ef64ecfc93a2917105e611734 blob   13721024 12389225 264320842
92aa84d7737c0e7f2f18948efb6366b6fa3115dc blob   14480768 10688751 80481154
a27dd3f3b30dccfa84381d2e78db96ec31854af5 blob   14485504 10693136 100915888
453306b0fd84f7100aebe6faada019cbebddb6bb blob   17373318 2765131 167825520
a0b91c133e07bce25fe5391da0d3a3d7f3a7157e blob   17416012 2760363 151965320
a379aad59c03317f33353a71a866e51d1cfaae2b blob   23778880 21935094 276710132
0f249d5573b25944b99aa9a84176928a1b35f696 blob   42642112 39501577 179054474

These appear to be the mechanical model files, which have since been removed from the working tree but are still stored in the history:

$ git rev-list --objects --all | grep e15384968d47a2e872030a79ea0cb7dced9f55e6
e15384968d47a2e872030a79ea0cb7dced9f55e6 Mechanical_Models/BW1099_R3M2E3_KTHSNF.step
$ git rev-list --objects --all | grep c25229a9a5baacc4d329b19607f57f68d4f5e1e7
c25229a9a5baacc4d329b19607f57f68d4f5e1e7 Mechanical_Models/BW1097_R2M2E2.zip
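
The two lookups above (find the big blobs, then map each hash to a path) can be combined into one pass. This is a minimal sketch, not something from the thread: the function name `largest_blobs` is illustrative, and it assumes a Git version whose `git cat-file --batch-check` accepts a custom format. It reports uncompressed blob sizes, like the third column of `git verify-pack -v`.

```shell
#!/bin/sh
# largest_blobs [N]: print the N largest blobs reachable from any ref
# as "<sha> <size-in-bytes> <path>", smallest first (default N=10).
# Assumption: run from inside the repository being inspected.
largest_blobs() {
    n="${1:-10}"
    # rev-list emits "<sha> <path>"; cat-file adds type and size and
    # passes the path through unchanged via %(rest).
    git rev-list --objects --all \
        | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
        | awk '$1 == "blob" { sub(/^blob /, ""); print }' \
        | sort -k2,2n \
        | tail -n "$n"
}
```

Calling `largest_blobs 10` from the repo root should list the same kind of entries as the manual grep above, with the path already attached.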

See this comment from another repo with a similar issue for how to clean up this disk space.

nn folder (66 MB)

du -h nn/
 52M    nn//object_detection_4shave
4.8M    nn//object_recognition_4shave/emotion_recognition
9.5M    nn//object_recognition_4shave/landmarks
 14M    nn//object_recognition_4shave
 66M    nn/

/cc @Luxonis-Brandon

Luxonis-Brandon commented 4 years ago

Thanks! So if we purge the mechanical files, will that help?

itsderek23 commented 4 years ago

Yes - that will go a long way.

Luxonis-Brandon commented 4 years ago

Perfect. I don't know how to do this, but if it's not hard for you, please do make it so. :-)

Luxonis-Brian commented 4 years ago

Bringing this up again because I'm having an almost impossible time updating a remote tester. The best speed I can get is about 20 kB/s, and I'm running into all kinds of timeouts and disconnects.

Luxonis-Brandon commented 4 years ago

Yes, agreed. When we move to develop, that is, when we make develop the main (default) branch, we will likely archive the whole repository and start from scratch to get back to standard practices. That should reduce the repo size a TON.

It was my practices at the beginning that caused the ballooning.

ColbyToland commented 4 years ago

OK, I've done this before the hard way but I was happy to find there is now an easy way.

Follow these instructions and use 1M as the file size threshold (the instructions use 10M). This gets the repo down to 167 MB from 771 MB on disk.

That said, you really need to move the model files out of the repo. The weaker option is to put them in a separate repo, optionally included as a submodule. The better solution is Git LFS, which tracks file revisions in the repo but keeps the binary data out of it.
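
With Git LFS, the tracked patterns live in `.gitattributes`. Here is a sketch of what that file might contain for this repo. The `nn/**` and `Mechanical_Models/**` patterns are assumptions based on the paths mentioned earlier in this thread; the attribute string is the one `git lfs track` writes.

```
# Hypothetical patterns; adjust to wherever the large binaries actually live.
nn/** filter=lfs diff=lfs merge=lfs -text
Mechanical_Models/** filter=lfs diff=lfs merge=lfs -text
```

Tracking only affects new commits. Blobs already committed stay in the packfile until the history is rewritten.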

If you start a repo from scratch from the develop branch, it is 163 MB on disk, basically the same as following the TL;DR instructions below. If you remove the resource/nn directory before creating the fresh repo from develop, it shrinks to less than 1 MB.
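
Starting from scratch on develop can be sketched with an orphan branch, which creates a single root commit holding the develop tree and none of its history. This is a sketch under assumptions: the function name `squash_to_root` and the branch name `fresh` are illustrative. The old objects also stay in the local clone until the old refs are deleted and garbage-collected, so the size drop shows up only in a new clone of the new branch.

```shell
#!/bin/sh
# squash_to_root SRC NEW: create branch NEW whose single root commit
# contains the tree of SRC, with no parent history.
squash_to_root() {
    src="$1"
    new="$2"
    # --orphan starts NEW with no parents; the index and working tree
    # are populated from SRC, so everything is already staged.
    git checkout -q --orphan "$new" "$src" || return 1
    git commit -q -m "Initial commit (snapshot of $src)"
}
```

For example, `squash_to_root develop fresh` followed by pushing `fresh` to a new, empty repository would give clones that carry only that one snapshot.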

TL;DR Execute these commands on Ubuntu 18 to shrink the repo to 167 MB:

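  1. Install brew (Homebrew for Linux). On Ubuntu 18 the apt package should be linuxbrew-wrapper rather than brew: sudo apt install linuxbrew-wrapper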

  2. Add brew to path by updating ~/.bash_profile to have lines:

    export PATH="/home/linuxbrew/.linuxbrew/bin:$PATH"
    export MANPATH="/home/linuxbrew/.linuxbrew/share/man:$MANPATH"
    export INFOPATH="/home/linuxbrew/.linuxbrew/share/info:$INFOPATH"

    Then source ~/.bash_profile to load the new path variables.

  3. Install BFG. brew install bfg

  4. Filter the repo with BFG, run from the repo root: bfg -b 1M (short for --strip-blobs-bigger-than 1M)

  5. Finalize the changes: git reflog expire --expire=now --all && git gc --prune=now --aggressive

  6. Convince yourself it works: du -s

    $ du -s
    167064  .

  7. Force push the diet soda that is the new repo: git push --force
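
`du -s` in step 6 measures the whole directory, working tree included. To check only the packed object store before and after the cleanup, `git count-objects -v` reports a `size-pack` value in KiB. A minimal sketch (the helper name `pack_size_kib` is illustrative):

```shell
#!/bin/sh
# pack_size_kib: print the total size of all packfiles in the
# current repository, in KiB, as reported by git count-objects.
pack_size_kib() {
    git count-objects -v | awk '$1 == "size-pack:" { print $2 }'
}
```

Running it once before step 4 and again after step 5 shows how much the rewrite and `git gc` actually reclaimed.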

Luxonis-Brandon commented 4 years ago

Thanks @ColbyToland! On the models, we are planning in develop to have these downloaded automatically from elsewhere. It sounds like Git LFS will be a great approach. CC: @themarpe.

Luxonis-Brandon commented 4 years ago

The model downloader seems to be working well in https://github.com/luxonis/depthai/pull/242. Once we get it into main, this will allow us to rewrite the Git history and get the repo size down to ~1 MB or so.

SzabolcsGergely commented 4 years ago

Fixed. The repo size was reduced by rewriting the repo history.

Luxonis-Brandon commented 4 years ago

Thanks, Szabi. And thanks @ColbyToland for the advice here. :-)