Open gjost opened 2 years ago
We Put Half a Million files in One git Repository, Here’s What We Learned
https://canvatechblog.com/we-put-half-a-million-files-in-one-git-repository-heres-what-we-learned-ec734a764181
To reduce the amount of work git needs to do to find changes, we used the fsmonitor hook with Watchman so we capture changes as they happen instead of having to scan all files in the repository every time a command is run. We also enabled feature.manyFiles, which under the hood enables the untracked cache to skip directories and files that haven’t been modified.
Git also has a built-in command (maintenance) to optimize a repository’s data, speeding up commands and reducing disk space. This isn’t enabled by default, so we register it with a schedule for daily and hourly routines.
Sparse checkout
If an engineer can tell us what they usually work on, we can craft a checkout pattern that includes all the required dependencies to run and test their code locally while keeping the checkout as small as possible.
Sparse checkout drawbacks:
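A minimal sketch of what applying those settings to one local clone might look like, assuming Git 2.24 or newer. The helper name tune_large_repo is made up for illustration and is not from the article. The fsmonitor and maintenance steps are left as comments because they reach outside the repository (a background daemon and scheduled jobs) and their behaviour depends on platform and Git version.

```shell
# Hypothetical helper: apply large-repo settings to a single clone.
tune_large_repo() {
    repo="${1:-.}"

    # feature.manyFiles opts into defaults aimed at large working trees,
    # including the untracked cache.
    git -C "$repo" config feature.manyFiles true

    # Set the untracked cache explicitly too, so the intent is visible
    # in .git/config.
    git -C "$repo" config core.untrackedCache true

    # Not run here (assumptions, check against your Git version first):
    #   git -C "$repo" config core.fsmonitor true   # built-in monitor, Git 2.37+, not on every platform
    #   git -C "$repo" maintenance start            # registers scheduled background maintenance
}
```

Usage would be something like `tune_large_repo /path/to/ddr-densho-1000`, followed by running git status twice, since the untracked cache only helps once it has been populated.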
https://news.ycombinator.com/item?id=31762245 Interesting
The Case Against Monorepos (Infoworld)
Trunk-Based Development: Monorepos (https://trunkbaseddevelopment.com/monorepos)
monorepo.tools - Everything you need to know about monorepos, and the tools to build them (https://monorepo.tools)
Spend no more than 2 days on this.
The ddr-densho-1000 repo is really huge and this causes usability problems even when the repo is checked out locally. In particular, git status takes forever to run. The repo has tons of files and also a long history (~4000 commits).
IDEA cp ddr-densho-1000 ddr-densho-1000new, remove .git/, git init
Where does the slowness come from?
TODO research git performance (num objects, size, repo age)
TODO can we set git caching interval?
TODO profile git operations
Slowness does not seem to correlate with the number of objects or the physical size of the repo; it seems to correlate with the length of the commit history.
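For the research and profiling TODOs, a rough sketch of how the numbers could be collected. The function names repo_stats and profile_status are made up for illustration. The git subcommands and the GIT_TRACE_PERFORMANCE variable are standard, but which figures actually explain the slowness is still an open question here.

```shell
# Hypothetical helper: print commit count, tracked file count, and
# object-store statistics for a repository.
repo_stats() {
    repo="${1:-.}"
    echo "commits: $(git -C "$repo" rev-list --count HEAD)"
    echo "tracked files: $(git -C "$repo" ls-files | wc -l | tr -d ' ')"
    # Loose/packed object counts and sizes, human-readable.
    git -C "$repo" count-objects -vH
}

# Hypothetical helper: run git status with Git's built-in performance
# trace, which writes per-step timings to stderr.
profile_status() {
    repo="${1:-.}"
    GIT_TRACE_PERFORMANCE=1 git -C "$repo" status >/dev/null
}
```

Running both against ddr-densho-1000 and against the history-free copy from the IDEA above would show whether the timings follow the commit count or the file count.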
Ways to improve git status performance (2012)
https://stackoverflow.com/questions/4994772/ways-to-improve-git-status-performance
10 GB repo on NFS on Linux. First time git status ~36min, subsequent 8min
Slow Git Performance (2021) https://support.purestorage.com/Knowledge_Base/FlashBlade_KB/Slow_Git_Performance
OPTIONS
Shallow clone
git clone --depth=50 --no-single-branch COLLECTION
Sparse checkout
https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/
git clone COLLECTION
git sparse-checkout init --cone
git sparse-checkout set ...
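A possible variation on the sparse checkout steps, sketched as a function. It assumes Git 2.25 or newer for `git clone --sparse`, which avoids writing the full working tree before narrowing it. The name sparse_clone and the directory arguments are placeholders, not real paths in the collection repo.

```shell
# Hypothetical helper: clone a repo and check out only the given
# top-level directories (plus files in the repo root) in cone mode.
# Usage: sparse_clone URL DEST DIR [DIR...]
sparse_clone() {
    url="$1"
    dest="$2"
    shift 2

    # --sparse starts with only the files in the repository root.
    git clone --sparse "$url" "$dest"

    # Cone mode matches whole directories, which is faster than
    # arbitrary patterns.
    git -C "$dest" sparse-checkout init --cone
    git -C "$dest" sparse-checkout set "$@"
}
```

Note that this only shrinks the working tree; the full history is still downloaded unless it is combined with a shallow or partial clone.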
Partial clones
https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/
Blobless clones: git clone --filter=blob:none
Treeless clones: git clone --filter=tree:0
TODO Test shallow, sparse clones
TODO test on Dana's machine
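One way the clone tests could be scripted, as a rough sketch. The name compare_clones is made up, and the timing uses whole seconds from `date +%s`, which is coarse but should be enough if git status really takes minutes. Two caveats: `--depth` is ignored for plain local paths, so a local source needs a file:// URL, and `--filter` only takes effect if the server side supports it (otherwise Git warns and falls back to a full clone).

```shell
# Hypothetical helper: clone URL three ways under WORK, then report
# the .git size and git status wall time for each clone.
# Usage: compare_clones URL WORK
compare_clones() {
    url="$1"
    work="$2"
    mkdir -p "$work"

    git clone -q --depth=50 --no-single-branch "$url" "$work/shallow"
    git clone -q --filter=blob:none "$url" "$work/blobless"
    git clone -q --filter=tree:0 "$url" "$work/treeless"

    for name in shallow blobless treeless; do
        size=$(du -sh "$work/$name/.git" | cut -f1)
        start=$(date +%s)
        git -C "$work/$name" status >/dev/null
        end=$(date +%s)
        echo "$name: .git=$size status=$((end - start))s"
    done
}
```

Running it once on the server and once on Dana's machine would give comparable numbers for both TODOs.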