biglocalnews / warn-github-flow

GitHub Action workflow for automating a WARN Act notice ETL pipeline
https://biglocalnews.org/content/tools/layoff-watch.html
Apache License 2.0

Repo size work #7

Open stucka opened 1 year ago

stucka commented 1 year ago

Need to document safer way to shrink the repo size

Need to remove the Notebook that does it in a breaking way

Maybe need to shrink some of the other branches

Close out https://github.com/biglocalnews/warn-github-flow/issues/5

Verify this is fixing https://github.com/biglocalnews/warn-github-flow/issues/4

stucka commented 1 year ago

Documentation start -- DO NOT USE THIS ---

Find initial commit: https://stackoverflow.com/questions/18407526/git-how-to-find-first-commit-of-specific-branch

    git cherry master -v | head -n 1

Reset https://whitep4nth3r.com/blog/rewrite-git-history/

    git checkout il
    git cherry main -v | head -n 1
    # 21c0a1d18af1396158c7a24efa4b437bac2700ff
    git reset --soft 21c0a1d18af1396158c7a24efa4b437bac2700ff
    git add -A data/
    git commit -m "Reset IL"
    git push -u -f origin il

    git checkout wa
    git cherry main -v | head -n 1
    # a89842e13842d32605ff565d0194f5cdc8cbd3d3
    git reset --soft a89842e13842d32605ff565d0194f5cdc8cbd3d3
    git add -A data/
    git commit -m "Reset WA"
    git push -u -f origin wa

    git checkout transformer
    git cherry main -v | head -n 1
    # 5511a48db894288cd6f22b8b89ebeb0dd8db0a62
    git reset --soft 5511a48db894288cd6f22b8b89ebeb0dd8db0a62
    git add -A data/
    git commit -m "Reset transformer"
    git push -u -f origin transformer
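The hash lookup in the commands above could be scripted instead of copied by hand. A minimal sketch, assuming `git cherry -v` prints one `+ <hash> <subject>` line per commit with the oldest first, which is what the thread relies on. Both function names are made up for illustration and are not part of the repo.

```shell
#!/bin/sh
# Hypothetical helpers; the names are not from the repo.

# Read `git cherry -v` output on stdin and print the hash from the first line.
first_cherry_hash() {
    awk 'NR == 1 { print $2 }'
}

# Print the first commit on branch $2 that is not on $1.
# Passing both refs means $2 does not have to be checked out.
first_branch_commit() {
    git cherry -v "$1" "$2" | first_cherry_hash
}
```

Usage would look like `first_branch_commit main il`, with the result passed to `git reset --soft`.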

stucka commented 1 year ago

Also maybe document how to find size of branches, e.g.:

    git rev-list --disk-usage --objects HEAD..il

Branches must be checked out for that to work. Maybe that's safely scriptable, but the Schrödinger's Notebook thing is in play.

stucka commented 2 months ago

Show branch sizes, in bash:

    git for-each-ref --format='%(refname)' |
    while read branch
    do
        size=$(git rev-list --disk-usage --objects HEAD..$branch)
        echo "$size $branch"
    done | sort -n

Results now ... why is there an origin/origin/wa? ...

    0 refs/heads/main
    0 refs/remotes/origin/HEAD
    0 refs/remotes/origin/main
    6963 refs/remotes/origin/ak
    29528 refs/remotes/origin/ut
    53318 refs/remotes/origin/sd
    53580 refs/remotes/origin/hi
    66334 refs/remotes/origin/al
    134689 refs/remotes/origin/ct
    291668 refs/remotes/origin/wi
    348199 refs/remotes/origin/origin/wa
    414156 refs/remotes/origin/mt
    802293 refs/remotes/origin/id
    1069308 refs/remotes/origin/ne
    1523234 refs/remotes/origin/md
    1540321 refs/remotes/origin/in
    1811925 refs/remotes/origin/mi
    2359496 refs/remotes/origin/tn
    2804939 refs/remotes/origin/fl
    3229237 refs/remotes/origin/ga
    3673777 refs/remotes/origin/mo
    4216396 refs/remotes/origin/tx
    4833845 refs/remotes/origin/nj
    5333310 refs/remotes/origin/or
    6437243 refs/remotes/origin/la
    6469633 refs/remotes/origin/sc
    6500019 refs/remotes/origin/va
    6663005 refs/remotes/origin/ny
    7415177 refs/remotes/origin/ri
    8531617 refs/remotes/origin/ia
    8673717 refs/remotes/origin/de
    17197681 refs/remotes/origin/oh
    18038832 refs/remotes/origin/dc
    18203431 refs/remotes/origin/vt
    27108603 refs/remotes/origin/co
    31563529 refs/remotes/origin/ca
    33104387 refs/remotes/origin/ok
    43845782 refs/remotes/origin/nm
    51030901 refs/remotes/origin/ks
    65325721 refs/remotes/origin/ky
    70276832 refs/remotes/origin/az
    96548777 refs/remotes/origin/me
    333931979 refs/remotes/origin/wa
    435718337 refs/remotes/origin/il
    582069999 refs/remotes/origin/transformer
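The raw byte counts are hard to scan. A small sketch that converts `size ref` lines like the ones above to MiB and appends a total. `summarize_sizes` is a hypothetical name, and the sample input reuses the three largest branches from the results.

```shell
#!/bin/sh
# Hypothetical helper: read "bytes refname" lines on stdin,
# print the size of each ref in MiB, then a grand total.
summarize_sizes() {
    awk '{
        total += $1
        printf "%8.1f MiB  %s\n", $1 / 1048576, $2
    }
    END { printf "%8.1f MiB  total\n", total / 1048576 }'
}

# Sample input: the three largest entries from the results above.
summarize_sizes <<'EOF'
333931979 refs/remotes/origin/wa
435718337 refs/remotes/origin/il
582069999 refs/remotes/origin/transformer
EOF
```

To use it with the branch-size loop, append `| summarize_sizes` after `sort -n`.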

stucka commented 2 months ago

Tentative instructions:

Basic approach:

Make a new directory, call it, whatever, git-rebuild. Change into it.
Download the entire full repo.
Make a backup copy of the entire full repo. Make another backup copy.
Identify what state branches (plus "transformer") you want to try to shrink.
Copy your "master" Pipfile and Pipfile.lock into the root of your new master directory, the git-rebuild.

Verify this thing isn't going to run while you're moving so much stuff around.

For each statename:

Make a complete copy of the entire full repo. Rename it to your statename.

    cd statename
    git checkout statename
    git cherry main -v | head -n 1
    # 21c0a1dsomethingsomething
    git reset --soft 21c0a1dsomethingsomething

Windows: copy ..\Pip* .
Unix: cp ../Pip* .

    git add -A data/
    git commit -m "Reset statename"
    git push -u -f origin statename
    cd ..
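Because each pass ends in a force push, it may help to print the commands for review before running anything. A sketch under that assumption: `print_reset_plan` is a hypothetical helper, it spells out the two Pipfile names for the Unix copy step, and the first-commit hash is passed in rather than looked up.

```shell
#!/bin/sh
# Hypothetical helper: print, without executing, the per-branch steps
# from the instructions above.
# $1 = branch name (e.g. a state code), $2 = first commit hash on that branch.
print_reset_plan() {
    branch=$1
    first=$2
    cat <<EOF
cd $branch
git checkout $branch
git reset --soft $first
cp ../Pipfile ../Pipfile.lock .
git add -A data/
git commit -m "Reset $branch"
git push -u -f origin $branch
cd ..
EOF
}

# Placeholder hash, as in the instructions; substitute the real one.
print_reset_plan ky 21c0a1dsomethingsomething
```

Nothing is run here; the output is only text to read over, one block per branch.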

stucka commented 2 months ago

Adding Pipfiles ... causes more problems.

stucka commented 2 months ago

Possible workaround for states that had Pipfiles at that first commit ... trying with Kentucky. Tries to better isolate the commit to only the data directory, with a temporary .gitignore that should allow the Pipfile changes to come through. Maybe?

Tentative instructions:

Basic approach:

Make a new directory, call it, whatever, git-rebuild. Change into it.
Download the entire full repo.
Make a backup copy of the entire full repo. Make another backup copy.
Identify what state branches (plus "transformer") you want to try to shrink.
Copy your "master" Pipfile and Pipfile.lock into the root of your new master directory, the git-rebuild.

Verify this thing isn't going to run while you're moving so much stuff around.

For each statename:

Make a complete copy of the entire full repo. Rename it to your statename.

    cd statename
    git checkout statename
    git cherry main -v | head -n 1
    # 21c0a1dsomethingsomething
    git reset --soft 21c0a1dsomethingsomething
    echo "Pipfile" >>.gitignore
    echo "Pipfile.lock" >>.gitignore
    git add data
    git commit -m "Reset statename"
    git push -u -f origin statename
    cd ..
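One thing worth checking with this approach: `git reset --soft` leaves the index as it was at the branch tip, so every change made since the first commit is already staged and gets committed regardless of what `git add data` or `.gitignore` names. If the goal is a commit that touches only `data/`, a default (mixed) reset followed by `git add -A data/` is one option. This is a sketch of that alternative, not what the thread ran. `squash_data_only` is a hypothetical name, and nothing is pushed.

```shell
#!/bin/sh
# Hypothetical helper: squash everything after commit $2 on branch $1 into
# one commit that touches only data/. Local only; nothing is pushed.
squash_data_only() {
    branch=$1
    first=$2
    git checkout -q "$branch" || return 1
    # Mixed reset (the default): the index is rewound to $first,
    # while files in the working tree keep their branch-tip contents.
    git reset -q "$first" || return 1
    # Stage additions, changes, and deletions under data/ only.
    git add -A data/ || return 1
    git commit -q -m "Reset $branch"
}
```

Files outside `data/` that changed after the first commit are left as uncommitted changes in the working tree, so they would need to be handled separately before switching branches.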