Azure / ipam

IP Address Management on Azure
https://azure.github.io/ipam
MIT License
291 stars 96 forks

Git folder is large #295

Open picccard opened 4 months ago

picccard commented 4 months ago

Describe the bug

The repo is growing rapidly(!) because every version of the roughly 50 MB asset zip file is being kept in the git history.

Could this zip file instead be generated on-demand from inside the deploy.ps1 script? Should the repo be cleaned up with git filter-branch or something similar to remove the large .pack file and lower the repo size?

A git clone fetches over 500 MB

PS C:\git> git clone https://github.com/Azure/ipam
Cloning into 'ipam'...
remote: Enumerating objects: 5577, done.
remote: Counting objects: 100% (2470/2470), done.
remote: Compressing objects: 100% (788/788), done.
remote: Total 5577 (delta 1688), reused 2318 (delta 1577), pack-reused 3107
Receiving objects: 100% (5577/5577), 503.54 MiB | 10.88 MiB/s, done.
Resolving deltas: 100% (3443/3443), done.
PS C:\git>

Largest 5 files in the repo

PS C:\git\ipam> Get-ChildItem -File -Recurse -Force | Sort-Object Length -Descending | Select-Object -First 5 -Property Directory, Name, @{l="Size in MB" ; e={"{0:N2}" -f ($_.Length / 1mb)}}

Directory                     Name                                               Size in MB
---------                     ----                                               ----------
C:\git\ipam\.git\objects\pack pack-f54811749458d296170445b28f83394b70b19e89.pack 503.54
C:\git\ipam\assets            ipam.zip                                           52.83
C:\git\ipam\docs\images       ipam-logo.png                                      0.49
C:\git\ipam\ui                package-lock.json                                  0.43
C:\git\ipam\docs\api\images   postman_response.png                               0.25
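
The listing above only covers files in the working tree, so the pack file shows up as a single opaque 503 MB entry. To see which historical blobs make up that size, one option is to pipe `git rev-list --objects --all` into `git cat-file --batch-check` and sort by size. The following is a minimal sketch, assuming `git` is on the PATH; the function names `parse_batch_check` and `largest_blobs` are hypothetical helpers, not part of the ipam repo.

```python
import subprocess

# Format string for `git cat-file --batch-check`; %(rest) echoes the path
# that `git rev-list --objects` printed after each object ID.
BATCH_FORMAT = "%(objecttype) %(objectname) %(objectsize) %(rest)"


def parse_batch_check(text):
    """Return (size_in_bytes, path, object_id) tuples for blobs, largest first."""
    blobs = []
    for line in text.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees, and malformed lines
        path = parts[3] if len(parts) == 4 else ""
        blobs.append((int(parts[2]), path, parts[1]))
    return sorted(blobs, reverse=True)


def largest_blobs(repo=".", top=5):
    """List the largest blobs reachable from any ref in the given repository."""
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        capture_output=True, text=True, check=True,
    ).stdout
    sizes = subprocess.run(
        ["git", "-C", repo, "cat-file", f"--batch-check={BATCH_FORMAT}"],
        input=objects, capture_output=True, text=True, check=True,
    ).stdout
    return parse_batch_check(sizes)[:top]


if __name__ == "__main__":
    for size, path, object_id in largest_blobs():
        print(f"{size / 1024 / 1024:10.2f} MiB  {object_id[:10]}  {path}")
```

The sizes reported are uncompressed object sizes, not on-disk pack sizes. For a zip file the two should be close, since already-compressed data shrinks very little inside a pack.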

DCMattyG commented 4 months ago

Thank you for pointing this out @picccard. At the current point in time, the ipam.zip file is there by design for customers who are operating in disconnected cloud environments (no internet access), so they can have a copy of the data that is all-inclusive of the dependencies. That said, this is clearly an issue and I'll need to think of a better way to handle this moving forward.

My biggest concern with using something like git filter-branch is that it will rewrite the entire history of the repo, invalidating all existing clones that are out there at this point in time. Perhaps that's not as big of an issue as I'm thinking it will be, but it's top of mind as we decide next steps to remediate this issue.
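
To illustrate why removing a file from an early commit invalidates every later commit, here is a small sketch of how content hashing cascades. The blob ID calculation matches what git does (SHA-1 over a `blob <size>\0` header plus the content). The commit part is a deliberately simplified toy, an assumption for illustration only: real commits also hash a full tree, author, committer, and timestamps. The names `toy_commit_id` and `chain_ids` are hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def toy_commit_id(content_id: str, parent_id: str, message: str) -> str:
    """Toy stand-in for a commit hash: depends on content and on the parent ID."""
    body = f"tree {content_id}\nparent {parent_id}\n\n{message}\n".encode()
    return hashlib.sha1(b"commit %d\0" % len(body) + body).hexdigest()


def chain_ids(snapshots):
    """Hash a linear history where each commit points at the previous one."""
    ids, parent = [], ""
    for index, content in enumerate(snapshots):
        parent = toy_commit_id(git_blob_id(content), parent, f"commit {index}")
        ids.append(parent)
    return ids


if __name__ == "__main__":
    # Same later snapshots, but the first commit no longer carries the zip.
    original = chain_ids([b"big zip v1", b"source change", b"docs change"])
    rewritten = chain_ids([b"", b"source change", b"docs change"])
    for before, after in zip(original, rewritten):
        print(before[:10], "->", after[:10])
```

Every ID in the rewritten chain differs from the original, even for commits whose own content did not change, because each commit hashes its parent's ID. That is why existing clones and forks no longer share history with a rewritten repo.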

picccard commented 4 months ago

Is it not sufficient to include the ipam.zip as an asset to the release?

Customers could download the zip for the specific release they want or get the latest at https://github.com/Azure/ipam/releases/latest
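
As a rough sketch of how a script could point at a release asset instead of a file tracked in the repo: GitHub serves release assets at `releases/latest/download/<asset>` and `releases/download/<tag>/<asset>`. The example below only builds those URLs. It assumes the zip would be uploaded under the asset name `ipam.zip`, which is not confirmed in this thread, and `release_asset_url` is a hypothetical helper.

```python
from urllib.parse import quote


def release_asset_url(owner, repo, asset, tag=None):
    """Build the download URL for a release asset (latest release if tag is None)."""
    base = f"https://github.com/{owner}/{repo}/releases"
    if tag is None:
        return f"{base}/latest/download/{quote(asset)}"
    return f"{base}/download/{quote(tag, safe='')}/{quote(asset)}"


if __name__ == "__main__":
    # "ipam.zip" as the asset name is an assumption for illustration.
    print(release_asset_url("Azure", "ipam", "ipam.zip"))
```

Customers in disconnected environments could fetch the file once from a connected machine and carry it across, which keeps the all-inclusive package available without storing each version in git history.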

DCMattyG commented 4 months ago

I don't disagree with this, and yes, that is definitely a better way to handle this moving forward.

I just need to think more about the impact of rewriting the git history, which is what git filter-branch does.