armbian / build

Armbian Linux build framework generates custom Debian or Ubuntu image for x86, aarch64, riscv64 & armhf
https://www.armbian.com
GNU General Public License v2.0

[Task / Epic]: HUGE repository cleanup ✨ #6820

Open ColorfulRhino opened 1 week ago

ColorfulRhino commented 1 week ago

Task description

The Problem

Over the years, some older code was partially removed or fell out of use, but was never fully cleaned up. Those files and leftovers in the code still live in the repository, leading to confusion (what is this? is this still used? can this be deleted?) and misleading search/grep results (e.g. for packages/extras-buildpkgs/htop or packages/extras-buildpkgs/hostapd, which included changelogs and therefore lots of unrelated text).

Besides that, many blobs were added to the build repo over time, and only some of them still remain in the current tree. But even the deleted blobs still remain in the repo, since Git keeps the full history: the history size is huge, even though the current size of the packages/blobs folder is only 55 MB. This leads to an unnecessarily bloated repository, increasing the download size for everybody. I remember one person often having to visit a local library or university to download/update their Armbian repo, since it was too large for their slow or restricted internet connection at home.

For comparison:

I don't believe that Armbian/build is a bigger project than U-Boot, but it is almost triple the size in MB. Vastly reducing the repo size (TODO: calculate actual size before/after blob purge) will make contributing more inclusive overall and save time on many occasions.
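The gap between the checked-out tree and the stored history can be measured on a local clone. The following is a minimal sketch in Python, not part of the Armbian tooling; the helper names `dir_size_bytes` and `human_mb` and the assumption that it runs from the root of a clone are illustrative only.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files below root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def human_mb(num_bytes: int) -> str:
    """Format a byte count as mebibytes with one decimal place."""
    return f"{num_bytes / (1024 * 1024):.1f} MB"


if __name__ == "__main__":
    # Assumption: the current directory is the root of a local armbian/build clone.
    for folder in ("packages/blobs", ".git"):
        if os.path.isdir(folder):
            print(folder, human_mb(dir_size_bytes(folder)))
```

Comparing the two numbers shows how much of the download is history rather than current content. `git count-objects -vH` gives a similar view of the object store from Git itself.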

The Solutions

Remove all known leftover code and move all blobs to a separate blob repository, as already done with the Rockchip blobs in the Armbian/rkbin repo. After this is done, purge all the blobs from the build repository's history (original idea by @rpardini, I believe). The goal is to have a completely blobless Armbian/build repo, with blobs only pulled from other repositories.

Task List

Leftover code:

Blobs:

This task list will be extended with new findings. PRs solving specific tasks will be linked.

This task/story is open for ideas and discussions! 😄


Some statistics for fun and to compare the impact of this cleanup:

                | Before     | After | Difference
Lines of code   | 6 458 280  | TBD   |
# of files      | 6503       | TBD   |
Repo size       | ~636 MB    | TBD   |

Commands used:
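The commands themselves are not preserved in this copy of the thread. As a hypothetical stand-in (explicitly not the commands used for the table above, so the numbers will not match exactly), file and line counts for a checkout could be approximated with a short Python sketch. The function name `count_files_and_lines` and the choice to skip `.git` are assumptions made for illustration.

```python
import os


def count_files_and_lines(root: str, skip_dirs=(".git",)) -> tuple[int, int]:
    """Return (number of files, number of newline characters) below root."""
    n_files = 0
    n_lines = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            n_files += 1
            # Read in binary mode and in chunks so large blobs are handled safely.
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    n_lines += chunk.count(b"\n")
    return n_files, n_lines


if __name__ == "__main__":
    files, lines = count_files_and_lines(".")
    print(f"# of files: {files}, lines: {lines}")
```

Running the same script before and after the cleanup gives comparable numbers, as long as the same counting method is used both times.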

github-actions[bot] commented 1 week ago

Jira ticket: AR-2391

The-going commented 1 week ago
  • Remove unused packages/extras-buildpkgs/hostapd plus its Realtek part and its traces
  • Remove unused packages/extras-buildpkgs/htop and its traces
  • Remove unused packages/extras-buildpkgs/sunxi-tools and its traces

This code is not used by the build system and can be deleted. You can see the last moments in the life of the functions that did something with: armbian/build> gitk -- lib/functions/extras

This functionality was designed to build packages in a native "chroot" environment when they required library dependencies from that environment.

I am currently still using this, and will be able to bring the code back if users want to use it.

ColorfulRhino commented 1 week ago

> This code is not used by the build system and can be deleted.

Thanks for confirming this!

> You can see the last moments in the life of the functions that did something with: armbian/build> gitk -- lib/functions/extras

My build host does not have a graphical user interface, but I get what you mean 😄

If you know of any other unused code, let me know and I'll add it to the list :)

The-going commented 1 week ago

>> You can see the last moments in the life of the functions that did something with: armbian/build> gitk -- lib/functions/extras

> My build host does not have a graphical user interface, but I get what you mean 😄

git log -p -- lib/functions/extras

> If you know of any other unused code, let me know and I'll add it to the list

Unfortunately, I'm still in the old paradigm from 1.5 years ago.

rpardini commented 1 week ago

I fully agree with the cleanup, but keep in mind git's history will be unaffected, and thus the repo size will only ever become bigger, not smaller. We'd need to rebase things out of existence (rewrite history) and force-push to actually make it smaller. See git filter-branch and git-filter-repo for possible approaches, but it would have a very big impact.
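To make the history-rewrite idea concrete, here is a minimal sketch in Python that assembles a `git filter-repo` invocation dropping given paths from all history. It assumes git-filter-repo is installed separately and that it runs on a throwaway clone; the helper names `build_purge_command` and `purge_paths` and the example path are hypothetical, not an agreed procedure for this repository.

```python
import subprocess


def build_purge_command(paths):
    """Build the argv for `git filter-repo` that drops the given paths from history."""
    if not paths:
        raise ValueError("at least one path is required")
    argv = ["git", "filter-repo"]
    for p in paths:
        argv += ["--path", p]
    # --invert-paths flips the selection: keep everything except the listed paths.
    argv.append("--invert-paths")
    return argv


def purge_paths(repo_dir, paths, dry_run=True):
    """Return the command; only execute it when dry_run is explicitly disabled."""
    argv = build_purge_command(paths)
    if not dry_run:
        # Rewrites every affected commit, so all later commit hashes change.
        subprocess.run(argv, cwd=repo_dir, check=True)
    return argv


if __name__ == "__main__":
    # Assumption: "build-test-clone" is a fresh, disposable clone of the repository.
    print(" ".join(purge_paths("build-test-clone", ["packages/blobs"])))
```

The dry-run default is deliberate: because the rewrite changes commit hashes, the result has to be force-pushed and every existing clone and open PR is affected, which is why testing in a separate repository first makes sense.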

ColorfulRhino commented 1 week ago

> We'd need to rebase things out of existence (rewrite history) and force-push to actually make it smaller. See git filter-branch and git-filter-repo for possible approaches, but it would have a very big impact.

Yes, I think this is what you meant when you were talking about this a long while ago. Please correct me if I'm wrong 😅

My plan is to test this in a completely separate repository (and, in a second stage, in the main repo but on a separate branch), to understand its impact and get opinions from multiple people. This will definitely have to be approved by more than one or two people 😄

The plan is to use one of those tools for this:

The thing is, if we want to do this, the best time is sooner rather than in 3 or 5 years.

rpardini commented 1 week ago

Awesome. You have a full understanding.