armbian / build

Armbian Linux build framework generates custom Debian or Ubuntu image for x86, aarch64, riscv64 & armhf
https://www.armbian.com
GNU General Public License v2.0

Armbian repo structure analysis #5180

Open rpardini opened 1 year ago

rpardini commented 1 year ago

What happened?

Analysis of the Armbian repos, with respect to which packages/versions are published into which distro and which components.

How to reproduce?

Data to follow.

Branch

main (main development branch)

On which host OS are you observing this problem?

Jammy

Relevant log URL

No response


github-actions[bot] commented 1 year ago

Jira ticket: AR-1701

rpardini commented 1 year ago

I wrote a script to grab, parse, and tabulate each package in each distro's components. It's really bad Python, but it does the job. See the source below. I then used it to analyze both jammy (an old repo) and bookworm (a supposedly brand-new repo). Each result is in its own CSV file, attached.

Jammy

repo_pkgs_jammy_arm64.csv

Observations, at random:

Bookworm

repo_pkgs_bookworm_arm64.csv

Python script source (stinks)

import urllib.request

base_url = 'https://imola.armbian.com/apt/dists'

dist = "jammy"  # bookworm/jammy etc
wanted_arch = "arm64"  # or amd64, armhf etc

components = ["main", f"{dist}-utils", f"{dist}-desktop"]
arches = [wanted_arch, "all"]

all_repo_pkgs: dict[str, list[dict]] = {}

for component in components:
    for arch in arches:
        all_pkgs_versions: dict[str, list[str]] = {}
        pkg_file_url = f"{base_url}/{dist}/{component}/binary-{arch}/Packages"
        print(f"Downloading {pkg_file_url}...")
        pkg_file_blocks = urllib.request.urlopen(pkg_file_url).read().decode('utf-8').split("\n\n")

        # for each block..
        for pkg_file_block in pkg_file_blocks:
            pkg_name = ""
            pkg_version = ""
            for pkg_file_block_line in pkg_file_block.split("\n"):
                # strip only the leading field name; replace() would also alter matches inside the value
                if pkg_file_block_line.startswith("Package: "):
                    pkg_name = pkg_file_block_line.removeprefix("Package: ")
                elif pkg_file_block_line.startswith("Version: "):
                    pkg_version = pkg_file_block_line.removeprefix("Version: ")

            # if we have a name and version, add to all_pkgs
            if pkg_name and pkg_version:
                if pkg_name in all_pkgs_versions:
                    all_pkgs_versions[pkg_name].append(pkg_version)
                else:
                    all_pkgs_versions[pkg_name] = [pkg_version]

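        # find the highest version from each package
        # NOTE: max() compares strings lexicographically, not by Debian version rules
        # (e.g. "1.9" beats "1.10"), so the result can be wrong when versions differ in digit count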
        all_pkgs_highest_versions: dict[str, str] = {}
        for pkg_name, pkg_versions in all_pkgs_versions.items():
            all_pkgs_highest_versions[pkg_name] = max(pkg_versions)

        # add from all_pkgs_highest_versions to all_repo_pkgs
        for pkg_name, pkg_version in all_pkgs_highest_versions.items():
            if pkg_name not in all_repo_pkgs:
                all_repo_pkgs[pkg_name] = []
            all_repo_pkgs[pkg_name].append({
                "version": pkg_version,
                "compo-arch": f"{component}-{arch}",
                "component": component,
                "arch": arch
            })

# sort all_repo_pkgs by key
all_repo_pkgs = dict(sorted(all_repo_pkgs.items()))

# all done, let's make a table: one row per pkg_name, one column per compo-arch, with the versions as values
table: dict[str, dict[str, str]] = {}
for pkg_name, pkg_versions in all_repo_pkgs.items():
    table[pkg_name] = {}
    for pkg_version in pkg_versions:
        table[pkg_name][pkg_version["compo-arch"]] = pkg_version["version"]

# build the table in CSV format, with headers.
columns = [f"{component}-{arch}" for component in components for arch in arches]
csv_out = ["Package," + ",".join(columns)]
# one row per package (the previous inner loop emitted a duplicate row for every compo-arch);
# an empty cell means the package is absent from that component/arch
for pkg_name, pkg_versions in table.items():
    csv_out.append(f"{pkg_name}," + ",".join(pkg_versions.get(column, "") for column in columns))

# write the CSV file
with open(f"repo_pkgs_{dist}_{wanted_arch}.csv", "w") as f:
    f.write("\n".join(csv_out))
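One caveat with the script above: picking the highest version with `max()` on plain strings is lexicographic, so it can choose the wrong entry when a package appears more than once in a `Packages` file. Below is a minimal pure-Python sketch of Debian-style version ordering (epoch, upstream version, revision, with `~` sorting before everything). The names `compare_debian_versions` and `highest_version` are hypothetical helpers, not part of the original script, and the sketch assumes well-formed ASCII version strings. Where python-apt is installed, `apt_pkg.version_compare` is probably the safer choice.

```python
from functools import cmp_to_key

DIGITS = "0123456789"


def _order(ch: str) -> int:
    """Sort weight of one non-digit character; '' stands for end of string."""
    if ch == "~":
        return -1  # tilde sorts before everything, even end of string
    if ch == "" or ch in DIGITS:
        return 0
    if ch.isalpha():
        return ord(ch)  # letters sort before other punctuation
    return ord(ch) + 256


def _compare_fragment(a: str, b: str) -> int:
    """Compare two upstream-version or revision strings; returns -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # compare the non-digit runs character by character
        while (i < len(a) and a[i] not in DIGITS) or (j < len(b) and b[j] not in DIGITS):
            oa = _order(a[i] if i < len(a) else "")
            ob = _order(b[j] if j < len(b) else "")
            if oa != ob:
                return -1 if oa < ob else 1
            i += 1
            j += 1
        # compare the following digit runs numerically (an empty run counts as 0)
        si, sj = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        na, nb = int(a[si:i] or 0), int(b[sj:j] or 0)
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version: str) -> tuple[int, str, str]:
    """Split '[epoch:]upstream[-revision]' into its three parts."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_debian_versions(a: str, b: str) -> int:
    """Hypothetical helper: negative if a < b, zero if equal, positive if a > b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_fragment(ua, ub) or _compare_fragment(ra, rb)


def highest_version(versions: list[str]) -> str:
    """Hypothetical drop-in for max(pkg_versions) in the script above."""
    return max(versions, key=cmp_to_key(compare_debian_versions))


print(max(["1.9", "1.10"]))              # plain string comparison picks 1.9
print(highest_version(["1.9", "1.10"]))  # Debian-style ordering picks 1.10
```

In the script, this would replace `max(pkg_versions)` with `highest_version(pkg_versions)`; the rest of the table-building logic stays the same.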