Armbian repo structure analysis #5180

Open rpardini opened 1 year ago

rpardini commented 1 year ago

What happened?

Analysis of the Armbian repos with respect to which packages and versions are published into which distro and which components.

How to reproduce?

Data to follow later.


main (main development branch)

rpardini commented 1 year ago

I wrote a script to grab, parse, and tabulate each package in each distro's components. It's really bad Python, but it does the job. See the source below. I then used it to analyze both jammy (an old repo) and bookworm (a supposedly brand-new repo). Each result is in its own CSV file, attached.



Observations, at random:



Python script source (stinks)

import urllib.request

base_url = ''

dist = "jammy"  # bookworm/jammy etc
wanted_arch = "arm64"  # or amd64, armhf etc

components = ["main", f"{dist}-utils", f"{dist}-desktop"]
arches = [wanted_arch, "all"]

all_repo_pkgs: dict[str, list[dict]] = {}

for component in components:
    for arch in arches:
        all_pkgs_versions: dict[str, list[str]] = {}
        pkg_file_url = f"{base_url}/{dist}/{component}/binary-{arch}/Packages"
        print(f"Downloading {pkg_file_url}...")
        pkg_file_blocks = urllib.request.urlopen(pkg_file_url).read().decode('utf-8').split("\n\n")

        # for each block..
        for pkg_file_block in pkg_file_blocks:
            pkg_name = ""
            pkg_version = ""
            for pkg_file_block_line in pkg_file_block.split("\n"):
                if pkg_file_block_line.startswith("Package: "):
                    pkg_name = pkg_file_block_line.replace("Package: ", "")
                if pkg_file_block_line.startswith("Version: "):
                    pkg_version = pkg_file_block_line.replace("Version: ", "")

            # if we have a name and version, record the version under the package name
            if pkg_name and pkg_version:
                if pkg_name in all_pkgs_versions:
                    all_pkgs_versions[pkg_name].append(pkg_version)
                else:
                    all_pkgs_versions[pkg_name] = [pkg_version]

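        # find the highest version from each package
        # NOTE: max() compares the version strings lexicographically, which is not
        # Debian version ordering (e.g. "9.0" sorts above "10.0"); fine for a rough overview only
        all_pkgs_highest_versions: dict[str, str] = {}
        for pkg_name, pkg_versions in all_pkgs_versions.items():
            all_pkgs_highest_versions[pkg_name] = max(pkg_versions)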

        # add from all_pkgs_highest_versions to all_repo_pkgs
        for pkg_name, pkg_version in all_pkgs_highest_versions.items():
            if pkg_name not in all_repo_pkgs:
                all_repo_pkgs[pkg_name] = []
            all_repo_pkgs[pkg_name].append({
                "version": pkg_version,
                "compo-arch": f"{component}-{arch}",
                "component": component,
                "arch": arch
            })

# sort all_repo_pkgs by key
all_repo_pkgs = dict(sorted(all_repo_pkgs.items()))

# all done, let's make a table: pkg_name in the rows, the different compo-arch's
# in the columns, and the versions as the values
table: dict[str, dict[str, str]] = {}
for pkg_name, pkg_versions in all_repo_pkgs.items():
    table[pkg_name] = {}
    for pkg_version in pkg_versions:
        table[pkg_name][pkg_version["compo-arch"]] = pkg_version["version"]

# build the table in CSV format, with headers.
columns = [f"{component}-{arch}" for component in components for arch in arches]
csv_out = []
csv_out.append("Package," + ",".join(columns))
for pkg_name, pkg_versions in table.items():
    # one row per package; empty cell where the package is absent from a compo-arch
    csv_out.append(",".join([pkg_name] + [pkg_versions.get(column, "") for column in columns]))

# write the CSV file
with open(f"repo_pkgs_{dist}_{wanted_arch}.csv", "w") as f:
    f.write("\n".join(csv_out) + "\n")
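
One caveat about the script above: it picks the "highest" version with `max()` on plain strings, which compares character by character, so "6.1.9" beats "6.1.11". Below is a minimal sketch of a comparison that follows the Debian version ordering rules (epoch, upstream version, revision, with `~` sorting before everything). It is not part of the original script; the names `compare_versions` and `highest_version` are hypothetical, and it assumes well-formed ASCII version strings. On a Debian system, `dpkg --compare-versions` is the authoritative check.

```python
from functools import cmp_to_key

DIGITS = "0123456789"


def _order(c: str) -> int:
    """Sort weight of one character: '~' lowest, then end/digit, letters, others."""
    if c == "" or c in DIGITS:
        return 0
    if c == "~":
        return -1
    if c.isalpha():
        return ord(c)
    return ord(c) + 256


def _compare_fragment(a: str, b: str) -> int:
    """Compare two upstream or revision strings, alternating text and number runs."""
    i = j = 0
    while i < len(a) or j < len(b):
        # non-digit run: compare character by character using _order()
        while (i < len(a) and a[i] not in DIGITS) or (j < len(b) and b[j] not in DIGITS):
            ac = _order(a[i] if i < len(a) else "")
            bc = _order(b[j] if j < len(b) else "")
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # digit run: compare numerically, an empty run counts as 0
        start_a, start_b = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        na, nb = int(a[start_a:i] or "0"), int(b[start_b:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version: str) -> tuple[int, str, str]:
    """Split '[epoch:]upstream[-revision]' into its three parts."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1 if version a is lower than, equal to, or higher than b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_fragment(ua, ub) or _compare_fragment(ra, rb)


def highest_version(versions: list[str]) -> str:
    """Pick the highest version from a non-empty list."""
    return max(versions, key=cmp_to_key(compare_versions))
```

In the script, `max(pkg_versions)` could then be swapped for `highest_version(pkg_versions)`. For example, `highest_version(["6.1.9", "6.1.11"])` picks "6.1.11", where plain `max()` picks "6.1.9".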