Aldaviva / RaspberryPiDotnetRepository

🥧 APT repo of .NET runtime and SDK binary ARM DEB packages for Raspberry Pi OS
https://raspbian.aldaviva.com/addrepo.sh
Apache License 2.0

Store authoritative copy of packages on Azure Blob Storage instead of self-hosted web server for faster uploads to CDN #5

Closed Aldaviva closed 2 months ago

Aldaviva commented 7 months ago

Currently, the network architecture of this repository is Raspberry Pi → Azure CDN → self-hosted origin web server thanks to #2.

A potential improvement is to move the authoritative source of the repo files from the self-hosted web server to Azure Blob Storage (like S3), which has a very fast connection to Azure CDN. Estimated additional hosting costs are about US$0.16/month. The resulting architecture would be Raspberry Pi → Azure CDN → Azure Blob Storage ← repo generator.

I've already created a blob storage account and container, and manually uploaded the most recent packages and metadata files for .NET 8.0.4.

Changes required to make repo generator program use blob storage directly

  1. [x] Assume no packages or metadata files are stored locally (easy)
  2. [x] Get rid of the most recently seen JSON file, we don't need that any more (easy)
  3. [x] Download current .NET release index JSON files from Microsoft (existing functionality)
  4. [x] Download and parse a new repo index JSON file from blob storage, handle missing (easy)
  5. [x] If the repo index JSON file was generated against up-to-date .NET and Debian versions, then stop (mostly done)
  6. [x] Generate each package locally that does not exist in blob storage already (mostly done)
  7. [x] Upload each new package to the blob storage container with the correct content-type value, capping the number of concurrent uploads using Dataflow (easy)
  8. [x] Generate updated package index files, including new and unchanged packages, and excluding outdated packages (mostly done)
  9. [x] Generate updated release index files based on updated package index files, and sign them (mostly done)
  10. [x] Upload updated package and release index files to blob storage, also with correct content-type headers (easy)
  11. [x] Generate and upload new repo index JSON file with all the packages that exist in the repo now, as well as the upstream .NET and Debian versions the repo was generated against (easy)
  12. [x] Delete outdated package files from blob storage (easy)
  13. [x] Purge CDN (existing functionality)
  14. [ ] Delete local temporary working files like upstream SDK downloads, packages, and any index files, unless configured otherwise (easy)
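
As a rough illustration of steps 5, 6, 7, and 12, here is a minimal Python sketch. It is not the repo generator's actual code. The `RepoIndex` fields, the function names, the content-type table, and the `upload_one` callback are all hypothetical, and `asyncio.Semaphore` stands in for the Dataflow-based concurrency cap mentioned in step 7.

```python
from __future__ import annotations

import asyncio
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoIndex:
    """Hypothetical shape of the repo index JSON file (steps 4, 5, and 11)."""
    dotnet_versions: frozenset[str]
    debian_versions: frozenset[str]
    packages: frozenset[str]  # blob names of the packages currently in the repo


def is_up_to_date(index, dotnet_versions, debian_versions):
    """Step 5: stop early if the repo was built against the same upstream versions.

    A missing index (None) always counts as out of date.
    """
    return (index is not None
            and index.dotnet_versions == frozenset(dotnet_versions)
            and index.debian_versions == frozenset(debian_versions))


def plan_sync(existing, wanted):
    """Steps 6 and 12: return (packages to generate and upload, packages to delete)."""
    existing, wanted = set(existing), set(wanted)
    return sorted(wanted - existing), sorted(existing - wanted)


# Assumed mapping; the content types the real program sends may differ.
CONTENT_TYPES = {
    ".deb": "application/vnd.debian.binary-package",
    ".json": "application/json",
    ".gz": "application/gzip",
}


def content_type_for(name):
    """Pick a content type from the file extension, with a generic fallback."""
    extension = os.path.splitext(name)[1].lower()
    return CONTENT_TYPES.get(extension, "application/octet-stream")


async def upload_all(names, upload_one, max_concurrent=4):
    """Step 7: upload every name with at most max_concurrent uploads in flight.

    upload_one(name, content_type) is a caller-supplied coroutine function,
    for example a thin wrapper around a blob storage client.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(name):
        async with semaphore:
            await upload_one(name, content_type_for(name))

    await asyncio.gather(*(guarded(name) for name in names))
```

With these pieces, a delta update would be roughly `to_upload, to_delete = plan_sync(index.packages, wanted)` followed by `asyncio.run(upload_all(to_upload, upload_one))`, with deletion of `to_delete` left until after the new index files are uploaded.
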

Aldaviva commented 7 months ago

Regular expression patterns to split package metadata control or index files into paragraphs, and to extract key-value pairs from each paragraph:

\n{2,}
/(?<key>[\w-]+): (?<value>.+?)(?:\n(?! )|$)/gs

Alternatively, I could just serialize everything into one big JSON or XML file, for example, so I don't have to write a control file parser.
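
For illustration, the two patterns above can be applied as in the following Python sketch. The function name `parse_control` and the sample data in the usage note are made up. Python writes named groups as `(?P<name>...)`, uses `re.DOTALL` for the `s` flag, and uses `finditer()` in place of the `g` flag.

```python
import re

# Paragraphs in a control or index file are separated by one or more blank lines.
PARAGRAPH_SEPARATOR = re.compile(r"\n{2,}")

# A field runs from "Key: " up to a newline that is not followed by a space
# (a continuation line), or up to the end of the paragraph.
FIELD = re.compile(r"(?P<key>[\w-]+): (?P<value>.+?)(?:\n(?! )|$)", re.DOTALL)


def parse_control(text):
    """Return one dict of field name to raw value per paragraph."""
    paragraphs = PARAGRAPH_SEPARATOR.split(text.strip())
    return [
        {match["key"]: match["value"] for match in FIELD.finditer(paragraph)}
        for paragraph in paragraphs
        if paragraph
    ]
```

Calling `parse_control("Package: dotnet-cli\nVersion: 8.0.7-0\n\nPackage: dotnet-runtime-8.0\nVersion: 8.0.7-0\n")` yields two dictionaries, one per package. Multi-line values keep their embedded newlines and leading spaces, and continuation lines that start with a tab instead of a space are not handled by this pattern.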

Aldaviva commented 7 months ago

Azure CDN (Classic) by Microsoft does not have CDN preloading, unlike the Edgio CDN, which I don't want to use because it's related to Verizon. Manual preloading by requesting each file would increase the billable traffic and take a long time. It would also probably cache each file only on the one CDN edge server closest to the preloading client instead of on all CDN servers, so Raspberry Pis in other regions probably wouldn't benefit from the preloading.

Aldaviva commented 7 months ago

Tutorial on accessing storage blobs programmatically

Aldaviva commented 3 months ago

Repo regenerated from scratch and uploaded to blob storage successfully. Pointed a Raspberry Pi 3 running Bookworm ARM32 at the test CDN host. It successfully updated an existing .NET Runtime 8 installation, installed the .NET 8 SDK, and compiled and ran a program.

Speeds seem to be much faster (tested over Ethernet):

Get:1 https://raspbian2.azureedge.net bookworm/main armhf dotnet-cli armhf 8.0.7-0 [25.2 kB]
Get:2 https://raspbian2.azureedge.net bookworm/main armhf dotnet-runtime-8.0 armhf 8.0.7-0 [29.3 MB]
Fetched 29.4 MB in 4s (6,788 kB/s)
Get:1 https://raspbian2.azureedge.net bookworm/main armhf aspnetcore-runtime-8.0 armhf 8.0.7-0 [10.6 MB]
Get:2 https://raspbian2.azureedge.net bookworm/main armhf dotnet-sdk-8.0 armhf 8.0.303-0 [178 MB]
Fetched 188 MB in 22s (8,734 kB/s)

Aldaviva commented 3 months ago

Need to see if this can successfully perform a delta update on an existing repository in Blob Storage for a new .NET version, like the 8.0.8 update that was released after the repo was last built.

Aldaviva commented 3 months ago

Uploaded partial repository regeneration to Blob Storage for .NET 8.0.8 update.

Migrated production CDN https://raspbian.aldaviva.com to point to Blob Storage instead of my on-premises server, so blob-backed repo is now live.

Aldaviva commented 2 months ago

Will address cleanup later in a different task.