accrescent / meta

Umbrella Accrescent issue tracker
6 stars 1 forks source link

Support delta updates #10

Open lberrymage opened 2 years ago

lberrymage commented 2 years ago

Although split APKs and compression greatly reduce download sizes, we can reduce them further by caching previous app downloads on the client and generating deltas to patch what's already downloaded. It doesn't really matter what format the deltas take as long as it can be feasibility implemented in both the client and server codebases. Ideally we should use a popular existing format with well-supported libraries for delta generation in Go (for the developer portal) and delta application in Kotlin/Java (for the client). However, since this is a pretty niche need I'm not opposed to writing/translating/maintaining a new delta generator/patcher for an existing format if need be or maybe using C bindings. Chromium's fork of bsdiff looks like an interesting candidate.

Considering how frequently Accrescent checks for updates and that it performs them without user interaction, we probably only want to generate deltas between the last release and the current release.

lberrymage commented 2 years ago

On second thought we might not need to cache previous downloads to patch against them. https://developer.android.com/reference/android/content/pm/ApplicationInfo#sourceDir and https://developer.android.com/reference/android/content/pm/ApplicationInfo#splitSourceDirs allow us to extract installed APKs as needed.

lberrymage commented 2 years ago

Here are some stats for bsdiff:

Sample patch sizes for Accrescent:

0.1.0_0.2.0.patch: 1078153 bytes
0.2.0_0.2.1.patch: 10343 bytes
0.2.1_0.2.2.patch: 427772 bytes
0.2.2_0.3.0.patch: 727669 bytes
0.3.0_0.3.1.patch: 1095496 bytes
0.3.1_0.4.0.patch: 802749 bytes

Avg. patch size: 690364 bytes
Avg. size savings compared to downloading next full update: 3454474 bytes (83.6% cut)

For an example with larger sizes:

Sample patch sizes for Organic Maps:

22021117_22021629.patch: 24134508 bytes
22021629_22021901.patch: 23853112 bytes
22021901_22032304.patch: 31934433 bytes
22032304_22042702.patch: 25466343 bytes
22042702_22052402.patch: 25233031 bytes
22052402_22053110.patch: 25239447 bytes

Avg. patch size: 25976812 bytes
Avg. size savings compared to downloading next full update: 33171765 bytes (56.1% cut)

These numbers aren't comparable across apps for obvious reasons, but they should give a rough idea of how useful this feature would really be.

lberrymage commented 2 years ago

Searching suggests the Play Store uses bsdiff for delta updates.

lberrymage commented 2 years ago

bsdiff is simple enough that we can feasibly rewrite it in Rust. A naive translation of https://github.com/mendsley/bsdiff is acceptable as an MVP but it may be desirable to implement the bsdiff 6 improvements outlined in https://www.daemonology.net/papers/thesis.pdf.

lberrymage commented 1 year ago

Note that only the bsdiff code itself, that is, the suffix sorting and patch generation logic, is feasible for us to re-implement in Rust. We will likely be forced to use C bindings to bzip2 for the compression portion of the algorithm, in which case delta patching should be done within an isolatedProcess, adhering to the rule of 2.

lberrymage commented 1 year ago

We most likely want to use zstd instead of bzip2 since bzip2 is largely unmaintained.

xtaltal commented 1 month ago

We most likely want to use zstd instead of bzip2 since bzip2 is largely unmaintained.

Maybe in the future compare a few compression algorithms at some point across all applications in the Accrescent store or maybe even other stores (F-Droid?) (would that be legal?). Bzip3 comes to mind. bsc.

lberrymage commented 1 month ago

A detailed comparison could be useful, yes. There shouldn't be a legal issue with using F-Droid apps for that kind of testing. I will say, though, that zstd is much more widely used, actively maintained, and actively tested than many of the alternatives, so I doubt much will beat it out from a security standpoint.

TheScreechingBagel commented 1 week ago

there's this rust brotli implementationthat's maintained by dropbox, published by servo(?) and seems to be relatively widely used

lberrymage commented 1 week ago

I've looked at it before, and it's definitely worth considering, especially since it was designed for strict seccomp filtering: https://dropbox.tech/infrastructure/lossless-compression-with-brotli.