veloren / Airshipper

The official Veloren launcher! This repository is a mirror. Please submit all PRs and issues on our GitLab page: https://gitlab.com/veloren/airshipper
https://veloren.net/download
GNU General Public License v3.0
257 stars 33 forks source link

Use extant algorithms to diff Veloren updates #151

Open Marie-Joseph opened 3 years ago

Marie-Joseph commented 3 years ago

Is your feature request related to a problem? Please describe. It's extremely annoying that I have to re-download the entirety of Veloren when updating through Airshipper because I have slow (if not metered) internet. (And until something like this is implemented, I'll probably switch back to Flatpak for this very reason.)

Describe the solution you'd like There are at least a few extant, well-tested, formally-verified algorithms for diff-ing binary data. Flatpak uses one, tarsnap uses one, and even many Windows programs use one. Airshipper should also use one. One option for making this work across all data being downloaded (I'm not sure exactly how Airshipper, well, ships files) is to make essentially a tarball, treat it as binary data, and diff versions that way, sending only what is needed (see, again, tarsnap and Flatpak). Any archive format ought to work just as well, and since this would be in Airshipper rather than the distribution medium, it would be as cross-platform as Airshipper itself. I keep mentioning tarsnap and Flatpak because, firstly, tarsnap uses an algorithm developed in academia (it got its creator a PhD) and is openly-licensed with the source code available - from what I understand, it's the best one around today. Flatpak also uses a pretty good algorithm and is an open project. This means they can be studied for implementation details. Additionally, wget has the ability to resume downloads of arbitrary data, suggesting it uses something like this as well and thus could be studied. Finally, Debian's Jigdo has similar capabilities, although from what I can tell it uses older techniques because it relies on things like an extra database file and a template, which at least wget does not.

Describe alternatives you've considered This is the standard way to handle the problem of updating software over the internet.

Additional context This would help with both #63 (saves bandwith by default) and #125 (the download-so-far can be part of the version diff'd with the server).

Songtronix commented 3 years ago

make essentially a tarball, treat it as binary data, and diff versions that way

so you got to keep two versions of the game? One as tarball you can apply the patches to whenever there's an update (note: this requires sequentially applying patches since you last updated the game or entirely redownloading. Alternatively creating sort of "superpatches" which cover multiple versions but bloat the server dramatically.) and the other extracted for actual gameplay. Otherwise you would be required to download the tarball regardless anyway. As the game grows which will surely happen I don't think you want to store double the size needed on the client.

Currently I implement a file based approach as described in #63 and create an manifest which compress pretty okay and contains all files with their expected hashes such as the client can download (in parallel) all mismatching files. Downside might be that downloading a lot of small files is less efficient than one big file which can saturate your bandwidth.

Either way there's a trade off to make and as far as I see there are not any good ressources nor crates available for this to be implemented fast and easy. My time is really limited at the moment and I cannot spend hours of research to reimplement binary patching correctly and fault-resistant. So if to be implemented it will be after 0.5.

This is the standard way to handle the problem of updating software over the internet.

source? Would love to read up on standard ways of updating software as so far everyone I've seen has their own variants and often closed source like steam do not expose that much information.

Please correct me if I understood something wrong.

Songtronix commented 3 years ago

I've been made aware of https://crates.io/crates/bita which sort of is like the approach mentioned here. I will experiment with both approaches and see what is currently the best decision to move forward with.

Kyc0k3a60pa commented 7 months ago

I understand correctly. Is this case not implemented today?