beyond-all-reason / spring

A powerful free cross-platform RTS game engine
https://beyond-all-reason.github.io/spring/

Investigate alternative Rapid on-disk storage format #1183

Open p2004a opened 8 months ago

p2004a commented 8 months ago

Problem

The Rapid format is serving us relatively well because it provides good support for incremental updates.

The major disadvantage of the current Rapid on-disk storage format is that it makes game loading rather slow:

  1. All files are compressed using gzip

    Currently in BAR the load cost attributable to gzip is ~2s, based on my tests from some time ago: https://discord.com/channels/549281623154229250/724924957074915358/1132367444837683240

  2. Every single archive file is stored on disk separately

    BAR has over 10,000 files. All the syscalls needed to open and read them add high overhead, and it's especially bad on Windows. BAR has workarounds that open and close all pool files in the lobby to reduce load times by prewarming OS caches and triggering antivirus on-open scans ahead of time.

This issue is to investigate more performant solutions that still offer good support for incremental updates.

Compression

For the compression, zstd and lz4 are great options. zstd has compression ratios very similar to gzip, but much faster decompression. lz4 decompression is on the order of GB/s, so it's close to transparent, and that makes it more feasible for software like pr-downloader to re-compress objects on the fly while downloading, without changing the content distribution format.
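To make the "re-compress on the fly" idea concrete, here is a minimal sketch of what a downloader could do: decode the gzip stream chunk by chunk and feed the result straight into another codec before writing to disk. The function names are hypothetical and this is not how pr-downloader works today. To keep the example standard-library only, fast deflate stands in for lz4/zstd; the assumption is that a real lz4 or zstd binding could be plugged in if it exposes the same `compress()`/`flush()` shape.

```python
import gzip
import zlib
from typing import Callable, Iterable, Iterator


def recompress_gzip_stream(
    gzip_chunks: Iterable[bytes],
    make_compressor: Callable[[], object],
) -> Iterator[bytes]:
    """Decode a gzip stream incrementally and re-encode it with another codec.

    make_compressor must return an object with compress(bytes) and flush()
    methods, like the one returned by zlib.compressobj().
    """
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(zlib.MAX_WBITS | 16)
    encoder = make_compressor()
    for chunk in gzip_chunks:
        raw = decoder.decompress(chunk)
        if raw:
            out = encoder.compress(raw)
            if out:
                yield out
    # Drain whatever the decoder still buffers, then finish the output stream.
    tail = decoder.flush()
    if tail:
        out = encoder.compress(tail)
        if out:
            yield out
    yield encoder.flush()


def fast_deflate():
    # Stand-in codec (assumption): level-1 deflate instead of lz4/zstd.
    return zlib.compressobj(1)


def demo() -> bool:
    original = b"return { unitname = 'example' }\n" * 4096
    downloaded = gzip.compress(original)  # what the server would send
    # Simulate network chunks arriving one at a time.
    chunks = [downloaded[i:i + 64] for i in range(0, len(downloaded), 64)]
    stored = b"".join(recompress_gzip_stream(chunks, fast_deflate))
    return zlib.decompress(stored) == original
```

The distribution side stays gzip in this sketch; only the local representation changes, which is the property described above.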

Many small files

For the many small files issue, we can investigate using an embedded key-value store like LevelDB, RocksDB, or LMDB. By storing the files in an embedded database, we get optimized storage access while keeping fully incremental changes, just like we offer at the moment.

LMDB focuses on read performance and supports concurrent access from multiple independent processes out of the box (pretty rare for embedded databases, and important for the existing API usage and engine <-> pr-downloader interaction). It also has great platform support, a small code base, and modern C++ bindings if we want them.
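As a rough illustration of how incremental updates carry over to a key-value layout, the sketch below stores each file once under its content hash in a single database file. Everything here is an assumption for illustration: Python's built-in `sqlite3` is swapped in for LMDB so the example runs without extra dependencies, SHA-256 is an arbitrary choice of hash, and `BlobStore` and `apply_version` are hypothetical names.

```python
import hashlib
import sqlite3
from typing import Dict


class BlobStore:
    """Content-addressed blob store kept in one database file.

    sqlite3 is a stand-in for an embedded key-value store such as LMDB.
    """

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS blobs "
            "(key TEXT PRIMARY KEY, data BLOB NOT NULL)"
        )

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so identical files dedupe.
        key = hashlib.sha256(data).hexdigest()
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT OR IGNORE INTO blobs VALUES (?, ?)", (key, data)
            )
        return key

    def get(self, key: str) -> bytes:
        row = self.db.execute(
            "SELECT data FROM blobs WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return bytes(row[0])

    def count(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM blobs").fetchone()[0]


def apply_version(store: BlobStore, files: Dict[str, bytes]) -> Dict[str, str]:
    """Store one game version; return its manifest (archive path -> key)."""
    return {name: store.put(data) for name, data in files.items()}
```

Applying a second version that changes one file out of two adds exactly one new blob, so an update only has to fetch and write what actually changed, and a full load touches one file on disk instead of thousands.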

The main unknowns in this approach are:

sprunk commented 8 months ago

Perhaps some sort of "clear pool and consolidate into a single .sdz archive" option on top of the existing Rapid? BAR wants incremental updates because it has no release cycle, but maybe this won't be a worry long term because it should eventually have one. And then a release is a perfect moment to consolidate the Rapid pool.

ZK does something equivalent to the above - when there's a release we build an .sdz archive via infra and distribute it via Steam, but this wouldn't necessarily be needed if the archive could be built locally by clients from pool.
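A minimal sketch of what building the archive locally from the pool could look like, relying on .sdz being a zip container and on pool objects being gzip-compressed as described in the issue. The manifest shape (archive path -> pool key) and the `read_pool_object` callback are hypothetical; the real Rapid index format and any extra requirements the engine places on an .sdz are not covered here.

```python
import gzip
import io
import zipfile
from typing import Callable, Dict


def consolidate_to_sdz(
    manifest: Dict[str, str],
    read_pool_object: Callable[[str], bytes],
) -> bytes:
    """Build a zip (.sdz) archive from gzip-compressed pool objects.

    manifest maps archive-relative paths to pool keys (hypothetical shape);
    read_pool_object returns the gzip-compressed bytes for a pool key.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Sorted order keeps the output layout stable between runs.
        for archive_path in sorted(manifest):
            raw = gzip.decompress(read_pool_object(manifest[archive_path]))
            archive.writestr(archive_path, raw)
    return buffer.getvalue()
```

Usage would be something like `consolidate_to_sdz(manifest, pool.__getitem__)` with an in-memory `pool` dict, or a callback that reads the object from the pool directory. After consolidation the loader opens one file instead of one per pool object.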

p2004a commented 8 months ago

Interesting idea. Such local consolidation into an sdz/sd7 archive could be some sort of middle ground that's also worth considering. Some observations:

At the moment I would personally still lean slightly more towards trying out the embedded database, as it might end up being a simpler architecture and simpler code overall, but it would need to be tried out to confirm.