Sewer56 / sewer56-archives-nx

[WIP / Rust Port] High Performance Archive Format for Mod Assets
Other
0 stars 0 forks source link

V2 ToC Entry Format (Small File Entry Based) #7

Open Sewer56 opened 2 months ago

Sewer56 commented 2 months ago

Motivations

When opening an Nx file, the file is (by definition) supposed to be at least 4K; but ideally, we want to avoid any further reads beyond the 4K margin for a typical mod if possible; therefore minimizing the file size to fit inside 4K is always desireable.

This is an alternative to

Solution

Create a version of the ToC header that is specialised for smaller archives.

Example:

This gives us a header format with max 256 files, provided chunk size is under 1MiB and files themselves are under 256MiB. This is a very common configuration.

Additional context

We want to get all file info in a single 4K header read,

Read Speeds of NVMe SSD (980 Pro)

This library is optimised in mind with smaller mods; most mods' TOC and header fits in within 4K.

On a 980 Pro (currently most popular NVMe), we observe the following latencies (QD 1):

---- Delimited For Margin of Error ----

Although benchmarks for random reads beyond 4K are hard to come by; this is a general pattern that seems to surface from the limited data I could gather. In any case, this is the explanation for header size choice.

String Pool Size

The size of the string pool for an archive of 255 files is approximately 1000 bytes; leaving us roughly 4080 bytes of budget for the entries and blocks. Bit less if considering user data section into account.

This should allow us to fit around 150 files, an improvement of around 43 from the existing format.

Sewer56 commented 2 months ago

Need one for Virtual FileSystems too, where chunk size may cap at 16K. We can do this by reusing the now unused bits past version (due to lower maximums) as additional version fields.