Sewer56 / sewer56-archives-nx

[WIP / Rust Port] High Performance Archive Format for Mod Assets

[Low Priority] V2 StringPool #3

Closed Sewer56 closed 2 months ago

Sewer56 commented 2 months ago

The idea is to store string lengths instead of using null-terminated strings.

The StringPool will start with a section of string lengths, 1 byte per string (so each string can be at most 255 bytes long). After those lengths come the strings themselves, without terminators.
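A minimal sketch of that layout, before any compression is applied. The function name `pack_length_first` is hypothetical and not taken from the repository, and the sketch assumes the reader learns the number of strings from somewhere outside the pool.

```rust
/// Packs strings as [len_0, len_1, ..., len_n-1][bytes_0][bytes_1]...
/// Hypothetical helper for illustration; not the crate's actual API.
/// Returns None if any string is longer than 255 bytes, because each
/// length has to fit in a single byte.
fn pack_length_first(strings: &[&str]) -> Option<Vec<u8>> {
    let total: usize = strings.iter().map(|s| s.len()).sum();
    let mut out = Vec::with_capacity(strings.len() + total);

    // Section 1: one length byte per string.
    for s in strings {
        if s.len() > u8::MAX as usize {
            return None;
        }
        out.push(s.len() as u8);
    }

    // Section 2: the raw string bytes, back to back, with no terminators.
    for s in strings {
        out.extend_from_slice(s.as_bytes());
    }

    Some(out)
}
```

For example, packing `"a.txt"` and `"data/b.bin"` would give the two length bytes `5` and `10`, followed by the 15 bytes `a.txtdata/b.bin`.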

This removes the need to scan for null terminators, which improves parse time.
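A rough sketch of what parsing could look like when the lengths are known up front: each string is sliced out directly instead of searching byte by byte for a terminator. The name `unpack_length_first`, the `count` parameter (assumed to come from elsewhere, such as a header) and the assumption that strings are UTF-8 are all illustrative and not confirmed by this issue.

```rust
/// Splits a length-first pool into string slices without scanning for terminators.
/// Hypothetical helper for illustration; not the crate's actual API.
/// `count` is the number of strings and is assumed to be known by the caller.
/// Returns None on truncated input or invalid UTF-8 instead of panicking.
fn unpack_length_first(pool: &[u8], count: usize) -> Option<Vec<&str>> {
    // The first `count` bytes are the lengths; everything after is string data.
    if pool.len() < count {
        return None;
    }
    let (lengths, mut data) = pool.split_at(count);

    let mut out = Vec::with_capacity(count);
    for &len in lengths {
        let len = len as usize;
        // Bounds check: a length must never point past the end of the data.
        if data.len() < len {
            return None;
        }
        let (bytes, rest) = data.split_at(len);
        out.push(std::str::from_utf8(bytes).ok()?);
        data = rest;
    }
    Some(out)
}
```

The bounds checks matter because the lengths come from untrusted input; without them a corrupted length byte would make `split_at` panic.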

By placing all the string lengths first, we can also improve compression efficiency compared with simple length-prefixed strings, where each length sits directly in front of its string.

Sewer56 commented 2 months ago

The new pool is V1, a.k.a. VPrefix. Benchmarks against the existing V0 pool:

create_string_pool_1000_V0
                        time:   [5.0506 ms 5.0615 ms 5.0756 ms]
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) high mild
  11 (11.00%) high severe

[create_string_pool_1000_V0] Packed size: 4566 bytes
unpack_string_pool_1000_V0
                        time:   [23.334 µs 23.365 µs 23.395 µs]

[unpack_string_pool_1000_V0] Unpacked size: 43388 bytes
create_string_pool_1000_V1
                        time:   [5.1446 ms 5.1455 ms 5.1465 ms]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

[create_string_pool_1000_V1] Packed size: 5120 bytes
unpack_string_pool_1000_V1
                        time:   [17.449 µs 17.456 µs 17.465 µs]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

[unpack_string_pool_1000_V1] Unpacked size: 43388 bytes
create_string_pool_2000_V0
                        time:   [13.418 ms 13.431 ms 13.447 ms]
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

[create_string_pool_2000_V0] Packed size: 8003 bytes
unpack_string_pool_2000_V0
                        time:   [45.611 µs 45.618 µs 45.625 µs]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

[unpack_string_pool_2000_V0] Unpacked size: 93838 bytes
create_string_pool_2000_V1
                        time:   [13.206 ms 13.212 ms 13.219 ms]
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

[create_string_pool_2000_V1] Packed size: 8795 bytes
unpack_string_pool_2000_V1
                        time:   [38.185 µs 38.193 µs 38.202 µs]
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

[unpack_string_pool_2000_V1] Unpacked size: 93838 bytes
create_string_pool_4000_V0
                        time:   [32.254 ms 32.280 ms 32.320 ms]
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  4 (4.00%) high severe

[create_string_pool_4000_V0] Packed size: 12802 bytes
unpack_string_pool_4000_V0
                        time:   [89.005 µs 89.116 µs 89.253 µs]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

[unpack_string_pool_4000_V0] Unpacked size: 200343 bytes
create_string_pool_4000_V1
                        time:   [31.440 ms 31.452 ms 31.465 ms]
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

[create_string_pool_4000_V1] Packed size: 14218 bytes
unpack_string_pool_4000_V1
                        time:   [67.717 µs 67.734 µs 67.750 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

[unpack_string_pool_4000_V1] Unpacked size: 200343 bytes
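To make the V0 and V1 numbers easier to compare, here is a small sketch that computes the relative differences. The values are copied from the output above, using the middle value of each reported unpack time interval and the packed sizes; the names `percent_change`, `RESULTS` and `summary_lines` are made up for this example.

```rust
/// Percentage change going from `v0` to `v1`.
/// Negative means V1 is smaller or faster, positive means larger or slower.
fn percent_change(v0: f64, v1: f64) -> f64 {
    (v1 - v0) / v0 * 100.0
}

/// (string count, V0 unpack µs, V1 unpack µs, V0 packed bytes, V1 packed bytes),
/// copied from the benchmark output above.
const RESULTS: [(u32, f64, f64, f64, f64); 3] = [
    (1000, 23.365, 17.456, 4566.0, 5120.0),
    (2000, 45.618, 38.193, 8003.0, 8795.0),
    (4000, 89.116, 67.734, 12802.0, 14218.0),
];

/// Builds one human-readable summary line per benchmark size.
fn summary_lines() -> Vec<String> {
    RESULTS
        .iter()
        .map(|&(n, t0, t1, s0, s1)| {
            format!(
                "{} strings: unpack time {:+.1}%, packed size {:+.1}%",
                n,
                percent_change(t0, t1),
                percent_change(s0, s1)
            )
        })
        .collect()
}
```

With these inputs, V1 unpacks roughly 16 to 25 percent faster than V0, while its packed output is roughly 10 to 12 percent larger.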

I'll keep the code around, but it will go unused.

Sewer56 commented 2 months ago

VPrefix is being removed in the commit after d02f912ef610ef95f997c060b8f0148fc8686253, because I don't want to spend time hardening code I won't use. Leaving unhardened parsing code in the codebase is a security risk.