dispatchrun / timecraft

The WebAssembly Time Machine
https://docs.timecraft.dev
GNU Affero General Public License v3.0

tarfs: use null-terminated strings for directory entries #198

Closed achille-roussel closed 1 year ago

achille-roussel commented 1 year ago

I'm opening this PR only because I wanted to measure the impact of the change; I don't think it's worth the number of unsafe operations it introduces. I'll close it for now, but we can bring it back if we ever have a stronger use case for it.


Based on #197, this PR represents directory entry names as null-terminated strings to reduce their memory footprint. A Go string is stored as a pointer and a length, which takes 16 bytes on 64-bit architectures. With C-style strings we keep only the 8-byte pointer and need one extra byte to mark the end of the string, reducing the overhead to 9 bytes per string.

To ensure that we are banking the savings, a simple allocator packs small strings next to each other in memory and avoids the overhead of metadata maintained by the Go memory allocator.

The optimization reduces the memory footprint of the Alpine file system by 7% compared to the parent branch, for a total reduction of 65% from the first implementation (parent branch first, this PR second):

=== RUN   TestAlpine
    tarfs_test.go:77: Size     = 5577728
    tarfs_test.go:78: Memsize  = 52761 (0.95%)
    tarfs_test.go:79: Filesize = 5290340 (94.85%)
=== RUN   TestAlpine
    tarfs_test.go:77: Size     = 5577728
    tarfs_test.go:78: Memsize  = 49149 (0.88%)
    tarfs_test.go:79: Filesize = 5290340 (94.85%)
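The 7% figure can be checked from the two `Memsize` values, assuming the first run is the parent branch and the second is this PR. The `reduction` helper below is a hypothetical name used only for this check.

```go
package main

import "fmt"

// reduction returns the relative drop from before to after, in percent.
func reduction(before, after int) float64 {
	return 100 * float64(before-after) / float64(before)
}

func main() {
	// Memsize values copied from the two TestAlpine runs above.
	fmt.Printf("%.2f%%\n", reduction(52761, 49149))
}
```

The result is a little under 7%, which matches the rounded figure in the description.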