containers / storage

Container Storage Library
Apache License 2.0

chunked: store cache as binary and use a bloom filter #1870

Closed giuseppe closed 2 months ago

giuseppe commented 3 months ago

The bloom filter itself is useful to reduce page faults with the mmap'ed cache files, as it reduces lookups.
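To illustrate why a bloom filter cuts down on lookups, here is a minimal sketch in Go. It is not the code from this PR: the `bloomFilter` and `cache` types, the FNV-based double hashing, and the map standing in for the mmap'ed cache file are all assumptions made for the example. The point is only that a negative answer from the filter lets the caller skip touching the backing file at all.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloomFilter is a hypothetical minimal bloom filter, not the PR's type.
type bloomFilter struct {
	bits []uint64
	k    uint32 // number of bit positions set per key
}

func newBloomFilter(nBits int, k uint32) *bloomFilter {
	return &bloomFilter{bits: make([]uint64, (nBits+63)/64), k: k}
}

// positions derives k bit positions from two FNV hashes (double hashing).
func (b *bloomFilter) positions(key []byte) []uint64 {
	h1 := fnv.New64a()
	h1.Write(key)
	base := h1.Sum64()
	h2 := fnv.New64()
	h2.Write(key)
	step := h2.Sum64() | 1 // force an odd step so positions differ
	n := uint64(len(b.bits)) * 64
	out := make([]uint64, b.k)
	for i := uint32(0); i < b.k; i++ {
		out[i] = (base + uint64(i)*step) % n
	}
	return out
}

func (b *bloomFilter) add(key []byte) {
	for _, p := range b.positions(key) {
		b.bits[p/64] |= 1 << (p % 64)
	}
}

// maybeContains returns false only if the key was definitely never added.
// A true result may be a false positive.
func (b *bloomFilter) maybeContains(key []byte) bool {
	for _, p := range b.positions(key) {
		if b.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

// cache pairs the filter with a map that stands in for the mmap'ed file.
type cache struct {
	filter  *bloomFilter
	entries map[string]int64
	probes  int // how often the (expensive) backing store was touched
}

func (c *cache) lookup(digest string) (int64, bool) {
	if !c.filter.maybeContains([]byte(digest)) {
		return 0, false // definitely absent: no page of the file is touched
	}
	c.probes++
	off, ok := c.entries[digest]
	return off, ok
}

func main() {
	c := &cache{filter: newBloomFilter(1024, 3), entries: map[string]int64{}}
	c.entries["sha256:aaa"] = 42
	c.filter.add([]byte("sha256:aaa"))

	off, ok := c.lookup("sha256:aaa")
	fmt.Println("present key:", off, ok)
	_, ok = c.lookup("sha256:bbb")
	fmt.Println("absent key found:", ok, "backing-store probes:", c.probes)
}
```

A bloom filter never produces false negatives, so skipping the lookup on a negative answer is always safe; false positives only cost one extra probe of the backing store.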

Storing the cache file in a binary format reduces the file size considerably. With the quay.io/giuseppe/zstd-chunked:fedora-{38,39,40}{,-updated} images I see:

before:

# find -name '=Y2h1bmtlZC1tYW5pZmVzdC1jYWNoZQ==' -exec stat -c '%s' \{\} \;
2547644
2575163
2547644
2476816
2462835
2533346

after:

# find -name '=Y2h1bmtlZC1tYW5pZmVzdC1jYWNoZQ==' -exec stat -c '%s' \{\} \;
1319206
1312332
1275803
1270629
1297565

so it is a ~50% size reduction
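The figure can be checked from the sizes listed above. The following Go sketch compares the mean file size before and after; comparing means is an assumption made here because the two listings contain a different number of values (six before, five after), so the files cannot be paired one to one. The helper names `mean` and `reduction` are hypothetical.

```go
package main

import "fmt"

// mean returns the arithmetic mean of the given sizes in bytes.
func mean(sizes []int64) float64 {
	var sum int64
	for _, s := range sizes {
		sum += s
	}
	return float64(sum) / float64(len(sizes))
}

// reduction returns the relative drop in mean size, e.g. 0.5 for 50%.
func reduction(before, after []int64) float64 {
	return 1 - mean(after)/mean(before)
}

func main() {
	// Sizes in bytes, copied from the listings above.
	before := []int64{2547644, 2575163, 2547644, 2476816, 2462835, 2533346}
	after := []int64{1319206, 1312332, 1275803, 1270629, 1297565}

	fmt.Printf("mean before: %.0f bytes\n", mean(before))
	fmt.Printf("mean after:  %.0f bytes\n", mean(after))
	fmt.Printf("reduction:   %.1f%%\n", reduction(before, after)*100)
}
```

The result lands a little under one half, consistent with the rounded ~50% quoted above.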

openshift-ci[bot] commented 3 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/containers/storage/blob/main/OWNERS)~~ [giuseppe]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

giuseppe commented 3 months ago

@kolyshkin @mtrmac @rhatdan some more improvements to the cache file

giuseppe commented 2 months ago

the PR is ready for review

rhatdan commented 2 months ago

@mtrmac needs another review.

giuseppe commented 2 months ago

I've fixed your comments, except https://github.com/containers/storage/pull/1870#discussion_r1556134916. What would you like me to do here?

mtrmac commented 2 months ago

> I've fixed your comments, except #1870 (comment). What would you like me to do here?

https://github.com/containers/storage/pull/1870/files#r1557856169 , or perhaps I’m missing something.

rhatdan commented 2 months ago

@giuseppe This is waiting on you now?

giuseppe commented 2 months ago

thanks @mtrmac and @kolyshkin. I've addressed your last comments and pushed a new version

mtrmac commented 2 months ago

/lgtm

Thanks!