holiman / billy

Very simple datastore
BSD 3-Clause "New" or "Revised" License

shelf: use files capped to certain size #19

Open holiman opened 1 year ago

holiman commented 1 year ago

This PR adds 'cappedFile' support. A cappedFile behaves like a regular os.File, but it actually maps to a set of files, each capped to a max size. By swapping the regular files for cappedFiles as backing for the shelves, billy can be made to respect max file sizes in the filesystem (e.g. 4GB in FAT32).
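
As a rough illustration of the idea (not the code in this PR), the sketch below shows how a logical offset in one virtual file could be mapped onto a set of size-capped backing files. The names `locate`, `split`, `chunk` and `capSize` are hypothetical and chosen only for this example; the actual cappedFile implementation may work differently.

```go
package main

import "fmt"

// chunk describes one piece of a read or write that lands in a single
// backing file. Hypothetical type, for illustration only.
type chunk struct {
	File   int   // index of the backing file
	Offset int64 // offset inside that backing file
	Length int64 // number of bytes handled in that file
}

// locate maps a logical offset in the virtual file to the index of the
// backing file and the offset inside it, given a per-file cap.
func locate(offset, capSize int64) (fileIndex int, inner int64) {
	return int(offset / capSize), offset % capSize
}

// split breaks an operation of n bytes at logical offset off into
// per-file chunks, so that no single chunk crosses a file boundary.
func split(off, n, capSize int64) []chunk {
	var chunks []chunk
	for n > 0 {
		idx, inner := locate(off, capSize)
		length := capSize - inner // bytes left in this backing file
		if length > n {
			length = n
		}
		chunks = append(chunks, chunk{File: idx, Offset: inner, Length: length})
		off += length
		n -= length
	}
	return chunks
}

func main() {
	// A 10-byte operation at offset 95 with a cap of 100 bytes per file
	// spans two files: 5 bytes at the end of file 0, 5 at the start of file 1.
	fmt.Println(split(95, 10, 100))
}
```

An operation that yields more than one chunk is the case where the data is spread across files.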

The cappedFile is not concurrency-safe for spread-out read/writes. That is, if the data to be read crosses file boundaries, then simultaneous read and write may cause data to be corrupted.

However, this can easily be avoided in the upper layer: the shelf can just ensure that the cappedFile limit is a multiple of the shelf size. So instead of using 2 * 1024 * 1024 = 2097152, for shelf-size 10 it could use 2097150. If it did, then the writes at offsets 2097140, 2097150 and 2097160 would each land entirely within one file, so no write crosses a file boundary.
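
The arithmetic above can be checked with a small sketch. It assumes each write covers exactly one slot of `slotSize` bytes (what the comment calls the shelf size); `alignedCap` and `crossesBoundary` are hypothetical helpers written for this example and are not part of billy.

```go
package main

import "fmt"

// alignedCap rounds maxSize down to the nearest multiple of slotSize.
func alignedCap(maxSize, slotSize int64) int64 {
	return maxSize - maxSize%slotSize
}

// crossesBoundary reports whether a write of slotSize bytes starting at
// offset would span two backing files when each file holds capSize bytes.
func crossesBoundary(offset, slotSize, capSize int64) bool {
	first := offset / capSize
	last := (offset + slotSize - 1) / capSize
	return first != last
}

func main() {
	const slotSize = 10
	const rawCap = 2 * 1024 * 1024 // 2097152, not a multiple of 10

	capSize := alignedCap(rawCap, slotSize)
	fmt.Println("aligned cap:", capSize)

	// Compare the unaligned and the aligned cap for the offsets
	// mentioned in the comment.
	for _, off := range []int64{2097140, 2097150, 2097160} {
		fmt.Println(off,
			"crosses with raw cap:", crossesBoundary(off, slotSize, rawCap),
			"crosses with aligned cap:", crossesBoundary(off, slotSize, capSize))
	}
}
```

With the unaligned cap of 2097152, the write at offset 2097150 straddles two files; with the cap rounded down to 2097150, none of the three writes does.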

holiman commented 1 year ago

cc @karalabe

codecov[bot] commented 1 year ago

Codecov Report

Merging #19 (a920bfe) into main (1c7e68d) will increase coverage by 0.28%. The diff coverage is 89.89%.

@@            Coverage Diff             @@
##             main      #19      +/-   ##
==========================================
+ Coverage   87.08%   87.37%   +0.28%     
==========================================
  Files           5        6       +1     
  Lines         395      483      +88     
==========================================
+ Hits          344      422      +78     
- Misses         36       42       +6     
- Partials       15       19       +4     

holiman commented 1 year ago

Changed the behaviour now, but there's something not right still

[user@work billyfuzz]$ go run . 
Opened ./
1 ops, 2 keys active
Reopening db, ops 1386, keys 703
Opened ./
Reopening db, ops 2288, keys 1019
Opened ./
2289 ops, 1018 keys active
panic: bad index: slot 245, slotsize 1048576, EOF

goroutine 1 [running]:
main.doFuzz(0xc00011c780?)
        /home/user/go/src/github.com/holiman/billy/cmd/billyfuzz/main.go:133 +0xc5b
github.com/urfave/cli/v2.(*Command).Run(0xc00011c780, 0xc00002e380, {0xc000014050, 0x1, 0x1})
        /home/user/go/pkg/mod/github.com/urfave/cli/v2@v2.24.1/command.go:271 +0x9eb
github.com/urfave/cli/v2.(*App).RunContext(0xc00017c000, {0x5ffbd8?, 0xc000018110}, {0xc000014050, 0x1, 0x1})
        /home/user/go/pkg/mod/github.com/urfave/cli/v2@v2.24.1/app.go:333 +0x665
github.com/urfave/cli/v2.(*App).Run(...)
        /home/user/go/pkg/mod/github.com/urfave/cli/v2@v2.24.1/app.go:310
main.main()
        /home/user/go/src/github.com/holiman/billy/cmd/billyfuzz/main.go:51 +0x1c5

holiman commented 1 year ago

> but there's something not right still

fixed now

holiman commented 1 year ago

@karalabe I think the current implementation is the "most sane". Want to have a review chat about this at some point?