archiecobbs / s3backer

FUSE/NBD single file backing store via Amazon S3

Choosing --blockSize #170

Closed: HaleTom closed this issue 2 years ago

HaleTom commented 2 years ago

What are the recommendations / heuristics for selecting the value for --blockSize?

Immutability

Can you please confirm that once a --blockSize is set, it cannot be changed?

Read amplification

I'm guessing blockSize will need to be <= the sector size of the device or filesystem placed on top, so as to avoid reading a whole larger block just to retrieve one small part of it (the sector).
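For a concrete worst case (my numbers, assuming a single 512-byte sector read that misses any cache):

```sh
# Hypothetical worst-case read amplification: fetching a whole 1M block
# to serve one 512-byte sector (a warm block cache would avoid this)
echo "$(( 1048576 / 512 ))x amplification"    # prints: 2048x amplification
```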

Write amplification

Same concern as read amplification, but with extra overhead, since the unchanged data in the block also needs to be written back (a read-modify-write cycle).

Used block bitmap

With --listBlocks, the load time will be longer when more blocks need to be listed to represent the same amount of data.
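To put rough numbers on that (a sketch under my own assumptions: S3 LIST returns at most 1000 keys per request, and every block of a 1 TiB device is populated):

```sh
SIZE=$(( 1 << 40 ))                    # 1 TiB device (illustrative)
for BS in 4096 1048576; do
    BLOCKS=$(( SIZE / BS ))
    echo "blockSize=$BS: $BLOCKS blocks, ~$(( (BLOCKS + 999) / 1000 )) LIST requests"
done
```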

Overheads

Writing each block has a certain amount of S3 API overhead. Larger blocks presumably reduce this overhead percentage, but that is not shown in #158 (see below).
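As a crude model of how that per-request overhead amortizes (the ~50 ms per-request latency and ~100 MB/s transfer rate below are assumed, illustrative figures, not measurements):

```sh
for BS in 4096 65536 1048576; do
    awk -v bs="$BS" 'BEGIN {
        t = 0.050 + bs / 100e6         # seconds per block: latency + transfer
        printf "blockSize=%7d: ~%6.2f MB/s effective\n", bs, bs / t / 1e6
    }'
done
```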

Write speed and interactions with cache size

Presumably, the cache should be large enough to hold as many blocks as the desired parallelism of simultaneous block writes.
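Something like this is what I have in mind (a sketch only; --blockCacheSize and --blockCacheThreads as I read them from the man page, so please correct me if I've misread them):

```sh
# Hypothetical invocation: a 2000-block cache comfortably covers
# 20 parallel write-back threads at --blockSize=1M
s3backer --blockSize=1M --size=1T \
    --blockCacheSize=2000 --blockCacheThreads=20 \
    mybucket /mnt/s3backer
```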

Prior art:


archiecobbs commented 2 years ago

Can you please confirm that once a --blockSize is set, it cannot be changed?

Correct.

Read amplification / Write amplification

Can't really make any blanket statements because it all depends on lots of details (upper filesystem, kernel behavior, caching, etc). But in general (a) if the kernel reads and writes data in --blockSize chunks that's more efficient, but also (b) there is some fixed per-block overhead when reading/writing to/from S3.

Am I right in thinking that this is a bitmap, with one bit per block (used / not used)?

Yes.
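In concrete terms, one bit per block means the bitmap needs size / blockSize / 8 bytes; for example (taking a 16 TiB device):

```sh
SIZE=$(( 16 << 40 ))                   # 16 TiB device (illustrative)
for BS in 4096 1048576; do
    echo "blockSize=$BS -> bitmap $(( SIZE / BS / 8 )) bytes"
done
# 4K blocks -> 512 MiB of bitmap; 1M blocks -> 2 MiB
```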

Writing each block has a certain amount of S3 API overhead. Larger blocks presumably reduce this overhead percentage, but that is not shown in https://github.com/archiecobbs/s3backer/issues/158 (see below).

Regarding issue #158, it is still unclear what is actually happening there.

Presumably, the cache should be large enough so that it can contain sufficient blocks to allow for the desired parallelism of blocks being written simultaneously.

Yes, that makes sense.

HaleTom commented 2 years ago

Cheers for the feedback!

I'll update this if/when I generate performance figures. For now, the man page suggests 1M as an example block size.

HaleTom commented 2 years ago

A note from the manual about >16T devices that isn't shown beside --blockSize:

For cache space efficiency, s3backer uses 32 bit values to index individual blocks. Therefore, the block size must be increased beyond the default 4K when very large filesystems (greater than 16 terabytes) are created.

Glad I clocked that one, as I'm going for 1P. Would you consider moving both (currently separated) BUGS notes next to --blockSize, where most people will read them?
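For anyone else sizing this, the back-of-envelope check (my arithmetic, from the 2^32 index limit quoted above):

```sh
echo $(( (1 << 32) * 4096 ))       # 2^32 blocks x 4K = 17592186044416 bytes = 16 TiB ceiling
echo $(( (1 << 50) / (1 << 32) ))  # 1 PiB / 2^32 blocks = 262144 bytes = 256 KiB minimum blockSize
```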

HaleTom commented 2 years ago

Oh, I just noticed your first BUGS paragraph is duplicated... if only every bug was as easy to "remove" :)

archiecobbs commented 2 years ago

Oh, I just noticed your first BUGS paragraph is duplicated

Not sure what you mean... what exactly are you seeing is duplicated?

HaleTom commented 2 years ago

Not sure what you mean... what exactly are you seeing is duplicated?

Sorry, PEBCAK: a display glitch when seeking in less (or maybe just double-sightedness?)

archiecobbs commented 2 years ago

OK no problem. Closing this issue but feel free to reopen if you have more suggestions. Thanks.