archiecobbs / s3backer

FUSE/NBD single file backing store via Amazon S3

cache file expanded by fractional block size resulting in error at startup #222

Closed: rbfnet closed this issue 3 months ago

rbfnet commented 4 months ago

I often (but certainly not every time) see this error at startup:

`ERROR: cache file ' ... ' is truncated (has size 72679424 < 73502720 bytes)`

The sizes vary, but when it happens, the second number is always a valid cache file size (the file and slot headers plus some integer multiple of the block size), and the first number is an invalid cache file size (the file and slot headers plus some non-integral multiple of the block size).
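For concreteness, the validity check the startup code is effectively performing can be sketched like this; the header and block sizes below are hypothetical placeholders for illustration, not s3backer's actual layout constants:

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical layout constants, for illustration only -- not
// s3backer's actual header or block sizes.
#define HEADER_SIZE 4096u       // file header plus slot headers
#define BLOCK_SIZE  1048576u    // cache block size

// A cache file size is valid iff it covers the headers plus an
// integral number of blocks; anything else looks like truncation
// (or, per this report, a fractional extension).
static int cache_size_is_valid(uint64_t file_size)
{
    if (file_size < HEADER_SIZE)
        return 0;
    return (file_size - HEADER_SIZE) % BLOCK_SIZE == 0;
}
```

Under that model, both numbers in the error message differ from a valid size only in whether the remainder past the headers divides evenly by the block size.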

(I can recover by using fallocate to extend the file to the second number; of course, this effectively pads it with 0x00 bytes, but I've never lost data, so my assumption is that the "missing" data was all zeros to begin with, which is consistent with what I suggest below.)

I do not have a test case, but following the code paths, I think what is happening is this: if s3b_dcache_write_block_falloc is called with a fractional block (i.e., len is not a multiple of the block size) and it ends up extending the cache file, it extends the file by that fractional block rather than by a full block. (s3b_dcache_write_block_simple probably does the same thing, but my system has, and uses, the falloc path.)

I expect this occurs when a never-before-used block is fractionally written while the cache file is not at maximum size (so it gets extended to hold the block that just got partially written).

This won't be an issue until the next restart, at which time the initialization will fail because the cache file will not be a valid size. (Actually, it might be an issue if there is an attempt to read the portion of the block that is supposedly in the cache, but beyond the end of the cache file ... I haven't completely thought about that ... but in practice, I've never seen that happen, because filesystems generally do not read portions of the disk to which they have not written.)

If my theory is correct, a fix, although perhaps not the right fix, would be to modify s3b_dcache_write_block_falloc as follows:

(a) At the start, store the file size:

```c
u_int original_file_size = priv->file_size;
```

(b) At the end, adjust the file size as needed to ensure that any increase is a full block:

```c
u_int growth = priv->file_size - original_file_size;
u_int additional = ROUNDUP2(growth, priv->block_size) - growth;
if (additional > 0) {
    if (fallocate(priv->fd, 0, priv->file_size, additional) != 0)
        return errno;
    priv->file_size += additional;
}
```

(Something similar would be needed for s3b_dcache_write_block_simple, I think, but presumably you'd want to use pwrite() instead of fallocate().)

The above is untested; I mostly put it in this issue just to clearly illustrate what I think this problem is. There are probably fixes cleaner than the brute force I show above.
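The arithmetic in (b) just pads any growth out to the next block boundary. A standalone sketch of that rounding, assuming the usual round-up-to-a-power-of-two-multiple idiom for ROUNDUP2 (dcache.c's own definition may differ):

```c
#include <assert.h>

// Common round-up idiom for a power-of-two y; an assumption here,
// not copied from s3backer's headers.
#define ROUNDUP2(x, y) (((x) + (y) - 1) & ~((y) - 1))

// How many bytes of padding turn an arbitrary growth into a whole
// number of blocks.
static unsigned padding_needed(unsigned growth, unsigned block_size)
{
    return ROUNDUP2(growth, block_size) - growth;
}
```

For a growth that is already block-aligned the padding is zero, so the adjustment is a no-op on the common path and only fires after a fractional extension.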

rbfnet commented 4 months ago

This is probably a better fix. It builds, but I still don't have a reliable reproduction to test it against.


```diff
--- dcache.c.orig    2024-04-23 09:55:27.000000000 -0500
+++ dcache.c         2024-05-16 10:33:42.131446830 -0500
@@ -580,7 +580,8 @@
 int
 s3b_dcache_write_block_falloc(struct s3b_dcache *priv, u_int dslot, const char *src, const u_int doff, u_int len)
 {
-    const off_t orig_off = DATA_OFFSET(priv, dslot) + doff;
+    const off_t block_off = DATA_OFFSET(priv, dslot);
+    const off_t orig_off = block_off + doff;
     off_t off = orig_off;
     off_t roundup_off;
     u_int num_zero_blocks;
@@ -593,6 +594,13 @@
     assert(len <= priv->block_size);
     assert(doff + len <= priv->block_size);

+    // If we need to extend the file, do it by a full block size.
+    if (block_off + priv->block_size > priv->file_size) {
+        priv->file_size = block_off + priv->block_size;
+        if (ftruncate(priv->fd, priv->file_size) != 0)
+            return errno;
+    }
+
     // Write unaligned leading bit, if any
     roundup_off = ROUNDUP2(off, (off_t)priv->file_block_size);
     extra_len = (u_int)(roundup_off - off);
@@ -618,13 +626,6 @@
         if (num_zero_blocks > 0) {
             const u_int zero_len = num_zero_blocks * priv->file_block_size;

-            // Extend file if necessary
-            if (off + zero_len > priv->file_size) {
-                if (fallocate(priv->fd, 0, off, zero_len) != 0)
-                    return errno;
-                priv->file_size = off + zero_len;
-            }
-
             // "Write" zeros using FALLOC_FL_PUNCH_HOLE
             if (fallocate(priv->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, off, zero_len) != 0)
                 return errno;
@@ -712,6 +713,13 @@
     assert(len <= priv->block_size);
     assert(off + len <= priv->block_size);

+    // If we need to extend the file, do it by a full block size.
+    if (DATA_OFFSET(priv, dslot) + priv->block_size > priv->file_size) {
+        priv->file_size = DATA_OFFSET(priv, dslot) + priv->block_size;
+        if (ftruncate(priv->fd, priv->file_size) != 0)
+            return errno;
+    }
+
     // Write data
     if ((r = s3b_dcache_write(priv, DATA_OFFSET(priv, dslot) + off, src != NULL ? src : zero_block, len)) != 0)
         return r;
```
archiecobbs commented 4 months ago

Thanks - I think your analysis is correct.

Should be fixed in cdc356a.

Please update, reconfigure with `--enable-assertions`, and rebuild, then see whether that fixes the problem (and whether you hit any assertion failures) and let me know how it goes. Thanks.

rbfnet commented 2 months ago

I realize this is already closed ... but I just wanted to confirm that I got this built about a month ago, I've seen no new issues, and the original problem has not recurred since then.

archiecobbs commented 2 months ago

Great! I appreciate the positive confirmation.