
[Badger] Unable to read data following disk exhaustion #24

Closed. jordanschalm closed this issue 2 years ago.

jordanschalm commented 2 years ago

Context

The machine hosting our Badger database instance ran out of disk space. After resizing the disk and restarting the instance, we are unable to read some data.

Our issue looks similar to this issue reported on the Dgraph Forum: https://discuss.dgraph.io/t/unable-to-read-some-data/11051

Initial problem

When the disk is exhausted, we see errors like the following when inserting new values:

Unable to write to value log file: \"/dir/002587.vlog\": write /dir/002587.vlog: no space left on device
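
For context, these are plain Badger write transactions; a simplified sketch of the write path (not our exact code, and the import path is only for illustration):

package main

import (
	"log"

	badger "github.com/dgraph-io/badger/v3"
)

func main() {
	// Open the database in the directory that filled up.
	db, err := badger.Open(badger.DefaultOptions("/dir"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Larger values are appended to the .vlog files; the "no space left
	// on device" error above was returned from updates like this one
	// once the disk filled up.
	err = db.Update(func(txn *badger.Txn) error {
		return txn.Set([]byte("example-key"), []byte("example-value"))
	})
	if err != nil {
		log.Println("write failed:", err)
	}
}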

We resize the disk and restart the instance. The database opens successfully (it does not require truncation). Upon reading recently written values required to initialize our application logic, we see these Badger error logs:

{"level":"error","time":"2022-03-31T04:53:26Z","message":"Unable to read: Key: [34 19 49 66 150 122 158 132 178 53 55 36 133 94 96 233 217 231 132 125 245 68 69 204 160 82 158 149 186 79 9 203 120], Version : 26845631,\n\t\t\t\tmeta: 66, userMeta: 0"}
{"level":"error","time":"2022-03-31T04:53:26Z","message":"Invalid read: vp: {Fid:2587 Len:2119 Offset:383311220}"}

The read (Txn.Get) returns an error:

Invalid read: Len: 2107 read at:[1181425:1183111]
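
For reference, a simplified sketch of the failing read (again, not our exact code):

// readValue illustrates the read that fails. The "Invalid read" error
// is returned when the value for this key is read back.
func readValue(db *badger.DB, key []byte) ([]byte, error) {
	var val []byte
	err := db.View(func(txn *badger.Txn) error {
		item, err := txn.Get(key)
		if err != nil {
			return err
		}
		val, err = item.ValueCopy(nil)
		return err
	})
	return val, err
}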

Resolution Attempts

Truncate Database

After opening the database in truncate mode, we still see the same error above when reading some values.

Manually remove values which Badger cannot read

We identify the key which cannot be read, then delete that value (Txn.Delete). The deletion succeeds without any error.
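
The deletion itself is a plain transaction, along these lines (simplified):

// deleteKey removes the key whose value can no longer be read; the
// commit succeeds even though the underlying value is unreadable.
func deleteKey(db *badger.DB, key []byte) error {
	return db.Update(func(txn *badger.Txn) error {
		return txn.Delete(key)
	})
}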

After restarting the instance and reading the next value, Badger successfully returns the value. But the decoding step fails, indicating that the data returned is corrupted. The decoding error is below for reference (though this error occurs outside of Badger):

could not decode entity: msgpack: invalid code=6 decoding map length

Expected Behaviour

Badger should require truncating the database (removing any values which cannot be read), and then successfully read all values which were not truncated.

Specs

manishrjain commented 2 years ago

Is this value log file there? 2587. And what is its size?

SyncWrites is set to false by default, relying upon mmap. Mmap does well for process-level crashes, but doesn't do well when there are filesystem-level issues.

Also -- how big are your values? You could consider turning the value log off. The SSTables are always synced to disk, regardless.
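
For example, something along these lines when opening the DB (option name as in badger v3; adjust for the version you're running):

// Sync every write to disk before the commit returns, trading some
// write throughput for durability across filesystem-level failures.
opts := badger.DefaultOptions("/dir").WithSyncWrites(true)

db, err := badger.Open(opts)
if err != nil {
	log.Fatal(err)
}
defer db.Close()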

jordanschalm commented 2 years ago

The value log file is there, it's 418MB.

$ du -h ./002587.vlog
418M    ./002587.vlog

Mmap does well for process-level crashes, but doesn't do well when there are filesystem-level issues.

That's good to know. Is it necessary to use SyncWrites to avoid this kind of outcome? What kind of impact should we expect from mmap not handling filesystem level issues well?

how big are your values? You could consider turning the value log off. The SSTables are always synced to disk, regardless.

Most values are fairly small (~1MB or smaller), but we have some very large values (up to 10GB). Looking into getting more specifics here.

manishrjain commented 2 years ago

Yeah, having SyncWrites would avoid this kind of outcome for sure -- though it would come at some write performance cost.

You could also look into decreasing the usage of value log. See https://github.com/outcaste-io/badger/blob/6bfcd5e451a0ed9c65d9467f219164a58acf4e97/options.go#L524

This would allow you to set the value threshold to something really high, and let Badger set it automatically so that only the top x%-ile of values goes into the value log.
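
Roughly like this (option names per badger v3; they may differ in the version you're running):

// Keep most values in the LSM tree (SSTables, which are always synced)
// and let Badger adjust the threshold so that only the largest ~1% of
// values end up in the value log.
opts := badger.DefaultOptions("/dir").
	WithValueThreshold(1 << 20).
	WithVLogPercentile(0.99)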

Outserv, for example, does not use value log at all -- we don't have big values, so we skip it altogether.

manishrjain commented 2 years ago

Closing due to no activity.