jordanschalm closed this issue 2 years ago.
Is the value log file (002587.vlog) still there? And what is its size?
SyncWrites is set to false by default, relying upon mmap. Mmap does well for process-level crashes, but doesn't do well if there are filesystem-level issues.
Also -- how big are your values? You could also consider turning the value log off; the SSTables are always synced to disk regardless.
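For reference, a minimal sketch of explicitly enabling SyncWrites at open time, assuming the Badger v2 options API (the data directory path is hypothetical):

```go
package main

import (
	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	// "/path/to/badger" is a hypothetical data directory.
	// WithSyncWrites(true) makes Badger sync writes to disk before a
	// commit returns, trading write throughput for durability across
	// filesystem-level failures.
	opts := badger.DefaultOptions("/path/to/badger").WithSyncWrites(true)

	db, err := badger.Open(opts)
	if err != nil {
		panic(err)
	}
	defer db.Close()
}
```

The trade-off is write throughput: every commit waits on a sync instead of relying on the OS to flush the mmap'd value log.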
The value log file is there; it's 418MB:
$ du -h ./002587.vlog
418M ./002587.vlog
> Mmap does well for process-level crashes, but doesn't do well if there are filesystem-level issues.
That's good to know. Is it necessary to use SyncWrites to avoid this kind of outcome? What kind of impact should we expect from mmap not handling filesystem-level issues well?
> How big are your values? You could also consider turning the value log off; the SSTables are always synced to disk regardless.
Most values are fairly small (~1MB or smaller), but we have some very large values (up to 10GB). We're looking into getting more specifics here.
Yeah, enabling SyncWrites would avoid this kind of outcome for sure -- though it would come at some write-performance cost.
You could also look into decreasing the usage of value log. See https://github.com/outcaste-io/badger/blob/6bfcd5e451a0ed9c65d9467f219164a58acf4e97/options.go#L524
This would allow you to set the value threshold really high, or let Badger set it automatically so that only a given percentile of values goes into the value log.
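As a rough sketch of that direction, assuming the v2 options API (the path and the 32KB figure are illustrative choices, not recommendations):

```go
package main

import (
	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	// "/path/to/badger" and the 32KB threshold are illustrative.
	// Values at or below the threshold are stored inline in the LSM tree
	// (whose SSTables are always synced) rather than in the value log.
	opts := badger.DefaultOptions("/path/to/badger").
		WithValueThreshold(32 << 10)

	db, err := badger.Open(opts)
	if err != nil {
		panic(err)
	}
	defer db.Close()
}
```

With a higher threshold, more values live in the SSTables, so less data depends on the value log surviving a crash.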
Outserv, for example, does not use value log at all -- we don't have big values, so we skip it altogether.
Closing due to no activity.
Context
The machine hosting our Badger database instance ran out of disk space. After resizing the disk and restarting the instance, we are unable to read some data.
Our issue looks similar to this issue reported on the Dgraph Forum: https://discuss.dgraph.io/t/unable-to-read-some-data/11051
Initial problem
The disk is exhausted, and we see errors like the following when inserting new values:
We resize the disk and restart the instance. The database opens successfully (does not require truncate). Upon reading recently written values required to initialize our application logic, we see Badger logs:
The read (Txn.Get) returns an error:
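For context, the read path looks roughly like this (a sketch against the v2 API; readValue, the path, and the key are hypothetical):

```go
package main

import (
	"fmt"

	badger "github.com/dgraph-io/badger/v2"
)

// readValue mirrors our read path: a read-only transaction, Txn.Get,
// then copying the value out of the item.
func readValue(db *badger.DB, key []byte) ([]byte, error) {
	var out []byte
	err := db.View(func(txn *badger.Txn) error {
		item, err := txn.Get(key)
		if err != nil {
			return err // this is where the error above surfaces
		}
		out, err = item.ValueCopy(nil)
		return err
	})
	return out, err
}

func main() {
	// "/path/to/badger" and the key are hypothetical.
	db, err := badger.Open(badger.DefaultOptions("/path/to/badger"))
	if err != nil {
		panic(err)
	}
	defer db.Close()

	val, err := readValue(db, []byte("some-key"))
	fmt.Println(len(val), err)
}
```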
Resolution Attempts

Truncate Database
After opening the database in truncate mode, we still see the same error above when reading some values.
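For reference, truncate mode here means opening the database roughly like this, assuming the v2 options API (the path is hypothetical):

```go
package main

import (
	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	// "/path/to/badger" is a hypothetical data directory.
	// WithTruncate(true) allows Badger to truncate a corrupt value log
	// tail on open rather than refusing to start.
	opts := badger.DefaultOptions("/path/to/badger").WithTruncate(true)

	db, err := badger.Open(opts)
	if err != nil {
		panic(err)
	}
	defer db.Close()
}
```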
Manually remove values which Badger cannot read
We observe which key cannot be read, then delete that value (Txn.Delete), as sketched below.
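A minimal sketch of that deletion, assuming the v2 API (the path and key are hypothetical):

```go
package main

import (
	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	// "/path/to/badger" and the key are hypothetical.
	db, err := badger.Open(badger.DefaultOptions("/path/to/badger"))
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Delete the key that could not be read, inside a read-write txn.
	err = db.Update(func(txn *badger.Txn) error {
		return txn.Delete([]byte("unreadable-key"))
	})
	if err != nil {
		panic(err)
	}
}
```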
The deletion succeeds without any error. After restarting the instance and reading the next value, Badger successfully returns the value, but the decoding step fails, indicating that the returned data is corrupted. The decoding error is below for reference (though this error occurs outside of Badger):
Expected Behaviour
Truncating the database (removing any values which cannot be read) may be required, after which we expect to successfully read all values that were not truncated.
Specs
github.com/dgraph-io/badger/v2 v2.2007.3