Closed: shaitan closed this 8 years ago
That doesn't look right: `disk_size` should already include the header/footer sizes:
/* total size this record occupies on disk.
* It includes alignment and header/footer sizes.
* This structure is header.
*/
uint64_t disk_size;
`disk_size` doesn't include the size of the header from the index file.
But why do you want to include this parameter in the removed size statistics?
Can the same goal be achieved by summing up the current removed size and `sizeof(eblob_disk_control) * number of removed records`?
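The arithmetic suggested above can be sketched as follows. This is only an illustration: `disk_control_sketch` is a hypothetical stand-in for `eblob_disk_control` (its field layout is an assumption, not copied from the eblob headers), and `removed_size_with_index` is an invented helper name.

```c
#include <stdint.h>

/* Hypothetical stand-in for eblob_disk_control; the real layout
 * is defined in the eblob headers and may differ. */
struct disk_control_sketch {
    unsigned char key[64];
    uint64_t flags;
    uint64_t data_size;
    uint64_t disk_size;
    uint64_t position;
};

/* removed_blob_size: sum of disk_size over removed records (data file only).
 * Each removed record also keeps one header in the index file,
 * so one header size is added per removed record. */
static uint64_t removed_size_with_index(uint64_t removed_blob_size,
                                        uint64_t removed_records)
{
    return removed_blob_size +
           (uint64_t)sizeof(struct disk_control_sketch) * removed_records;
}
```

Computing the total this way avoids storing a second counter, at the cost of assuming every removed record has exactly one index header.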
We have the following statistics:
- `records_total`: total number of stored records
- `base_size`: total size of all blobs and indexes

To be consistent:
- `records_removed`: total number of stored removed records
- `records_removed_size`: total size occupied by removed records in blobs and indexes
Makes sense. Please update all comments and docs to reflect the fact that all sizes include both data and index structures.
Also, while you are at it, can we try to read the header from the data fd in `eblob_read()` and friends, and remove the entry from the index if the data mismatches? This is needed for cases where only the index file marks the entry as removed and the on-disk header doesn't have the 'removed' mark. This looks like the reason for the major discrepancy between iterator data and blob statistics.
I've added validation of headers from the index and data files to `eblob_fill_write_control_from_ram`, but I've disabled removal of such keys. Let's first find such keys and debug this validation, because if it starts removing them we risk removing valid records.
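A minimal sketch of this kind of report-only validation, assuming a header layout and a 'removed' flag bit that are not taken from eblob: `hdr_sketch`, `SKETCH_FLAG_REMOVED` and `headers_mismatch` are all hypothetical names. It reads the header from the data file with POSIX `pread` and compares it with the index copy, reporting a mismatch without removing anything.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_KEY_SIZE 64
#define SKETCH_FLAG_REMOVED (1ULL << 0) /* assumed bit, not taken from eblob */

/* Hypothetical header layout shared by the index and data files. */
struct hdr_sketch {
    unsigned char key[SKETCH_KEY_SIZE];
    uint64_t flags;
    uint64_t data_size;
    uint64_t disk_size;
    uint64_t position;
};

/* Returns 0 if the data-file header agrees with the index header,
 * 1 if the key or the 'removed' bit differs, -1 on a short or failed read.
 * It only reports; the caller decides whether to log or act. */
static int headers_mismatch(int data_fd, off_t offset,
                            const struct hdr_sketch *index_hdr)
{
    struct hdr_sketch data_hdr;
    ssize_t n = pread(data_fd, &data_hdr, sizeof(data_hdr), offset);

    if (n != (ssize_t)sizeof(data_hdr))
        return -1;
    if (memcmp(data_hdr.key, index_hdr->key, SKETCH_KEY_SIZE) != 0)
        return 1;
    if ((data_hdr.flags ^ index_hdr->flags) & SKETCH_FLAG_REMOVED)
        return 1;
    return 0;
}
```

Keeping the check side-effect free matches the cautious approach described above: mismatching keys can be logged and inspected before any automatic removal is enabled.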
I've also fixed the dropping of `BLOB_DISK_CTL_UNCOMMITTED` flags during `eblob_plain_writev_prepare`; it fixes errors like `eblob_stress: write has been failed: key-236: flags: none, error: 1`.
Fixed the removed records size calculation: it now includes the correct size from both indexes and blobs. Added a sanity check to `eblob_mark_entry_remove` that verifies the on-disk and in-memory keys are the same.
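As a rough illustration of how a key check can guard the removed-size accounting, here is a sketch under stated assumptions: `record_hdr_sketch`, `removed_stats_sketch` and `account_removed` are hypothetical names, the header layout is assumed, and this is not the code of `eblob_mark_entry_remove`.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_KEY_SIZE 64

/* Hypothetical on-disk header; the real layout is in the eblob headers. */
struct record_hdr_sketch {
    unsigned char key[SKETCH_KEY_SIZE];
    uint64_t flags;
    uint64_t data_size;
    uint64_t disk_size;
    uint64_t position;
};

/* Hypothetical counters mirroring records_removed / records_removed_size. */
struct removed_stats_sketch {
    uint64_t records_removed;
    uint64_t records_removed_size;
};

/* Refuses to account a removal when the in-memory key does not match
 * the key in the on-disk header; otherwise adds the record's space in
 * the blob (disk_size) plus one index header to the removed size. */
static int account_removed(struct removed_stats_sketch *st,
                           const unsigned char *ram_key,
                           const struct record_hdr_sketch *disk_hdr)
{
    if (memcmp(ram_key, disk_hdr->key, SKETCH_KEY_SIZE) != 0)
        return -EINVAL;

    st->records_removed += 1;
    st->records_removed_size += disk_hdr->disk_size + sizeof(*disk_hdr);
    return 0;
}
```

Rejecting the mismatch before touching the counters keeps the statistics unchanged when the index points at the wrong record.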