reverbrain / eblob

Eblob is an append-only low-level IO library that saves data in blob files. It was created as a low-level backend for elliptics.
GNU Lesser General Public License v3.0

Fix #157

Closed shaitan closed 8 years ago

shaitan commented 8 years ago

Fixed the size calculation for removed records: it now includes the correct sizes from both indexes and blobs. Added a sanity check to eblob_mark_entry_remove that verifies the on-disk and in-memory keys are the same.

bioothod commented 8 years ago

That doesn't look right; disk_size should already include the header/footer sizes:

    /* total size this record occupies on disk.
     * It includes alignment and header/footer sizes.
     * This structure is header.
     */
    uint64_t        disk_size;

shaitan commented 8 years ago

disk_size doesn't include the size of the header stored in the index file.

bioothod commented 8 years ago

But why do you want to include this parameter in the removed size statistics? Can the same goal be achieved by summing the current removed size and sizeof(eblob_disk_control) * number of removed records?

shaitan commented 8 years ago

We have the following statistics:
records_total - total number of stored records
base_size - total size of all blobs and indexes

To be consistent:
records_removed - total number of stored removed records
records_removed_size - total size occupied by removed records in blobs and indexes
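For readers following the accounting argument, here is a minimal sketch of counters that stay consistent when every size includes both the blob record and its index header. All names prefixed with sketch_ are hypothetical, and the header struct is only a stand-in, not the actual eblob_disk_control layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-record header that is also stored in
 * the index file. This is NOT the real eblob_disk_control definition. */
struct sketch_disk_control {
    uint8_t  key[64];
    uint64_t flags;
    uint64_t data_size;
    uint64_t disk_size;   /* size of the record inside the blob file */
    uint64_t position;
};

/* Hypothetical counters mirroring the statistics named in the thread. */
struct sketch_stats {
    uint64_t records_total;
    uint64_t base_size;             /* blobs + indexes */
    uint64_t records_removed;
    uint64_t records_removed_size;  /* blobs + indexes, removed records only */
};

/* A stored record costs disk_size in the blob plus one header in the index. */
void sketch_account_write(struct sketch_stats *st, uint64_t disk_size)
{
    st->records_total++;
    st->base_size += disk_size + sizeof(struct sketch_disk_control);
}

/* A removed record is charged the same way, so both sizes stay comparable. */
void sketch_account_remove(struct sketch_stats *st, uint64_t disk_size)
{
    st->records_removed++;
    st->records_removed_size += disk_size + sizeof(struct sketch_disk_control);
}
```

Under these assumptions the incremental total equals the closed form suggested earlier in the thread (the sum of removed disk_size values plus the header size times records_removed), and removing every record makes records_removed_size equal to base_size.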

bioothod commented 8 years ago

Makes sense. Please update all comments and docs to reflect the fact that all sizes include both data and index structures.

Also, while you are at it, can we try to read the header from the data fd in eblob_read() and friends, and remove the entry from the index if the data mismatches? This is needed for cases when only the index file has the entry marked as removed and the on-disk header doesn't have the 'removed' mark. It looks like this is the reason for the major discrepancy between iterator data and blob statistics.

shaitan commented 8 years ago

I've added validation of the headers from the index and data files to eblob_fill_write_control_from_ram, but I've disabled removal of such keys for now. Let's first find such keys and debug this validation, because if it starts removing them, we risk removing valid records.
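As an illustration of this kind of report-only validation, the sketch below compares an index header against a data-file header and classifies the mismatch without deleting anything. The struct, the flag bit, and every sketch_ name are assumptions made for the example; they are not taken from the eblob sources, and the actual check in eblob_fill_write_control_from_ram may differ.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_KEY_SIZE     64
#define SKETCH_FLAG_REMOVED (1ULL << 0)   /* hypothetical 'removed' bit */

/* Hypothetical header layout, used for both the index and the data copy. */
struct sketch_header {
    uint8_t  key[SKETCH_KEY_SIZE];
    uint64_t flags;
    uint64_t data_size;
    uint64_t disk_size;
    uint64_t position;
};

enum sketch_verdict {
    SKETCH_OK = 0,
    SKETCH_KEY_MISMATCH,           /* the two headers describe different keys */
    SKETCH_REMOVED_ONLY_IN_INDEX,  /* index says removed, data header does not */
    SKETCH_FIELD_MISMATCH          /* same key, but other fields disagree */
};

/* Classify the difference between the two header copies. The caller is
 * expected to log the verdict; nothing is removed here. */
enum sketch_verdict sketch_validate_headers(const struct sketch_header *index_hdr,
                                            const struct sketch_header *data_hdr)
{
    if (memcmp(index_hdr->key, data_hdr->key, SKETCH_KEY_SIZE) != 0)
        return SKETCH_KEY_MISMATCH;

    if ((index_hdr->flags & SKETCH_FLAG_REMOVED) &&
        !(data_hdr->flags & SKETCH_FLAG_REMOVED))
        return SKETCH_REMOVED_ONLY_IN_INDEX;

    if (index_hdr->flags != data_hdr->flags ||
        index_hdr->data_size != data_hdr->data_size ||
        index_hdr->disk_size != data_hdr->disk_size ||
        index_hdr->position != data_hdr->position)
        return SKETCH_FIELD_MISMATCH;

    return SKETCH_OK;
}
```

Returning a verdict rather than acting on it matches the cautious approach described above: mismatching keys can be collected and inspected before any automatic removal is enabled.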

Also, I've fixed the dropping of BLOB_DISK_CTL_UNCOMMITTED flags in eblob_plain_writev_prepare; this fixes errors like eblob_stress: write has been failed: key-236: flags: none, error: 1.