openvstorage / volumedriver

The Open vStorage VolumeDriver is the core of the Open vStorage solution: a high performance distributed block layer. It converts block storage into objects (Storage Container Objects).

Overzealous full MDS slave rebuilds due to absent scrub ID #353

Closed redlicha closed 6 years ago

redlicha commented 7 years ago
2017-09-08 09:15:16 730885 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/MetaDataServerTable - 000000000000a8ff - info - catch_up: bdcecc28-23ae-4bd4-851b-36b1454a8f8f: request to catch up; dry run: DryRun::F, check scrub ID: CheckScrubId::T
2017-09-08 09:15:16 730975 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/MDSMetaDataBackend - 000000000000a900 - info - MDSMetaDataBackend: bdcecc28-23ae-4bd4-851b-36b1454a8f8f
2017-09-08 09:15:16 731041 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/MDSMetaDataBackend - 000000000000a901 - info - init_: bdcecc28-23ae-4bd4-851b-36b1454a8f8f: used clusters: 64037556
2017-09-08 09:15:16 731124 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/MDSMetaDataBackend - 000000000000a902 - info - lastCorkUUID: bdcecc28-23ae-4bd4-851b-36b1454a8f8f:  dba03763-72c2-4dfc-9c45-5e18ed2278c5
2017-09-08 09:15:16 731158 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/MDSMetaDataBackend - 000000000000a903 - info - scrub_id: bdcecc28-23ae-4bd4-851b-36b1454a8f8f: scrub ID --
2017-09-08 09:15:16 731197 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/CachedMetaDataStore - 000000000000a904 - info - init_pages_: bdcecc28-23ae-4bd4-851b-36b1454a8f8f: page capacity (entries): 32, max cached pages: 256
2017-09-08 09:15:16 731313 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/MetaDataStoreBuilder - 000000000000a905 - info - update_metadata_store_: bdcecc28-23ae-4bd4-851b-36b1454a8f8f: bringing MetaDataStore in sync with backend, requested interval ( dba03763-72c2-4dfc-9c45-5e18ed2278c5, --], check scrub ID: CheckScrubId::T, dry run:DryRun::F, full rebuild: false
2017-09-08 09:15:16 731413 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/BackendConnectionInterfaceLogger - 000000000000a906 - info - Logger: Entering read bdcecc28-23ae-4bd4-851b-36b1454a8f8f snapshots.xml
2017-09-08 09:15:16 849609 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/BackendConnectionInterfaceLogger - 000000000000a91c - info - ~Logger: Exiting read for bdcecc28-23ae-4bd4-851b-36b1454a8f8f snapshots.xml
2017-09-08 09:15:16 849663 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/BackendConnectionInterfaceLogger - 000000000000a91d - info - ~Logger: Exiting read for bdcecc28-23ae-4bd4-851b-36b1454a8f8f
2017-09-08 09:15:16 983400 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/MetaDataStoreBuilder - 000000000000a925 - info - update_metadata_store_: bdcecc28-23ae-4bd4-851b-36b1454a8f8f: no scrub ID found in local metadata store
2017-09-08 09:15:16 983412 -0400 - NY1SRV0008 - 34792/0x00007f7dce658700 - volumedriverfs/MetaDataStoreBuilder - 000000000000a926 - warning - update_metadata_store_: bdcecc28-23ae-4bd4-851b-36b1454a8f8f: old MDS slave detected: cork present but no scrub ID. Rebuild required.
redlicha commented 7 years ago

Related: code to force full rebuild / set scrub ID on initial build was introduced with #310

wimpers commented 7 years ago

@redlicha related to https://github.com/openvstorage/volumedriver/issues/148 ?

redlicha commented 7 years ago

Not related to #148 as it's been observed with volumes that are not clones.

wimpers commented 6 years ago

What is that scrub ID actually?

When scrub results are applied, metadata in the MDS might be changed (to point to new SCOs). MDS slaves also need to have these relocations applied. To make sure they don't point to outdated SCOs, the scrub ID is set when scrub results are applied. It is basically a UUID used to make sure everyone's on the same page. On mismatch (or absence), a full rebuild of the metadata is necessary. It's even supposed to be set on creation of an MDS table (to some random UUID), to avoid special casing.
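To make the rule above concrete, here is a minimal Python sketch of the decision as described in this thread. It is not the volumedriver code (which is C++): `SlaveTable`, `new_table` and `needs_full_rebuild` are hypothetical names invented for illustration, and the only behaviour modelled is "absent or mismatching scrub ID means full rebuild".

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class SlaveTable:
    """Hypothetical stand-in for the local state of an MDS slave table."""
    cork_id: Optional[uuid.UUID] = None
    scrub_id: Optional[uuid.UUID] = None


def new_table() -> SlaveTable:
    # Per the explanation above, a table is supposed to get a random scrub ID
    # at creation time, so "no scrub ID" never needs special casing.
    return SlaveTable(scrub_id=uuid.uuid4())


def needs_full_rebuild(table: SlaveTable, backend_scrub_id: uuid.UUID) -> bool:
    """Return True if the slave cannot be caught up incrementally."""
    if table.scrub_id is None:
        # The case from the log: cork present but no scrub ID.
        return True
    # A mismatch means scrub results were applied that this slave has not seen.
    return table.scrub_id != backend_scrub_id


# Usage: a slave with a cork but no scrub ID is forced into a full rebuild.
stale = SlaveTable(cork_id=uuid.uuid4(), scrub_id=None)
print(needs_full_rebuild(stale, uuid.uuid4()))
```

In this sketch the warning from the log corresponds to the first branch: the table has a cork, so some metadata was already built, but the missing scrub ID makes it impossible to tell whether that metadata predates a scrub.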

redlicha commented 6 years ago

One possible way to end up there:

Although a reconfiguration should not be issued while a rebuild is still in progress (TBD: provide progress info!?), interrupting the rebuild before the ScrubId is set should be prevented as well.
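One way to picture the constraint described here is a guard that makes rebuild and reconfiguration mutually exclusive, with the scrub ID written as the last step of the rebuild. The Python sketch below is purely illustrative: `RebuildGuard`, `full_rebuild`, `try_reconfigure` and the `config` field are hypothetical and do not reflect how volumedriver or the EE fix actually handles this.

```python
import threading
import uuid
from typing import Callable, Optional


class RebuildGuard:
    """Hypothetical guard: a rebuild cannot be interrupted by a reconfiguration."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.scrub_id: Optional[uuid.UUID] = None
        self.config = "initial"

    def full_rebuild(self, backend_scrub_id: uuid.UUID,
                     apply_metadata: Callable[[], None]) -> None:
        # Hold the lock for the whole rebuild, including the final step of
        # recording the scrub ID, so no reconfiguration can slip in between.
        with self._lock:
            apply_metadata()
            self.scrub_id = backend_scrub_id

    def try_reconfigure(self, new_config: str) -> bool:
        # Refuse (rather than block) while a rebuild is in progress.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self.config = new_config
            return True
        finally:
            self._lock.release()


# Usage: a reconfiguration attempted in the middle of a rebuild is rejected.
guard = RebuildGuard()
attempts = []
guard.full_rebuild(uuid.uuid4(),
                   lambda: attempts.append(guard.try_reconfigure("new")))
print(attempts, guard.try_reconfigure("new"))
```

Rejecting instead of blocking is a design choice of this sketch; it leaves room for the caller to report rebuild progress, which the TBD above asks for.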

CC @JeffreyDevloo

openvstorage-ci commented 6 years ago

This issue was moved to openvstorage/volumedriver-ee#94

wimpers commented 6 years ago

Will only be fixed in the EE version. Re-open if needed.