Open jcsp opened 1 year ago
There's a functional draft of updating the scrubber to clean up orphan segments here: https://github.com/redpanda-data/redpanda/tree/orphan-cleanup
We should ensure this cleanup can be disabled for customers who prefer to keep their buckets immutable.
By design, Redpanda will sometimes leave orphan objects in its object storage bucket. This happens when a node writes a segment, but then unexpectedly loses leadership before it can update the manifest. We do our best to avoid it (https://github.com/redpanda-data/redpanda/pull/8560) but it will happen from time to time.
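As a rough illustration of what orphan detection amounts to, the sketch below treats it as a set difference: objects listed in the bucket minus objects the manifest references. This is not the code on the linked branch. The names `BucketObject`, `find_orphans`, `scrub`, the `delete_orphans` switch, and the one-hour grace period are all hypothetical, and the keys are placeholders rather than Redpanda's real object naming. The grace period is an assumption added so that a segment that was just uploaded, whose manifest update is still pending, is not mistaken for an orphan.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class BucketObject:
    """One segment object from a bucket listing (hypothetical shape)."""
    key: str
    last_modified: datetime


def find_orphans(listing, manifest_keys, now, grace=timedelta(hours=1)):
    """Return keys that are in the bucket but not referenced by the manifest.

    `listing` is assumed to contain only segment objects, not the manifest
    itself. Objects younger than `grace` are skipped, because the writer may
    simply not have updated the manifest yet.
    """
    referenced = set(manifest_keys)
    orphans = []
    for obj in listing:
        if obj.key in referenced:
            continue  # the manifest knows about this segment
        if now - obj.last_modified < grace:
            continue  # too young to judge; upload may still be in flight
        orphans.append(obj.key)
    return sorted(orphans)


def scrub(listing, manifest_keys, now, delete_orphans=False):
    """Return the keys the scrubber would delete on this pass.

    With delete_orphans=False (hypothetical off switch) nothing is ever
    selected for deletion, so the bucket is left untouched.
    """
    if not delete_orphans:
        return []
    return find_orphans(listing, manifest_keys, now)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    listing = [
        BucketObject("seg-1", now - timedelta(days=2)),     # in manifest
        BucketObject("seg-2", now - timedelta(days=2)),     # orphan
        BucketObject("seg-3", now - timedelta(minutes=5)),  # recent upload
    ]
    print(scrub(listing, ["seg-1"], now, delete_orphans=True))
```

Keeping detection (`find_orphans`) separate from the decision to delete (`scrub`) means an operator who wants an immutable bucket could still get a report of orphans without any object being removed.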
Like any storage system, to ensure good data hygiene over long storage periods, Redpanda needs a data scrubbing feature. This can be more or less extensive depending on the needs of a given system:
The extreme scrubbing is probably only useful on less-trusted object stores (e.g. if someone uses MinIO with its basic filesystem backend) -- there is less value in scrubbing a more highly trusted backend like AWS S3.
JIRA Link: CORE-1177