redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

cloud_storage: simplify topic tiered storage enablement into one on/off toggle (remove legacy "archival" mode) #6629

Open jcsp opened 1 year ago

jcsp commented 1 year ago

Archival pre-dates the general tiered storage feature in Redpanda: it is the write path only, writing a stream of log segments into S3.

Currently there are independent read + write properties for S3 access on each topic.

remote.read=true,remote.write=true is an easy state to explain for a topic: it means we'll use tiered storage, spilling some data into S3 and sending reads there when necessary. Both properties false also clearly makes sense. Anything else is hard to explain and reason about, including the "archival" mode where remote.write is true but remote.read is not.

We should collapse the overall property for controlling use of tiered storage into a single true/false, and not have this individual toggling of the read & write paths.
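To make the state space concrete, here is a minimal sketch of the four combinations the two per-topic booleans allow today, and which of them map cleanly onto a single toggle. The names `Mode`, `classify` and `collapse` are hypothetical and are not Redpanda code. Returning `None` for the mixed states is an assumption for illustration only: this issue does not say how existing topics in those states should be migrated.

```python
from enum import Enum
from itertools import product
from typing import Optional


class Mode(Enum):
    """Hypothetical labels for the four read/write combinations."""
    DISABLED = "disabled"
    TIERED = "tiered storage"
    ARCHIVAL = "archival (write only)"
    READ_ONLY = "read only"


def classify(remote_read: bool, remote_write: bool) -> Mode:
    """Name the state implied by the two independent topic properties."""
    if remote_read and remote_write:
        return Mode.TIERED
    if remote_write:
        return Mode.ARCHIVAL
    if remote_read:
        return Mode.READ_ONLY
    return Mode.DISABLED


def collapse(remote_read: bool, remote_write: bool) -> Optional[bool]:
    """Map the two properties onto one on/off toggle.

    Only the two easy-to-explain states have an obvious answer.
    The mixed states return None because the migration policy for
    them is an open question, not something decided in this issue.
    """
    mode = classify(remote_read, remote_write)
    if mode is Mode.TIERED:
        return True
    if mode is Mode.DISABLED:
        return False
    return None


if __name__ == "__main__":
    # Walk all four combinations of the two booleans.
    for read, write in product([False, True], repeat=2):
        mode = classify(read, write)
        print(f"remote.read={read!s:<5} remote.write={write!s:<5} "
              f"{mode.value:<22} toggle={collapse(read, write)}")
```

Two of the four states collapse trivially; the other two are exactly the ones described above as hard to explain and reason about.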

If we wanted to explicitly support the use case of writing segments to S3 and never deleting them, without treating those segments in S3 as part of the readable partition range (e.g. for someone who wants an unbounded-size audit trail of a partition), we could add that back in as a first-class feature that would basically disable deleting segments when they fall out of the retention period.

For administrative/support situations where we might want to e.g. disable writes temporarily, we need more structured & tested paths for doing so:

JIRA Link: CORE-1036

mmedenjak commented 1 year ago

Related - https://github.com/redpanda-data/documentation/issues/596