Open sachinpkale opened 5 months ago
Thanks @sachinpkale, I think there is definitely some much-needed refactoring here.
How does a new Directory implementation here solve the issue of multiple Stores in IndexShard? Are you planning to move the remote store specific functionality currently in IndexShard to this new directory, e.g. syncSegmentsFromRemoteSegmentStore? Where would these fit with the directory API?
From what I can tell the only use of the remoteStore from within IndexShard is to fetch its remoteDirectory instance and do things with it. We could maybe extend the existing (somewhat already bloated) public API in store?
How does the OpenSearchDirectory fit with the plans for a CompositeDirectory? The injection of a cache & other dir implementations sounds very similar to the intent there.
How does a new Directory implementation here solve the issue of multiple Stores in IndexShard
With the new directory, we don't need multiple stores in IndexShard. One store instance that contains an OpenSearchDirectory instance is enough.
Are you planning to move the remote store specific functionality currently in IndexShard to this new directory? ex. syncSegmentsFromRemoteSegmentStore? Where would these fit with directory API?
Yes, syncSegmentsFromRemoteSegmentStore as well as the logic in RemoteStoreRefreshListener will move to the new directory. (Answering the second question at the end.)
From what I can tell the only use of the remoteStore from within IndexShard is to fetch its remoteDirectory instance and do things with it. We could maybe extend the existing (somewhat already bloated) public API in store?
We can do this but the tight coupling of core and storage remains. For example, in replica promotion or recovery flows, we have if conditions added at multiple places to check if the index is remote store enabled or not. With the new directory, we want to remove these checks as well. So, core's interaction with segment storage does not change with or without remote store.
How does the OpenSearchDirectory fit with the plans for a CompositeDirectory? The injection of a cache & other dir implementations sounds very similar to the intent there.
Yes, it is similar in pattern in that it contains 2 directories, but it doesn't handle sync between cache and storage. Also, it does not provide any extensions to cache, so only one type of cache can be present. But CompositeDirectory itself can be evolved into OpenSearchDirectory by adding the right set of abstractions.
Where would remote store specific functionality fit with directory API?
We don't have clear answers to it but this is what I have thought about:
Cache to Storage Sync options:
- Use the existing sync method of the directory to trigger segment upload. But sync is called only at flush and not on refresh, so we would need to change core to call Directory.sync() after each refresh. sync also handles fsync of the actual segment files, which we don't want to trigger on each refresh. So, OpenSearchDirectory needs to be smart enough to not call fsync if both cache and storage are configured (this logic still needs some more thinking).
- Add a new method to OpenSearchDirectory, called syncCacheToStorage (or something similar). This will be called from core after each refresh. We don't have to add if-else in core to call this. Irrespective of the use case (DocRep, remote backed storage, Writeable Warm), this method will be called, and if cache is not defined, it will be a no-op.
Storage to Cache Sync:
- There could be a new method introduced in OpenSearchDirectory, say init(), which will be called whenever we need to pull data from the source of truth. Examples are replication flow, failover flow, recovery flow etc.
With OpenSearchDirectory, we are exploring the feasibility of OpenSearch core interacting with only one interface, OpenSearchDirectory, with all the existing and future use cases around segment storage encapsulated within it. Even though this issue talks about remote store based use cases, OpenSearchDirectory can be used where remote store is not used.
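To make the two sync directions concrete, here is a minimal sketch. It assumes a hypothetical OpenSearchDirectorySketch class in which plain in-memory maps stand in for real Directory implementations. Only the method names syncCacheToStorage and init come from this comment; the class name, the write helper, the nullable cache and the return values are assumptions for illustration, not the proposed API.

```java
import java.util.Map;

// Hypothetical stand-in for the proposed OpenSearchDirectory.
// Plain maps (file name -> bytes) replace real Directory implementations.
class OpenSearchDirectorySketch {
    private final Map<String, byte[]> cache;    // optional; null when no cache is configured
    private final Map<String, byte[]> storage;  // source of truth

    OpenSearchDirectorySketch(Map<String, byte[]> cache, Map<String, byte[]> storage) {
        this.cache = cache;
        this.storage = storage;
    }

    // New segment files land in the cache when one is configured, else directly in storage.
    void write(String file, byte[] data) {
        (cache != null ? cache : storage).put(file, data);
    }

    // Core would call this after every refresh, for every use case.
    // Returns the number of files copied; always 0 (a no-op) when there is no cache.
    int syncCacheToStorage() {
        if (cache == null) {
            return 0;
        }
        int copied = 0;
        for (Map.Entry<String, byte[]> entry : cache.entrySet()) {
            if (storage.putIfAbsent(entry.getKey(), entry.getValue()) == null) {
                copied++;
            }
        }
        return copied;
    }

    // Pulls files from the source of truth into the cache
    // (replication, failover and recovery flows). Returns the number of files copied.
    int init() {
        if (cache == null) {
            return 0;
        }
        int copied = 0;
        for (Map.Entry<String, byte[]> entry : storage.entrySet()) {
            if (cache.putIfAbsent(entry.getKey(), entry.getValue()) == null) {
                copied++;
            }
        }
        return copied;
    }
}
```

In this sketch the caller never branches on whether a cache exists: a DocRep-style setup would pass null for the cache and both sync methods become no-ops.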
Thanks Sachin for the writeup. I definitely agree on keeping a single composite store object with a single directory.
How does the OpenSearchDirectory fit with the plans for a CompositeDirectory? The injection of a cache & other dir implementations sounds very similar to the intent there.
Yes, it is similar on pattern where it contains 2 directories but it doesn't handle sync between cache and storage. Also, it does not provide any extensions to cache so only one type of cache can be present. But CompositeDirectory itself can be evolved into OpenSearchDirectory by adding right set of abstractions.
Yes, that was the intent behind the current design for CompositeDirectory. Though it doesn't handle cache and storage sync, it can definitely be evolved in this direction.
Storage to Cache Sync:
- There could be a new method introduced in OpenSearchDirectory, say init(), which will be called whenever we need to pull data from source of truth. Examples are replication flow, failover flow, recovery flow etc.
Just to add, I think OpenSearchDirectory should also be able to operate in read-only vs. write mode (or master/slave) for primaries and replicas. The directionality of sync can depend on how the directory is configured. And we should be able to flip the behavior at runtime.
Thanks @ankitkala for reviewing.
I think OpenSearchDirectory should also be able to operate in Read only v/s write mode(or master/slave) for primary & replicas. Directionality of sync can be depending on how directory is configured. And we should be able to flip the behavior at runtime.
I am not sure I understand it completely. The directory should not know whether it is part of a primary or a replica, right?
Correct. I just meant that directory should still be able to distinguish whether the remote is writable or not (i.e. replica's directory shouldn't be able to write).
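One way to picture this is a storage component that carries a writable flag which can be flipped at runtime, for example when a replica is promoted. The sketch below is only an illustration under that assumption; WritableStorageSketch and all of its methods are hypothetical names, not existing OpenSearch classes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical storage component that knows only whether it may write,
// not whether it belongs to a primary or a replica.
class WritableStorageSketch {
    private final Set<String> uploaded = new HashSet<>();
    private volatile boolean writable;

    WritableStorageSketch(boolean writable) {
        this.writable = writable;
    }

    // Flipped at runtime, e.g. on promotion or demotion of the owning shard.
    void setWritable(boolean writable) {
        this.writable = writable;
    }

    // Records an upload; rejected while the storage is read-only.
    // Returns true if the file was not uploaded before.
    boolean upload(String file) {
        if (!writable) {
            throw new IllegalStateException("storage is read-only: " + file);
        }
        return uploaded.add(file);
    }

    boolean contains(String file) {
        return uploaded.contains(file);
    }
}
```

The directory stays unaware of shard roles here: whoever owns it decides when to call setWritable.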
Is your feature request related to a problem? Please describe
The RemoteDirectory abstraction was created to upload and download segments to and from the configured remote store. This abstraction was created to make it consistent with the existing Directory interface that Lucene uses for segment operations (create, read, delete). But the RemoteDirectory abstraction is incomplete as it has no knowledge of the FsDirectory implementation that handles segment operations for the local store.
IndexShard contains two instances of Store:
- store, which contains the FsDirectory instance for local disk, and
- remoteStore, which contains the RemoteDirectory instance for the remote store.
Except for the common IndexShard parent, store and remoteStore do not know anything about each other. Sync between these two stores is scattered across various code flows. This makes the entire abstraction leaky and error-prone. As we plan to add more features on top of remote backed storage (1. Searchable Remote Index 2. Writeable Warm), we need to come up with a stronger directory abstraction to avoid non-maintainable code.
Describe the solution you'd like
Ideally, Store should encapsulate all the segment storage related constructs and the corresponding syncs between these constructs. For other operations like indexing or search, storage should be seen as a black box that can be accessed through the provided interface. This also aligns with the broader modularity vision, with the next step being to abstract out storage as a separate module. For this RFC, we limit the discussion to the segment storage abstraction only.
We propose to provide all the segment storage abstractions in the form of a Directory. We call it OpenSearchDirectory. OpenSearchDirectory will have two components: cache and storage.
On top of the existing Directory interface, the OpenSearchDirectory implementation will provide further abstractions like stats around remote store interaction, whether a segment file is present in cache or storage or both, etc. The actual sync between cache and storage would be hidden from the core. The OpenSearch core should not be aware of whether a given segment is getting served from a cache or from storage, as long as the right directory implementations for cache and storage are used.
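As an illustration of the "present in cache or storage or both" abstraction, a lookup could report a file's location without the caller touching either component directly. This is a sketch only; FileLocationSketch, the Location enum and locate are hypothetical names, and sets of file names stand in for directory listings.

```java
import java.util.Set;

// Hypothetical helper: reports where a segment file currently lives.
class FileLocationSketch {
    enum Location { CACHE_ONLY, STORAGE_ONLY, BOTH, ABSENT }

    // cacheFiles and storageFiles stand in for listings of the two components.
    static Location locate(String file, Set<String> cacheFiles, Set<String> storageFiles) {
        boolean inCache = cacheFiles.contains(file);
        boolean inStorage = storageFiles.contains(file);
        if (inCache && inStorage) {
            return Location.BOTH;
        }
        if (inCache) {
            return Location.CACHE_ONLY;
        }
        if (inStorage) {
            return Location.STORAGE_ONLY;
        }
        return Location.ABSENT;
    }
}
```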
Based on the use case, the role of cache and storage can be changed. For some known use cases, we can define cache and storage components as given below:
Related component
Storage:Remote
Describe alternatives you've considered
There were a few alternative approaches proposed, mostly to tackle the same problem:
Additional context
ToDo: