opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

Storage Roadmap #3739

Open andrross opened 2 years ago

andrross commented 2 years ago

This is a more specific plan expanding on the ideas introduced in the High Level Vision for Storage issue. The goal of this plan is to have a place to discuss the bigger picture as we design and implement the incremental features along the way.

Phase 1: Add Remote Storage Options for Improved Durability

Users can enable continuous backup of translog documents and Lucene segments to remote storage, guaranteeing durability without replicas or periodic snapshots. No changes to search.

Phase 2: Searchable Snapshots

Users can search snapshots in remote repositories without downloading all index data to disk ahead of time.

This phase implements the ability for an OpenSearch node to search indexes that are stored in remote storage as snapshots. It will leverage what Ultrawarm has built everywhere that is possible, but will be built natively into OpenSearch. It will fetch index data on-demand during searches and use a disk-based cache to improve performance. It will not assume that indexes are immutable (in preparation for Phase 4) or that shards have been merged to a single Lucene segment (in preparation for Phase 3).

This is intended as an incremental feature that can add value on the way toward implementing the longer-term goals in Phases 3 and 4.
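The on-demand fetch with a disk-based cache described for Phase 2 could look roughly like the sketch below. This is only an illustration under stated assumptions, not the OpenSearch implementation: the `OnDemandFile` class, the `fetch_range` callback, and the tiny `BLOCK_SIZE` are all hypothetical names chosen for the example.

```python
import os

BLOCK_SIZE = 8  # hypothetical block size, kept tiny so the behavior is easy to follow


class OnDemandFile:
    """Reads a remote file block by block, caching each fetched block on local disk."""

    def __init__(self, name, length, fetch_range, cache_dir):
        self.name = name
        self.length = length
        # Hypothetical callback: fetch_range(name, start, end) -> bytes from the remote store.
        self.fetch_range = fetch_range
        self.cache_dir = cache_dir
        self.remote_fetches = 0

    def _block(self, index):
        path = os.path.join(self.cache_dir, f"{self.name}.{index}")
        if os.path.exists(path):  # cache hit: serve from local disk
            with open(path, "rb") as f:
                return f.read()
        start = index * BLOCK_SIZE
        end = min(start + BLOCK_SIZE, self.length)
        data = self.fetch_range(self.name, start, end)  # cache miss: go remote
        self.remote_fetches += 1
        with open(path, "wb") as f:
            f.write(data)
        return data

    def read(self, offset, size):
        """Return up to `size` bytes starting at `offset`, fetching only the blocks touched."""
        end = min(offset + size, self.length)
        out = bytearray()
        pos = offset
        while pos < end:
            index = pos // BLOCK_SIZE
            block = self._block(index)
            lo = pos - index * BLOCK_SIZE
            hi = min(len(block), end - index * BLOCK_SIZE)
            if hi <= lo:  # defensive: remote returned less data than expected
                break
            out += block[lo:hi]
            pos = index * BLOCK_SIZE + hi
        return bytes(out)
```

Only the blocks a read actually touches are downloaded, and repeated reads of the same region are served from the cache directory without another remote call.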

Phase 3: Searchable Remote Index

Users can search remote data from Phase 1 indexes without requiring all index data to be stored on disk.

This phase will allow users to remove index data from instance storage for remote storage-enabled indexes from Phase 1 while retaining the ability to search these indexes via the remote searching feature built in Phase 2. Until Phase 4 is implemented, these indexes will be read-only. The work required here is to implement the functionality to migrate an index from a Phase 1 remote-enabled index to a read-only, searchable, remote-only index, as well as to adapt the snapshot-based functionality from Phase 2 to work with these types of indexes.

Phase 4: Writable Searchable Remote Index

Users can write to searchable remote indexes created in Phase 3, without requiring index data to be stored on instance storage.

This phase requires extending the functionality built in Phase 1 to write to remote-only indexes as opposed to mirroring data on disk and in the remote store.

Phase 5: Cold Remote Index

Users can create indexes where both index data and metadata are stored in remote storage. Indexes are searchable and writable, possibly only via an asynchronous API due to the amount of time needed to process requests.

This phase requires designing and building a new mechanism for creating index readers and writers on demand from metadata and data located in a remote store.

Additional Work Streams

The following are additional features that are related to or enabled by remote storage.

Segment Replication w/ Remote Storage

Segment replication is a feature under development to replicate Lucene segments from the primary node to replicas (as opposed to replicating the original document, which requires replicas to do the indexing as well). If the primary node is also replicating Lucene segments to remote storage (Phase 1) then replicas can pull that data from remote storage instead of copying from the primary. This architecture leverages the remote store for fanout during replication, eliminating a bottleneck on the primary when a large number of replicas exist.
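As a rough illustration of the fanout argument, the sketch below counts how many bytes the primary must send to replicate one new segment. It is a simplified model with a hypothetical function name: it counts replication traffic only and ignores retries, compression, and any other uploads the primary performs.

```python
def primary_replication_bytes(segment_bytes, replica_count, via_remote_store):
    """Bytes the primary sends to make one new segment available to all replicas."""
    if via_remote_store:
        # One upload to the remote store; each replica downloads from there.
        return segment_bytes
    # Direct copy: the primary sends the segment to every replica itself.
    return segment_bytes * replica_count
```

Under this model, the primary's outbound traffic grows linearly with the replica count when copying directly, but stays constant when the remote store handles the fanout.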

Point in Time Restore

Phase 1 enables continuous backup to remote storage, which means it is in theory possible to implement a feature to restore to nearly any point-in-time that was backed up along the way.
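One way to picture a point-in-time restore is to pick the newest uploaded segment checkpoint at or before the target time, then replay the translog operations recorded after that checkpoint up to the target. The sketch below assumes this approach; the `Checkpoint` and `TranslogOp` types and the `restore_plan` function are hypothetical and do not describe how OpenSearch stores or restores this data.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Checkpoint:
    timestamp: int                  # hypothetical: when the segment upload completed
    segment_files: Tuple[str, ...]


@dataclass(frozen=True)
class TranslogOp:
    timestamp: int
    payload: str


def restore_plan(
    checkpoints: List[Checkpoint],  # assumed sorted by timestamp, ascending
    translog: List[TranslogOp],
    target_time: int,
) -> Tuple[Optional[Checkpoint], List[TranslogOp]]:
    """Return the base checkpoint and the translog operations to replay on top of it."""
    times = [c.timestamp for c in checkpoints]
    i = bisect.bisect_right(times, target_time)
    base = checkpoints[i - 1] if i > 0 else None
    start = base.timestamp if base is not None else float("-inf")
    # Replay only operations after the base checkpoint and up to the target time.
    replay = [op for op in translog if start < op.timestamp <= target_time]
    return base, replay
```

The granularity of such a restore would depend on how often segments and translog data are uploaded, which is why the text above says "nearly" any point in time.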

zehonghuang commented 2 years ago

I'm so happy to see this roadmap. I have been looking for an alternative to Searchable Snapshots recently. 💯

zehonghuang commented 2 years ago

Hi, "Ultrawarm has built everywhere", what's its meaning? Will searchable snapshot only run on AWS?

andrross commented 2 years ago

Hi, "Ultrawarm has built everywhere", what's its meaning?

Much of the implementation of Ultrawarm will be applicable here, so that code will be ported over where appropriate.

Will searchable snapshot only run on AWS?

Definitely not. We intend to build this on top of the existing repository interface, so any object store for which there is a repository implementation will be supported (AWS, Azure, GCP, HDFS, etc).

zehonghuang commented 2 years ago

This will help reduce the cost of storage. I will keep following this feature. Thank you for replying to my question.

andrross commented 1 year ago

Quick update on the progress here:

Phase 1 has been released as an experimental feature in OpenSearch 2.3! We welcome any and all feedback as we work towards finalizing this new capability.

Phase 2 is in active development and we're targeting OpenSearch 2.4 to have the first experimental version of this feature.

jrj0823 commented 1 year ago

Hi, the primary shard may write index data/translog to the remote store. If there is a split brain, the old primary may still accept bulk requests from clients, so files on remote storage could be written by multiple primaries. Can this situation happen? If so, is there any proposal to solve such a problem?

bugmakerrrrrr commented 1 year ago

hello, are there any plans to support searchable snapshots in ISM?

Bukhtawar commented 1 year ago

Hi, the primary shard may write index data/translog to the remote store. If there is a split brain, the old primary may still accept bulk requests from clients, so files on remote storage could be written by multiple primaries. Can this situation happen? If so, is there any proposal to solve such a problem?

This wouldn't happen today, since we ensure we talk to the replica copy to detect network partitions and isolated writers before writing to the remote store. You can read more on this in #3706.

kotwanikunal commented 1 year ago

hello, are there any plans to support searchable snapshots in ISM?

Hey @bugmakerrrrrr. Thanks for your interest in searchable snapshots. I have filed a feature request with the index management team here: https://github.com/opensearch-project/index-management/issues/808. Please track that issue for updates, and feel free to add any other requirements there.

carlos-neto-trustly commented 1 year ago

I'm so excited and happy about this Storage Roadmap 👍 .

I read the Searchable Snapshot documentation section, and there is one question that is not clear to me:

Is it possible for the search nodes to exceed their disk capacity with data queried from remote storage for the snapshot indexes? What is the behavior in this case? Will the cache be rotated based on on-demand queries?

marsupialtail commented 10 months ago

Looks like this is where the tech discussion is taking place, so I am curious to know if there's any open source material (not code, just information) that we can read to learn how Ultrawarm and Cold tier work. I just want to understand how they work to better size clusters. For example, are Ultrawarm searches distributed across Ultrawarm nodes? Will adding more nodes speed up the cold -> Ultrawarm migration and Ultrawarm searches? Please direct me to a more suitable place to ask this question if there is one, thanks!

andrross commented 10 months ago

@marsupialtail UltraWarm and Cold Tier are current AWS offerings, so AWS is the place to go for information about those features. This issue and the overall plan are about the features being built on top of the new remote store-based architecture.

yusizn commented 8 months ago

I'm so excited and happy about this Storage Roadmap 👍 .

I read the Searchable Snapshot documentation section, and there is one question that is not clear to me:

Is it possible for the search nodes to exceed their disk capacity with data queried from remote storage for the snapshot indexes? What is the behavior in this case? Will the cache be rotated based on on-demand queries?

Is there any new information on this?

andrross commented 8 months ago

I'm so excited and happy about this Storage Roadmap 👍. I read the Searchable Snapshot documentation section, and there is one question that is not clear to me: Is it possible for the search nodes to exceed their disk capacity with data queried from remote storage for the snapshot indexes? What is the behavior in this case? Will the cache be rotated based on on-demand queries?

Is there any new information on this?

Yes, data is offloaded from local disks in that case, with the least recently used data being evicted first.
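To make the eviction behavior concrete, here is a minimal sketch of a least-recently-used cache bounded by total bytes. It only illustrates the LRU policy mentioned above; the `LruByteCache` class and its methods are hypothetical and are not the OpenSearch file cache API.

```python
from collections import OrderedDict


class LruByteCache:
    """Keeps cached blocks under a byte budget, evicting the least recently used first."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, data):
        """Insert data and return the list of keys evicted to stay within capacity."""
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = data
        self.used_bytes += len(data)
        evicted = []
        while self.used_bytes > self.capacity_bytes and self._entries:
            old_key, old_data = self._entries.popitem(last=False)  # least recently used
            self.used_bytes -= len(old_data)
            evicted.append(old_key)
        return evicted
```

In this sketch an entry larger than the whole budget is evicted immediately after insertion; a real cache would need an explicit policy for that case.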

10000-ki commented 6 months ago

Hello, I'm very interested in this roadmap. Could you share the release schedule for Phases 3, 4, and 5?

andrross commented 6 months ago

@10000-ki We are working through the planning and estimating now and will keep this issue up to date when we have better visibility into the schedule.