elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Reduce DTS costs for cross zone data transfer within Elasticsearch #73501

Open dakrone opened 3 years ago

dakrone commented 3 years ago

DTS, or Data Transfer & Storage, is a significant cost for users running either on ESS or on their own cloud deployment. Users rely on forced awareness to maintain copies of indices in multiple availability zones for high availability. Elasticsearch currently does no special handling of data to reduce the amount of data transferred between zones.
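For readers unfamiliar with forced awareness, here is a minimal sketch of the cluster settings involved. The helper name `forced_awareness_settings` and the zone names are hypothetical and only for illustration; the sketch also assumes each node was started with a matching `node.attr.zone` value. The resulting body is what one would send to `PUT _cluster/settings`.

```python
import json


def forced_awareness_settings(zones, attribute="zone"):
    """Build a cluster-settings body that enables forced awareness.

    Hypothetical helper for illustration; the zone names passed in are
    assumptions, not values taken from this issue.
    """
    if not zones:
        raise ValueError("at least one zone is required")
    prefix = "cluster.routing.allocation.awareness"
    return {
        "persistent": {
            # Attribute used to spread shard copies across zones.
            f"{prefix}.attributes": attribute,
            # Listing every expected zone makes the awareness "forced":
            # copies are left unassigned rather than all placed in the
            # zones that remain when one zone is lost.
            f"{prefix}.force.{attribute}.values": ",".join(zones),
        }
    }


body = forced_awareness_settings(["zone-a", "zone-b"])
print(json.dumps(body, indent=2))
```

With these settings every replica lives in a different zone from its primary, so replication and recovery traffic for those copies necessarily crosses zone boundaries, which is where the DTS cost comes from.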

This meta issue links to issues with potential ideas for mitigating the cost of DTS.

Sources of inter-AZ data transfer

Some of these sources do not transfer data directly; however, their execution leads to data being transferred between availability zones.

Separate issues

elasticmachine commented 3 years ago

Pinging @elastic/es-distributed (Team:Distributed)

elasticmachine commented 3 years ago

Pinging @elastic/es-core-features (Team:Core/Features)

dakrone commented 3 years ago

https://github.com/elastic/elasticsearch/issues/73971 may also be related, dealing with DTS costs from async search.

seang-es commented 3 years ago

https://github.com/elastic/elasticsearch/issues/62194 related

DaveCTurner commented 2 years ago

I think the ability to relocate shards via snapshots has reduced or eliminated many of the DTS costs mentioned above. I'm therefore removing this from the distrib team area.

elasticsearchmachine commented 2 years ago

Pinging @elastic/es-data-management (Team:Data Management)