jina-ai / faq

What's the difference between shards and replicas? When should I use each? #15

Open alt-shreya opened 2 years ago

alt-shreya commented 2 years ago

A replica is an exact copy. The goal of replication is to keep the same data set on both the primary node and its secondary (replica) nodes.

Sharding, on the other hand, is segmentation. The data set is broken into shards, and each shard is stored on a different node. How the data is divided depends on the partitioning algorithm, for example hashing a key or splitting by key range.
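To make the contrast concrete, here is a minimal sketch in plain Python (not Jina code). The helper names `replicate`, `shard_index`, and `shard` are hypothetical, and hash-based partitioning is just one possible algorithm, chosen here as an assumption.

```python
import hashlib


def replicate(records, num_nodes):
    """Replication: every node holds a full copy of the data set."""
    return [list(records) for _ in range(num_nodes)]


def shard_index(key, num_shards):
    """Map a key to a shard with a stable hash (assumed partitioning scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def shard(records, num_shards):
    """Sharding: every record lives on exactly one node."""
    shards = [[] for _ in range(num_shards)]
    for key in records:
        shards[shard_index(key, num_shards)].append(key)
    return shards


records = [f"doc-{i}" for i in range(10)]
replicas = replicate(records, 3)  # 3 nodes, 10 records each
shards = shard(records, 3)        # 3 nodes, 10 records in total
```

With replication, losing a node loses no data, but each node must be able to hold everything. With sharding, each node holds only a slice, so capacity grows with the number of nodes, but a query over the whole data set has to consult every shard.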

In Jina we support two ways of scaling:

- Replicas can be used with any Executor type and are typically used for performance and availability.
- Shards are used for partitioning data and should only be used with Indexers, since they store state.
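The difference in how requests reach replicas versus shards can be sketched roughly as follows. This is a plain-Python simulation, not Jina's implementation: the `Worker` class, the round-robin policy, and the byte-sum shard choice are all assumptions made for illustration.

```python
from itertools import cycle


class Worker:
    """Hypothetical stand-in for one Executor instance."""

    def __init__(self, name):
        self.name = name
        self.store = []  # only used by stateful, indexer-like workers


def make_replica_router(workers):
    """Replicas: each request is served by exactly one worker (round-robin assumed)."""
    rotation = cycle(workers)
    return lambda request: next(rotation).name


def index_sharded(workers, doc_id):
    """Shards: a document is written to exactly one shard."""
    target = workers[sum(doc_id.encode("utf-8")) % len(workers)]
    target.store.append(doc_id)
    return target.name


def search_sharded(workers, predicate):
    """Shards: a query visits every shard and merges the partial results."""
    hits = []
    for worker in workers:
        hits.extend(doc for doc in worker.store if predicate(doc))
    return sorted(hits)


# Stateless work: any replica can answer, so requests are spread out.
route = make_replica_router([Worker("replica-0"), Worker("replica-1")])
served_by = [route(f"req-{i}") for i in range(4)]

# Stateful work: each shard owns part of the index.
shards = [Worker("shard-0"), Worker("shard-1")]
for i in range(6):
    index_sharded(shards, f"doc-{i}")
found = search_sharded(shards, lambda doc: doc.endswith(("1", "4")))
```

This is why shards only make sense for Executors that store state: a stateless Executor has nothing to partition, so extra copies of it are simply replicas.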

~ answered by Anoushka Jha