opensearch-project / opensearch-benchmark-workloads

Official workloads used by OpenSearch Benchmark (OSB)
https://opensearch.org/docs/latest/benchmark/

[BUG] Poor recall if search clients is greater than 40 for cohere-10m workload in vector search #347

Open VijayanB opened 1 month ago

VijayanB commented 1 month ago

What is the bug?

When the vector search workload is executed with a large number of search clients (such that each client gets < 5% of the queries), recall is very poor. This is not a problem with the vector search algorithm, since recall on the same dataset is 0.9 when the number of search clients is substantially lower.
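To make the "< 5% of queries" condition concrete, here is a minimal sketch of the arithmetic. It assumes the query set is split evenly across search clients, which this thread does not confirm; `per_client_share` and `min_clients_below` are hypothetical helper names, not part of OSB.

```python
from fractions import Fraction


def per_client_share(num_clients: int) -> Fraction:
    """Fraction of the query set each client handles, ASSUMING an even split."""
    if num_clients < 1:
        raise ValueError("num_clients must be >= 1")
    return Fraction(1, num_clients)


def min_clients_below(threshold: Fraction) -> int:
    """Smallest client count whose per-client share is strictly below threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    clients = 1
    while per_client_share(clients) >= threshold:
        clients += 1
    return clients


if __name__ == "__main__":
    # Client counts mentioned in this issue.
    for clients in (15, 18, 40):
        print(clients, float(per_client_share(clients)))
```

Under that even-split assumption, 40 clients would each handle 2.5% of the queries, and the share drops below 5% once there are more than 20 clients.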

How can one reproduce the bug?

Execute the 10m corpus vector search workload with more than 40 search clients.

What is the expected behavior?

Recall should not be impacted

What is your host/environment?

N/A

Do you have any screenshots?

N/A

Do you have any additional context?

N/A

layavadi commented 1 month ago

With up to 15 clients, recall values are all non-zero. Beyond 18 clients, 0 values start to appear. This is with 3 nodes.
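For reference when comparing runs, a minimal sketch of a generic recall@k calculation and of counting how many queries report a recall of 0. This is a textbook-style definition and not necessarily how OSB computes recall; the function names and the sample values in the usage note are illustrative only.

```python
from typing import Sequence


def recall_at_k(retrieved: Sequence[int], ground_truth: Sequence[int], k: int) -> float:
    """Share of the top-k ground-truth neighbours found in the top-k results."""
    if k < 1:
        raise ValueError("k must be >= 1")
    truth = set(ground_truth[:k])
    if not truth:
        # No ground truth available: report 0.0 rather than dividing by zero.
        return 0.0
    hits = len(set(retrieved[:k]) & truth)
    return hits / len(truth)


def zero_recall_fraction(recalls: Sequence[float]) -> float:
    """Fraction of per-query recall values that are exactly 0."""
    if not recalls:
        return 0.0
    return sum(1 for r in recalls if r == 0.0) / len(recalls)
```

For example, `recall_at_k([1, 2, 3], [1, 2, 4], 3)` counts two hits out of three ground-truth neighbours, and `zero_recall_fraction` applied to the per-query values of a run gives one number to compare across client counts.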

IanHoang commented 1 month ago

There's a proposal to look into index / search clients scaling in OSB (more info can be found here).

What's the load generation host configuration?

layavadi commented 1 month ago

16 vCPUs and 64 GB of memory. CPU utilisation on the load generator with 40 clients was less than 40%.

IanHoang commented 1 month ago

up to 15 clients recall values are all non zero. Beyond 18 Clients 0 values start to get populated . This is 3 nodes.

When you mention 3 nodes, are you saying that there are three LG Hosts or a single load generation host running OSB against a 3 node cluster?

layavadi commented 1 month ago

Single load generator with 3 cluster nodes


IanHoang commented 1 week ago

@layavadi To help with the investigation, could you attach some charts associated with the tests you have been running? It'd be good to include three charts: