milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: Milvus restarted when searching with nb=1000000 after creating index #19682

Closed NicoYuan1986 closed 2 years ago

NicoYuan1986 commented 2 years ago

Is there an existing issue for this?

Environment

- Milvus version:85e04d84
- Deployment mode(standalone or cluster):standalone
- SDK version(e.g. pymilvus v2.0.0rc2):2.2.0.dev33
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Milvus restarts unexpectedly when searching a collection with nb=1000000 after creating an index. I am not sure whether the index is the cause, but the same search succeeds when no index has been created.

[search] retry:4, cost: 0.27s, reason: <_MultiThreadedRendezvous: StatusCode.UNAVAILABLE, failed to connect to all addresses>
[search] retry:5, cost: 0.81s, reason: <_MultiThreadedRendezvous: StatusCode.UNAVAILABLE, failed to connect to all addresses>
[search] retry:6, cost: 2.43s, reason: <_MultiThreadedRendezvous: StatusCode.UNAVAILABLE, failed to connect to all addresses>
[search] retry:7, cost: 7.290000000000001s, reason: <_MultiThreadedRendezvous: StatusCode.UNAVAILABLE, failed to connect to all addresses>
[search] retry:8, cost: 21.870000000000005s, reason: <_MultiThreadedRendezvous: StatusCode.UNAVAILABLE, failed to connect to all addresses>
[search] retry:9, cost: 60s, reason: <_MultiThreadedRendezvous: StatusCode.UNAVAILABLE, failed to connect to all addresses>
[search] retry:10, cost: 60s, reason: <_MultiThreadedRendezvous: StatusCode.UNAVAILABLE, failed to connect to all addresses>
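
The `cost` values in the log above grow by a factor of 3 per retry and stop at 60s, which is consistent with a capped exponential back-off on the client side. The sketch below reproduces that schedule. The initial delay (0.01s), multiplier (3), and cap (60s) are inferred from the log lines only, and the function name `backoff_delays` is hypothetical, not part of pymilvus.

```python
def backoff_delays(retries, initial=0.01, multiplier=3.0, cap=60.0):
    """Return the wait before each retry, growing geometrically up to a cap.

    The defaults are assumptions inferred from the log, not values taken
    from the pymilvus source.
    """
    delays = []
    delay = initial
    for _ in range(retries):
        # Never wait longer than the cap, however large the raw delay gets.
        delays.append(min(delay, cap))
        delay *= multiplier
    return delays


if __name__ == "__main__":
    # Print the schedule in the same shape as the client log.
    for attempt, delay in enumerate(backoff_delays(10), start=1):
        print(f"retry:{attempt}, cost: {delay:.2f}s")
```

Under these assumptions, retries 9 and 10 both hit the 60s cap, matching the last two log lines.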

Expected Behavior

The search succeeds.

Steps To Reproduce

import time

import h5py
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

# Open the dataset read-only and connect to the standalone Milvus instance.
dataset = h5py.File("sift-128-euclidean.hdf5", "r")
connections.connect(host="10.100.31.101", port="19530")

nq = 1      # number of query vectors per search
limit = 10  # topK

# Schema: an INT64 primary key plus a 128-dimensional float vector.
int64_field = FieldSchema(name="int64", dtype=DataType.INT64, is_primary=True)
float_vector = FieldSchema(name="float_vector", dtype=DataType.FLOAT_VECTOR, dim=128)
schema = CollectionSchema(fields=[int64_field, float_vector])
collection = Collection("test_search_1", schema=schema)

# Insert 1,000,000 primary keys alongside the train vectors.
res = collection.insert([list(range(1000000)), dataset["train"]])

# Build the index, then load the collection for searching.
index_param = {"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 100}}
collection.create_index("float_vector", index_param)
collection.load()

default_search_params = {"metric_type": "L2", "params": {"nprobe": 10}, "offset": 100}

# Run the same search 200 times and record the latency of each call.
time1 = []
for i in range(200):
    t1 = time.time()
    res1 = collection.search(dataset["test"][:nq], "float_vector", default_search_params, limit, "int64 >= 0")
    tt1 = time.time() - t1
    time1.append(tt1)

Milvus Log

test.log

Anything else?

No response

yanliang567 commented 2 years ago

/assign @soothing-rain /unassign

soothing-rain commented 2 years ago

@NicoYuan1986 There's no error in the log. Can you double check if Milvus restarted from OOM?

xiaofan-luan commented 2 years ago

@soothing-rain Milvus should enforce a limit of nq < 16,384. Could you check why it does not take effect?

soothing-rain commented 2 years ago

Turns out there's a limit on topK but not on nq.

For example, this call is accepted:

hello_milvus.search(vectors_to_search[:1000000], "embeddings", search_params, limit=16384, output_fields=["random"])

xiaofan-luan commented 2 years ago

The docs at https://milvus.io/docs/limitations.md list a limit. If the code does not enforce one, we should add it.
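
As an illustration of the kind of check discussed in this thread, here is a minimal client-side sketch that rejects an oversized nq or topK before a search request is sent. The helper name `check_search_limits` and the inclusive bound of 16384 for both values are assumptions based on the numbers quoted above; this is not the actual Milvus server-side validation.

```python
# Assumed bounds, taken from the numbers mentioned in this thread.
MAX_NQ = 16384
MAX_TOPK = 16384


def check_search_limits(nq, limit, max_nq=MAX_NQ, max_topk=MAX_TOPK):
    """Raise ValueError if nq or limit (topK) falls outside the allowed range.

    Hypothetical helper for illustration; not part of pymilvus.
    """
    if not 1 <= nq <= max_nq:
        raise ValueError(f"nq={nq} is out of range [1, {max_nq}]")
    if not 1 <= limit <= max_topk:
        raise ValueError(f"limit={limit} is out of range [1, {max_topk}]")


if __name__ == "__main__":
    check_search_limits(nq=1, limit=10)  # accepted
    try:
        check_search_limits(nq=1_000_000, limit=16384)
    except ValueError as err:
        print(f"rejected: {err}")
```

A caller would run `check_search_limits(len(vectors), limit)` just before `collection.search(...)`, so an oversized request fails fast on the client instead of reaching the server.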

soothing-rain commented 2 years ago

/unassign /assign @NicoYuan1986

Please verify.

NicoYuan1986 commented 2 years ago

Solved! Thank you so much!