milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

Sometimes the client is frozen during loading. #6554

Closed del-zhenwu closed 3 years ago

del-zhenwu commented 3 years ago

Describe the bug:

Sometimes the client freezes during load operations, even though only the sift1m dataset needs to be loaded. As the following monitoring result shows, the server stops receiving data and memory stops rising after 4 min, but the client gets no response within 20 min.

[2021-07-15 07:51:04,724] [   DEBUG] - Milvus insert run in 1.14s (milvus_benchmark.client:49)
[2021-07-15 07:51:06,281] [   DEBUG] - Milvus insert run in 1.13s (milvus_benchmark.client:49)
[2021-07-15 07:51:07,987] [   DEBUG] - Milvus insert run in 1.14s (milvus_benchmark.client:49)
[2021-07-15 07:51:07,992] [   DEBUG] - End insert, start flush (milvus_benchmark.runners.accuracy:236)
[2021-07-15 07:51:10,522] [   DEBUG] - Milvus flush run in 2.53s (milvus_benchmark.client:49)
[2021-07-15 07:51:10,523] [   DEBUG] - End flush (milvus_benchmark.runners.accuracy:238)
[2021-07-15 07:51:10,525] [   DEBUG] - Row count: 1000000 in collection: <sift_128_euclidean> (milvus_benchmark.client:389)
[2021-07-15 07:51:10,525] [    INFO] - Table: sift_128_euclidean, row count: 1000000 (milvus_benchmark.runners.accuracy:240)
[2021-07-15 07:51:10,527] [    INFO] - None (milvus_benchmark.client:276)
[2021-07-15 07:51:10,528] [    INFO] - Drop index: sift_128_euclidean (milvus_benchmark.client:287)
[2021-07-15 07:51:10,530] [    INFO] - Re-create index: sift_128_euclidean (milvus_benchmark.runners.accuracy:245)
[2021-07-15 07:51:10,530] [    INFO] - Building index start, collection_name: sift_128_euclidean, index_type: IVF_FLAT, metric_type: L2 (milvus_benchmark.client:261)
[2021-07-15 07:51:10,530] [    INFO] - {'nlist': 1024} (milvus_benchmark.client:263)
[2021-07-15 07:51:49,378] [   DEBUG] - Milvus create_index run in 38.85s (milvus_benchmark.client:49)
[2021-07-15 07:51:49,379] [    INFO] - None (milvus_benchmark.client:276)
[2021-07-15 07:51:49,380] [    INFO] - {'index_type': 'flat', 'metric_type': None, 'index_param': None} (milvus_benchmark.runners.accuracy:247)
[2021-07-15 07:51:49,380] [    INFO] - Start load collection: sift_128_euclidean (milvus_benchmark.runners.accuracy:248)

[Screenshot: Kubernetes-Compute-Resources-Pod-Grafana (9)]

logs:

https://kibana-dev.zilliz.cc/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'2021-07-15T07:30:00.000Z',to:'2021-07-15T08:00:00.000Z'))&_a=(columns:!(log),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:ad1e1c50-c9d9-11eb-9e1a-b3dcad9693fd,key:kubernetes.pod_name,negate:!f,params:!(benchmark-2mdch-1-milvus-standalone-576ff4fc7b-ftz7q),type:phrases,value:benchmark-2mdch-1-milvus-standalone-576ff4fc7b-ftz7q),query:(bool:(minimum_should_match:1,should:!((match_phrase:(kubernetes.pod_name:benchmark-2mdch-1-milvus-standalone-576ff4fc7b-ftz7q))))))),hideChart:!f,index:ad1e1c50-c9d9-11eb-9e1a-b3dcad9693fd,interval:auto,query:(language:kuery,query:''),sort:!(!('@timestamp',desc)))

Expected behavior:

The load operation should finish in 3-4 min.

Steps/Code to reproduce:

  1. Create a collection.
  2. Insert 1,000,000 entities in total (sift-1m), 50,000 per request.
  3. Build an IVF_FLAT index with nlist: 1024.
  4. Flush.
  5. Load.
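
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `VectorClient` and its method names are a hypothetical interface, not the real pymilvus or milvus_benchmark API, and random vectors stand in for the actual sift-1m data. The collection name, dimension (128), batch size, and `nlist` value are taken from the report and the log.

```python
import random
from typing import Iterator, List, Protocol


class VectorClient(Protocol):
    """Hypothetical client interface (assumption, not the pymilvus API)."""

    def create_collection(self, name: str, dim: int) -> None: ...
    def insert(self, name: str, vectors: List[List[float]]) -> None: ...
    def create_index(self, name: str, index_type: str, params: dict) -> None: ...
    def flush(self, name: str) -> None: ...
    def load_collection(self, name: str) -> None: ...


def batch_sizes(total: int, batch_size: int) -> Iterator[int]:
    """Yield the row count of each insert request until `total` rows are covered."""
    remaining = total
    while remaining > 0:
        size = min(batch_size, remaining)
        yield size
        remaining -= size


def reproduce(
    client: VectorClient,
    name: str = "sift_128_euclidean",
    dim: int = 128,
    total: int = 1_000_000,
    batch_size: int = 50_000,
) -> int:
    """Run the five reported steps in order; return the number of insert requests."""
    client.create_collection(name, dim)                      # step 1
    requests = 0
    for size in batch_sizes(total, batch_size):              # step 2
        # Placeholder data: random vectors instead of the real sift-1m set.
        vectors = [[random.random() for _ in range(dim)] for _ in range(size)]
        client.insert(name, vectors)
        requests += 1
    client.create_index(name, "IVF_FLAT", {"nlist": 1024})   # step 3
    client.flush(name)                                       # step 4
    client.load_collection(name)                             # step 5: the call reported to hang
    return requests
```

With the default values this issues 20 insert requests of 50,000 entities each before the index build, flush, and load.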

Environment:

Configuration file:

Additional context:

del-zhenwu commented 3 years ago

/kind bug

del-zhenwu commented 3 years ago

Another issue: the timeout was set to 3000s, but there was still no response after 1h.
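
Until the timeout is honored, a benchmark client could guard the blocking call with its own deadline. The following is a minimal sketch using only the Python standard library; `call_with_deadline` is a hypothetical helper, not part of pymilvus or milvus_benchmark. It only unblocks the caller: the underlying call is not cancelled and keeps running in a daemon thread.

```python
import threading


def call_with_deadline(fn, deadline_s, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread.

    Raises TimeoutError if the call has not returned after deadline_s seconds.
    The worker thread is not stopped; it is simply abandoned.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = fn(*args, **kwargs)
        except BaseException as exc:  # propagate any failure to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(deadline_s)  # wait at most deadline_s seconds

    if thread.is_alive():
        raise TimeoutError(f"call still running after {deadline_s}s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

For example, wrapping the load call as `call_with_deadline(load, 3000)` (where `load` is whatever function performs the load) would let the benchmark fail after 3000s instead of waiting indefinitely.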

xige-16 commented 3 years ago

#6908 and #6935 have fixed the load bug.

xiaofan-luan commented 3 years ago

Duplicate; closing for now.