milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [benchmark][standalone] Milvus recall was calculated incorrectly and the error was reported as "Length of returned topk is 0" #19314

Closed jingkl closed 1 year ago

jingkl commented 1 year ago

Is there an existing issue for this?

Environment

- Milvus version: master-20220920-5141e05c
- Deployment mode (standalone or cluster): standalone
- SDK version (e.g. pymilvus v2.0.0rc2): pymilvus 2.2.0dev32
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

server-instance fouram-cron-1663689600-4 server-configmap server-single-8c16m client-configmap client-acc-sift-hnsw

NAME                                                              READY   STATUS      RESTARTS      AGE     IP             NODE         NOMINATED NODE   READINESS GATES
fouram-cron-1663689600-4-etcd-0                                   1/1     Running     0             114s    10.104.1.104   4am-node10   <none>           <none>
fouram-cron-1663689600-4-milvus-standalone-cb98b469c-zxddt        1/1     Running     0             114s    10.104.5.154   4am-node12   <none>           <none>
fouram-cron-1663689600-4-minio-74c49dbff4-kx2wq                   1/1     Running     0             114s    10.104.1.92    4am-node10   <none>           <none>
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,871] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,872] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,872] [   ERROR] - [get_recall_value] Length of returned topk is 0, please check. (milvus_benchmark.runners.utils:243)
[2022-09-20 16:11:39,872] [   ERROR] - Traceback (most recent call last):
  File "main.py", line 95, in run_suite
    result = runner.run_case(case_metric, **case)
  File "/src/milvus_benchmark/runners/accuracy.py", line 299, in run_case
    acc_value = utils.get_recall_value(true_ids[:nq, :top_k].tolist(), result_ids)
  File "/src/milvus_benchmark/runners/utils.py", line 245, in get_recall_value
    raise ValueError("[get_recall_value] The result of topk is wrong, please check: {}".format(result_ids))
ValueError: [get_recall_value] The result of topk is wrong, please check: [[], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [].....]
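
For readers unfamiliar with the check that fails here, the sketch below shows one plausible way a recall computation can reject empty search results. It is not the actual `milvus_benchmark.runners.utils.get_recall_value` code: the function name `recall_at_k` and its error messages are hypothetical, and only the general behavior (raise `ValueError` when a query returns no ids) is taken from the traceback above.

```python
from typing import Sequence


def recall_at_k(true_ids: Sequence[Sequence[int]],
                result_ids: Sequence[Sequence[int]]) -> float:
    """Fraction of ground-truth neighbours found in the returned ids.

    Hypothetical helper for illustration; not the milvus_benchmark code.
    """
    if len(true_ids) != len(result_ids):
        raise ValueError(
            f"query count mismatch: {len(true_ids)} ground-truth rows, "
            f"{len(result_ids)} result rows"
        )

    # An empty id list means the search itself returned nothing, so a
    # recall number would be meaningless. Fail loudly instead.
    empty = [i for i, ids in enumerate(result_ids) if len(ids) == 0]
    if empty:
        raise ValueError(
            f"{len(empty)} of {len(result_ids)} queries returned no ids "
            f"(first empty query index: {empty[0]})"
        )

    hits = 0
    total = 0
    for truth, found in zip(true_ids, result_ids):
        hits += len(set(truth) & set(found))
        total += len(truth)
    if total == 0:
        raise ValueError("ground truth is empty")
    return hits / total
```

With this shape, a result like the `[[], [], ...]` shown in the traceback fails before any recall value is produced, which points at the search returning no hits rather than at the recall arithmetic.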

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

client-acc-sift-hnsw

{
    "config.yaml": "ann_accuracy:
          collections:
            -
              milvus:
                cache_config.cpu_cache_capacity: 16GB
                engine_config.use_blas_threshold: 1100
              source_file: /test/milvus/ann_hdf5/sift-128-euclidean.hdf5
              collection_name: sift_128_euclidean
              index_types: ['hnsw']
              index_params:
                M: [16]
                efConstruction: [500]
              top_ks: [10]
              nqs: [10000]
              search_params:
                ef: [16, 32, 64, 128, 256, 512]
        "
}
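
As a rough illustration of what this config asks for, the sketch below expands the listed values into individual search cases. How `milvus_benchmark` actually iterates over these lists is an assumption here; `expand_cases` and the output keys are hypothetical names, and only the parameter values come from the config above.

```python
from itertools import product


def expand_cases(index_params, search_params, top_ks, nqs):
    """Expand list-valued parameters into one dict per test case.

    Hypothetical helper; the real runner may order or group cases differently.
    """
    def grid(params):
        # Cartesian product over every list-valued parameter.
        keys = list(params)
        return [dict(zip(keys, combo))
                for combo in product(*(params[k] for k in keys))]

    return [
        {"index_param": ip, "top_k": k, "nq": nq, "search_param": sp}
        for ip, k, nq, sp in product(
            grid(index_params), top_ks, nqs, grid(search_params)
        )
    ]


cases = expand_cases(
    index_params={"M": [16], "efConstruction": [500]},
    search_params={"ef": [16, 32, 64, 128, 256, 512]},
    top_ks=[10],
    nqs=[10000],
)
for case in cases:
    print(case)
```

Under this assumption the config yields six search cases, one per `ef` value, all against a single HNSW index built with `M=16` and `efConstruction=500`.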

jingkl commented 1 year ago

IVF_SQ8 recall has dropped to only 49%; it was previously 97%. I think this is the same problem as above.

server-instance fouram-cron-1663689600-3 server-configmap server-single-8c16m client-configmap client-acc-sift-ivf-sq8

NAME                                                              READY   STATUS            RESTARTS      AGE     IP             NODE         NOMINATED NODE   READINESS GATES
fouram-cron-1663689600-3-etcd-0                                   1/1     Running           0             2m11s   10.104.9.240   4am-node14   <none>           <none>
fouram-cron-1663689600-3-milvus-standalone-7db876dfc6-2fdk7       1/1     Running           0             2m10s   10.104.9.237   4am-node14   <none>           <none>
fouram-cron-1663689600-3-minio-77fc748bbc-s9vcq                   1/1     Running           0             2m11s   10.104.9.224   4am-node14   <none>           <none>
[2022-09-20 16:39:51,565] [   DEBUG] - Milvus query run in 16.8105s (milvus_benchmark.client:57)
[2022-09-20 16:40:08,555] [   DEBUG] - Milvus query run in 16.9845s (milvus_benchmark.client:57)
[2022-09-20 16:40:25,585] [   DEBUG] - Milvus query run in 17.0241s (milvus_benchmark.client:57)
[2022-09-20 16:40:42,530] [   DEBUG] - Milvus query run in 16.9365s (milvus_benchmark.client:57)
[2022-09-20 16:40:59,642] [   DEBUG] - Milvus query run in 17.1075s (milvus_benchmark.client:57)
[2022-09-20 16:41:16,477] [   DEBUG] - Milvus query run in 16.8299s (milvus_benchmark.client:57)
[2022-09-20 16:41:33,495] [   DEBUG] - Milvus query run in 17.0136s (milvus_benchmark.client:57)
[2022-09-20 16:41:50,508] [   DEBUG] - Milvus query run in 17.0094s (milvus_benchmark.client:57)
[2022-09-20 16:42:07,524] [   DEBUG] - Milvus query run in 17.0103s (milvus_benchmark.client:57)
[2022-09-20 16:42:08,072] [    INFO] - {'acc': 0.499, 'search_rps': 17.01031994819641, 'rps_pv': 1.701031994819641} (milvus_benchmark.main:99)
[2022-09-20 16:42:08,073] [   DEBUG] - {'type': 'ann_accuracy', 'value': {'acc': 0.499, 'search_rps': 17.01031994819641, 'rps_pv': 1.701031994819641}} (milvus_benchmark.main:107)
[2022-09-20 16:42:08,073] [   DEBUG] - {'_version': '0.1', '_type': 'case', 'run_id': 1663699017, 'mode': 'local', 'server': <milvus_benchmark.metrics.models.server.Server object at 0x7f5f588bb1f0>, 'hardware': <milvus_benchmark.metrics.models.hardware.Hardware object at 0x7f5f588bb130>, 'env': <milvus_benchmark.metrics.models.env.Env object at 0x7f5f588bb0a0>, 'status': 'RUN_SUCC', 'err_message': '', 'collection': {'dimension': 128, 'metric_type': 'l2', 'dataset_name': 'sift_128_euclidean', 'shards_num': None}, 'index': {'index_type': 'ivf_sq8', 'index_param': {'nlist': 1024}}, 'search': {'nq': 10000, 'topk': 10, 'search_param': {'nprobe': 512}, 'filter': [], 'guarantee_timestamp': None}, 'run_params': None, 'metrics': {'type': 'ann_accuracy', 'value': {'acc': 0.499, 'search_rps': 17.01031994819641, 'rps_pv': 1.701031994819641}}, 'datetime': '2022-09-20 16:04:10.797146', 'type': 'metric'} (milvus_benchmark.metric.api:30)
[2022-09-20 16:42:08,084] [   DEBUG] - {'_version': '0.1', '_type': 'metric', 'run_id': 1663699017, 'mode': 'local', 'server': <milvus_benchmark.metrics.models.server.Server object at 0x7f5f5b7ae9d0>, 'hardware': <milvus_benchmark.metrics.models.hardware.Hardware object at 0x7f5f5a1721f0>, 'env': <milvus_benchmark.metrics.models.env.Env object at 0x7f5f5a172190>, 'status': 'RUN_SUCC', 'err_message': '', 'collection': {'dimension': 128, 'metric_type': 'l2', 'dataset_name': 'sift_128_euclidean', 'shards_num': None}, 'index': {}, 'search': None, 'run_params': None, 'metrics': {'type': 'ann_accuracy', 'value': {}}, 'datetime': '2022-09-20 16:04:10.797146', 'type': 'metric'} (milvus_benchmark.metric.api:30)
[2022-09-20 16:42:08,130] [    INFO] - All tests run finshed (milvus_benchmark.main:277)

yanliang567 commented 1 year ago

/assign @cydrain /unassign

jingkl commented 1 year ago

This issue no longer reproduces at the moment, so I am closing it for now.