milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: No L0 compaction tasks completed #34460

ThreadDao closed this issue 1 month ago

ThreadDao commented 1 month ago

Environment

- Milvus version: 2.4-20240705-261b61e8
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

  1. Create a collection with 2 shards and enable partition_key with 32 num_partitions.
  2. Create an index.
  3. Insert 30 million 128-dimensional vectors and flush.
  4. Build the index and load the collection.
  5. Send concurrent requests: search + upsert + flush.
  6. According to the Grafana monitoring metrics of level-zero-par-key-op-2-7900, no L0 compaction tasks completed.
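
For readers without access to the Argo workflow, the steps above could be approximated with a pymilvus script along the following lines. This is a rough sketch, not the actual test harness: the collection name, field names, endpoint URI, HNSW index parameters, batch sizes, and run duration are all assumptions that the report does not specify.

```python
import random
import threading
import time

COLLECTION = "l0_compaction_repro"  # hypothetical collection name
DIM = 128                           # vector dimension from the report


def make_batch(start_id, count, dim=DIM, num_keys=1000, seed=None):
    """Build one column-ordered batch: [ids, partition keys, vectors]."""
    rng = random.Random(seed)
    ids = list(range(start_id, start_id + count))
    keys = [i % num_keys for i in ids]  # assumed partition-key distribution
    vectors = [[rng.random() for _ in range(dim)] for _ in ids]
    return [ids, keys, vectors]


def run_repro(uri="http://localhost:19530", total_rows=30_000_000,
              batch_size=10_000, upsert_size=100, duration_s=600):
    # Imported here so make_batch() stays usable without pymilvus installed.
    from pymilvus import (Collection, CollectionSchema, DataType,
                          FieldSchema, connections)

    connections.connect(alias="default", uri=uri)  # assumed endpoint

    # Step 1: 2 shards, partition key enabled with 32 partitions.
    schema = CollectionSchema([
        FieldSchema("id", DataType.INT64, is_primary=True),
        FieldSchema("key", DataType.INT64, is_partition_key=True),
        FieldSchema("vec", DataType.FLOAT_VECTOR, dim=DIM),
    ])
    coll = Collection(COLLECTION, schema, shards_num=2, num_partitions=32)

    # Step 2: create an index (index type and params are assumptions).
    coll.create_index("vec", {"index_type": "HNSW", "metric_type": "L2",
                              "params": {"M": 8, "efConstruction": 200}})

    # Step 3: insert the data in batches, then flush.
    for start in range(0, total_rows, batch_size):
        coll.insert(make_batch(start, batch_size))
    coll.flush()

    # Step 4: load the collection for search.
    coll.load()

    # Step 5: run search, upsert, and flush concurrently until the deadline.
    deadline = time.time() + duration_s

    def loop(action):
        while time.time() < deadline:
            action()

    def search():
        coll.search(data=make_batch(0, 1)[2], anns_field="vec",
                    param={"metric_type": "L2", "params": {"ef": 64}},
                    limit=10)

    def upsert():
        start = random.randrange(0, total_rows - upsert_size + 1)
        coll.upsert(make_batch(start, upsert_size))

    threads = [threading.Thread(target=loop, args=(fn,))
               for fn in (search, upsert, coll.flush)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Calling `run_repro()` against a test cluster with a smaller `total_rows` should be enough to generate the delete traffic that upserts produce; whether L0 compaction tasks complete would then be checked in Grafana, as in step 6.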

By the way, the import metrics show no data.

Expected Behavior

No response

Steps To Reproduce

https://argo-workflows.zilliz.cc/archived-workflows/qa/8c535d6e-68ea-4dc3-ac49-13772107181f?nodeId=level-zero-stable-1720105200-try-2331900307

Milvus Log

pods:

level-zero-par-key-op-2-7900-etcd-0                               1/1     Running            0              12h     10.104.17.197   4am-node23   <none>           <none>
level-zero-par-key-op-2-7900-etcd-1                               1/1     Running            0              12h     10.104.20.18    4am-node22   <none>           <none>
level-zero-par-key-op-2-7900-etcd-2                               1/1     Running            0              12h     10.104.16.224   4am-node21   <none>           <none>
level-zero-par-key-op-2-7900-milvus-datanode-57d586fcf4-ggscp     1/1     Running            0              12h     10.104.20.45    4am-node22   <none>           <none>
level-zero-par-key-op-2-7900-milvus-datanode-57d586fcf4-j8hth     1/1     Running            0              12h     10.104.32.112   4am-node39   <none>           <none>
level-zero-par-key-op-2-7900-milvus-indexnode-b64979c56-qtgn4     1/1     Running            0              12h     10.104.5.45     4am-node12   <none>           <none>
level-zero-par-key-op-2-7900-milvus-indexnode-b64979c56-sqchw     1/1     Running            0              12h     10.104.6.74     4am-node13   <none>           <none>
level-zero-par-key-op-2-7900-milvus-mixcoord-6795788c48-wpj48     1/1     Running            0              12h     10.104.17.223   4am-node23   <none>           <none>
level-zero-par-key-op-2-7900-milvus-proxy-7fd59c66d5-g4zg9        1/1     Running            0              12h     10.104.26.210   4am-node32   <none>           <none>
level-zero-par-key-op-2-7900-milvus-querynode-0-8479f96668b9bdn   1/1     Running            0              12h     10.104.15.75    4am-node20   <none>           <none>
level-zero-par-key-op-2-7900-milvus-querynode-0-8479f96668hl8j7   1/1     Running            0              12h     10.104.21.140   4am-node24   <none>           <none>
level-zero-par-key-op-2-7900-milvus-querynode-0-8479f96668rl89n   1/1     Running            0              12h     10.104.26.211   4am-node32   <none>           <none>
level-zero-par-key-op-2-7900-milvus-querynode-0-8479f96668xvb8x   1/1     Running            0              12h     10.104.19.10    4am-node28   <none>           <none>
level-zero-par-key-op-2-7900-minio-0                              1/1     Running            0              12h     10.104.20.20    4am-node22   <none>           <none>
level-zero-par-key-op-2-7900-minio-1                              1/1     Running            0              12h     10.104.17.196   4am-node23   <none>           <none>
level-zero-par-key-op-2-7900-minio-2                              1/1     Running            0              12h     10.104.16.225   4am-node21   <none>           <none>
level-zero-par-key-op-2-7900-minio-3                              1/1     Running            0              12h     10.104.18.252   4am-node25   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-bookie-0                      1/1     Running            0              12h     10.104.20.21    4am-node22   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-bookie-1                      1/1     Running            0              12h     10.104.17.198   4am-node23   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-bookie-2                      1/1     Running            0              12h     10.104.16.229   4am-node21   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-bookie-init-mpzp2             0/1     Completed          0              12h     10.104.1.237    4am-node10   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-broker-0                      1/1     Running            0              12h     10.104.1.238    4am-node10   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-proxy-0                       1/1     Running            0              12h     10.104.17.192   4am-node23   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-pulsar-init-lfsfm             0/1     Completed          0              12h     10.104.20.12    4am-node22   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-recovery-0                    1/1     Running            0              12h     10.104.14.216   4am-node18   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-zookeeper-0                   1/1     Running            0              12h     10.104.20.19    4am-node22   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-zookeeper-1                   1/1     Running            0              12h     10.104.17.200   4am-node23   <none>           <none>
level-zero-par-key-op-2-7900-pulsar-zookeeper-2                   1/1     Running            0              12h     10.104.16.235   4am-node21   <none>           <none>

Anything else?

No response

ThreadDao commented 1 month ago

Another affected instance: level-zero-insert-op-40-4759

yanliang567 commented 1 month ago

/assign @XuanYang-cn
/unassign

XuanYang-cn commented 1 month ago

/assign @ThreadDao
/unassign
Please help verify.

XuanYang-cn commented 1 month ago

/assign

ThreadDao commented 1 month ago

Fixed in 2.4-20240711-1d2062a6-amd64.

ThreadDao commented 1 month ago

@XuanYang-cn It seems there is still a problem with deletes in the upsert scenario.
Image: 2.4-20240711-86b57b78-amd64
See the metrics of level-zero-upsert-op-53-4572.

czs007 commented 1 month ago

@ThreadDao Please verify using the latest 2.4 branch.

ThreadDao commented 1 month ago

Fixed in 2.4-20240716-dfb41582-amd64.