milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: After doing concurrent searching, inserting and deleting for a period of time, the count(*) results decreased #31739

Closed ThreadDao closed 6 months ago

ThreadDao commented 6 months ago

Is there an existing issue for this?

Environment

- Milvus version: cardinal-milvus-io-2.3-ab059bb-20240322
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): pulsar   
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus 2.3.7rc7
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

  1. The count(*) results before test:

    utility.list_collections()
    ['laion_stable_4']
    c = Collection(name='laion_stable_4')
    c.query('id >=0', output_fields=["count(*)"])
    [{'count(*)': 88117851}]
    c.query('id >= 139900000', output_fields=["count(*)"])
    [{'count(*)': 4467179}]
    c.query('id >= 140000000', output_fields=["count(*)"])
    [{'count(*)': 4466921}]
    c.query('id >= 145000000', output_fields=["count(*)"])
    [{'count(*)': 2662634}]
    c.query('id >= 146000000', output_fields=["count(*)"])
    [{'count(*)': 2226044}]
    c.query('id >= 150000000', output_fields=["count(*)"])
    [{'count(*)': 84721}]
    c.query('id >= 151000000', output_fields=["count(*)"])
    [{'count(*)': 0}]
    c.query('id >= 151000000', output_fields=["count(*)"])
    [{'count(*)': 3838705}]
  2. Test: reload the collection, then run insert, delete, and search concurrently. Each locust user inserts 100 entities and each locust user deletes 100 entities. Inserted ids start from 151000000 and increase sequentially; deleted ids also start from 151000000 and increase sequentially.

  3. The final result of the locust concurrent test: the number of insert requests is 47650 and the number of delete requests is 48070. Given that each insert request inserts 200 entities and each delete request deletes 100 entities, the total amount of data should theoretically increase by 200 * 47650 - 100 * 48070 = 4,723,000. But count(*) returns 85,669,283, which is 2,448,568 fewer than the 88,117,851 counted before the test. In other words, at least 2,448,568 entities are missing?

    c.name
    'laion_stable_4'
    c.query('id >= 0', output_fields=["count(*)"], consistency_level="Strong")
    [{'count(*)': 85669283}]
    [2024-03-29 06:00:21,479 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
    [2024-03-29 06:00:21,479 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
    [2024-03-29 06:00:21,479 -  INFO - fouram]: grpc     delete                                                                         48070     0(0.00%) |   4379       9   16405   3900 |    4.45        0.00 (stats.py:789)
    [2024-03-29 06:00:21,479 -  INFO - fouram]: grpc     insert                                                                         47650     0(0.00%) |   5084     907   17238   4600 |    4.41        0.00 (stats.py:789)
    [2024-03-29 06:00:21,479 -  INFO - fouram]: grpc     search                                                                         21591     0(0.00%) |   3930     267   11644   4000 |    2.00        0.00 (stats.py:789)
    [2024-03-29 06:00:21,480 -  INFO - fouram]:          Aggregated                                                                    117311     0(0.00%) |   4583       9   17238   4200 |   10.86        0.00 (stats.py:789)
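
For reference, the arithmetic behind step 3 can be checked with a short script. This is only a sketch using the figures quoted in this report (200 entities per insert request, 100 per delete request, and the two `count(*)` results). It assumes every delete request removed 100 entities that actually existed and that no inserted id was a duplicate; the variable and function names are illustrative and not part of any Milvus or locust API.

```python
# Figures copied from this report.
COUNT_BEFORE = 88_117_851      # count(*) for 'id >= 0' before the test
COUNT_AFTER = 85_669_283       # count(*) for 'id >= 0' after the test (Strong)
INSERT_REQUESTS = 47_650       # locust "insert" request count
DELETE_REQUESTS = 48_070       # locust "delete" request count
ENTITIES_PER_INSERT = 200      # value used in the step 3 calculation
ENTITIES_PER_DELETE = 100      # value used in the step 3 calculation


def expected_net_change(inserts: int, per_insert: int,
                        deletes: int, per_delete: int) -> int:
    """Net entity change, assuming every delete hit an existing entity."""
    return inserts * per_insert - deletes * per_delete


net = expected_net_change(INSERT_REQUESTS, ENTITIES_PER_INSERT,
                          DELETE_REQUESTS, ENTITIES_PER_DELETE)
expected_count = COUNT_BEFORE + net            # count we would expect to see
observed_change = COUNT_AFTER - COUNT_BEFORE   # what count(*) actually did
shortfall = expected_count - COUNT_AFTER       # gap between expected and seen

print(f"expected net change: {net:+,}")
print(f"expected count:      {expected_count:,}")
print(f"observed change:     {observed_change:+,}")
print(f"shortfall:           {shortfall:,}")
```

Under these assumptions the observed count is below the pre-test baseline, and the gap to the expected count is larger still, because the expected increase adds to the drop.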

Expected Behavior

No response

Steps To Reproduce

- argo: https://argo-workflows.zilliz.cc/archived-workflows/qa/c1e45c0a-174e-4518-9af3-a9f2eb62ac3c?nodeId=laion1b-test-cron-to-100m

Milvus Log

Anything else?

No response

xiaofan-luan commented 6 months ago

/assign @aoiasd

aoiasd commented 6 months ago

This may be related to https://github.com/milvus-io/milvus/issues/31548: the count did not decrease, it was inflated and then recovered. The wrong count result was the one taken before the test; the count may have recovered after the reload, which makes it seem as if the data count decreased.