milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: Collection with TTL doesn't work #21802

Closed ThomasAlxDmy closed 1 year ago

ThomasAlxDmy commented 1 year ago

Environment

- Milvus version: 2.2.2
- Deployment mode(standalone or cluster): cluster AWS (not on K8s)
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus-2.1.0.dev98
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: c5ad.8xlarge => 32 VCPU, 64GB
- GPU: 
- Others:

Current Behavior

Creating a collection with a TTL does not expire any items. The collection should hold around 12 million items, but it now holds 64 million and is still growing after 12 hours.

collection = Collection(name=collection_name, shards_num=30, schema=schema, properties={"collection.ttl.seconds": 10800})

This creates the collection, but it does not seem to have been created with a TTL.

Setting the TTL afterwards fails:

collection = Collection(collection_name)
collection.set_properties(properties={"collection.ttl.seconds": 10800})
Traceback (most recent call last):
  File "create_collection.py", line 57, in <module>
    collection.set_properties(properties={"collection.ttl.seconds": 10800})
AttributeError: 'Collection' object has no attribute 'set_properties'

yanliang567 commented 1 year ago

@ThomasAlxDmy I think you need to upgrade pymilvus to 2.2.1, as TTL is a new feature introduced in pymilvus 2.2.0.

/assign @ThomasAlxDmy

xiaofan-luan commented 1 year ago

Exactly. TTL only works with the newest SDK and Milvus 2.2.0+. We highly recommend testing with the latest SDK.
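The version requirement mentioned in the two comments above can be checked before creating the collection. The following is a minimal sketch, assuming the 2.2.0 threshold stated in this thread; the helper names `parse_version`, `sdk_supports_ttl` and `installed_pymilvus_version` are hypothetical and not part of pymilvus.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Assumption taken from the comments above: TTL needs pymilvus 2.2.0 or newer.
MIN_TTL_VERSION = (2, 2, 0)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Return the leading (major, minor, patch) of a string such as '2.1.0.dev98'.

    Pre-release suffixes like '.dev98' are ignored, so a 2.2.0 pre-release
    is treated the same as 2.2.0 itself.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def sdk_supports_ttl(version: str) -> bool:
    """True when the given SDK version is at least MIN_TTL_VERSION."""
    return parse_version(version) >= MIN_TTL_VERSION


def installed_pymilvus_version() -> Optional[str]:
    """Return the installed pymilvus version, or None if it is not installed."""
    try:
        return metadata.version("pymilvus")
    except metadata.PackageNotFoundError:
        return None


version = installed_pymilvus_version()
if version is None:
    print("pymilvus is not installed")
else:
    print(version, "supports TTL:", sdk_supports_ttl(version))
```

With the SDK version from the original report, `sdk_supports_ttl("2.1.0.dev98")` evaluates to `False`, which is consistent with the `AttributeError` on `set_properties` shown in the traceback.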

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

ThomasAlxDmy commented 1 year ago

Hey all, I looked into it again with pymilvus==2.2.3 and the problem is the same. I created a new collection 7 days ago and I see Approx Entity Count: 346,132,883. I have not inserted any data into it for the last 3 days, yet the collection does not shrink.

Can I see the ttl properties on the collection somewhere?
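One possible way to check this, offered as a sketch rather than a confirmed answer from the thread: newer pymilvus releases appear to include a `properties` entry in the dict returned by `Collection.describe()`. Whether the version in use here does so is an assumption to verify. The helper `ttl_seconds` below is hypothetical and only reads such a dict.

```python
from typing import Any, Mapping, Optional

# Property key used when the collection was created in this thread.
TTL_KEY = "collection.ttl.seconds"


def ttl_seconds(description: Mapping[str, Any]) -> Optional[int]:
    """Extract the TTL from a describe()-style dict, or None if it is absent.

    Assumption: the dict carries a 'properties' mapping; the value may be
    stored as a string, so it is converted to int.
    """
    properties = description.get("properties") or {}
    raw = properties.get(TTL_KEY)
    return None if raw is None else int(raw)


# Hypothetical usage against a live collection (requires a running Milvus):
# from pymilvus import Collection
# print(ttl_seconds(Collection("embeddings_mini_lm_l6v2_short").describe()))
```

If the function returns `None`, either the TTL was not stored on the collection or the installed SDK does not expose properties in this way.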

I'm going to run a new experiment with a 24h TTL collection:

python3 tools/ingest/create_collection.py
Creating collection: embeddings_mini_lm_l6v2_short with properties:{'collection.ttl.seconds': 86400}
Success!
index creation Status(code=0, message=)

xiaofan-luan commented 1 year ago

Can you trigger a compaction manually and see if it works? My guess is that we need a better estimate of the expired data size to trigger data expiration.

filip-halt commented 1 year ago

Just checking in to see if you are still running into this issue.

ThomasAlxDmy commented 1 year ago

Are you recommending calling collection.compact in Python?

filip-halt commented 1 year ago

Yes
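A minimal sketch of the manual compaction suggested above, assuming a pymilvus `Collection` that provides `compact()`, `wait_for_compaction_completed()` and `num_entities`. The helper name `compact_and_report` is hypothetical, and the thread does not confirm how quickly the entity count reflects expired data.

```python
def compact_and_report(collection, timeout=None):
    """Trigger a manual compaction and return the entity count before and after.

    `collection` is expected to behave like a pymilvus Collection.
    """
    before = collection.num_entities
    collection.compact(timeout=timeout)
    # Block until the server reports that the compaction has finished.
    collection.wait_for_compaction_completed(timeout=timeout)
    after = collection.num_entities
    return before, after


# Hypothetical usage (requires a running Milvus; host and port are placeholders):
# from pymilvus import Collection, connections
# connections.connect("default", host="localhost", port="19530")
# before, after = compact_and_report(Collection("embeddings_mini_lm_l6v2_short"))
# print(f"entities before: {before}, after: {after}")
```

If the count does not drop after compaction even though the TTL has elapsed, that would point to the expiration issue discussed in this thread rather than to a missing compaction trigger.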

xiaofan-luan commented 1 year ago

This issue should be fixed in the latest Milvus release.

ThomasAlxDmy commented 1 year ago

Confirmed, it's fixed!