milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: Duplicate data is automatically added #34121

Closed SunilWang closed 3 months ago

SunilWang commented 3 months ago

Is there an existing issue for this?

Environment

- Milvus version:2.4.5
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2): "@zilliz/milvus2-sdk-node": "^2.4.3",
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: 80 CPU cores, 128 GB
- GPU: 3090
- Others:

Current Behavior

Two things can happen:

  1. After I inserted 256 rows with the Node.js SDK, deleted all of them, and inserted the same data again, the collection contained many duplicate rows.
  2. Even stranger: I created a collection with the same schema as other collections in the same database, and as soon as it was created, new rows kept appearing, even though I was not inserting anything.

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

github-actions[bot] commented 3 months ago

The title and description of this issue contain Chinese. Please use English to describe your issue.

xiaofan-luan commented 3 months ago

could you just show your code?

xiaofan-luan commented 3 months ago

My guess is that you have multiple processes inserting into the same collection without being aware of it.

yanliang567 commented 3 months ago

@SunilWang was this Milvus instance newly created? Could you please refer to this doc to export the complete Milvus logs for investigation? For Milvus installed with docker-compose, you can use docker-compose logs > milvus.log to export the logs. Sharing the names of the affected collections would also help us track down the issue.

/assign @SunilWang

SunilWang commented 3 months ago

@yanliang567 I used the visualization tool zilliz/attu to create the Collection

SunilWang commented 3 months ago

@xiaofan-luan Even after my program has stopped, the number of rows keeps increasing.

async function handleOne(task: AigcTaskList) {
    try {
        const generateImgs = task.generateImgs
        const rowData = []

        for (let i = 0; i < generateImgs.length; i++) {
            const imgUrl = generateImgs[i]
            // task.generateVideos

            const { data } = await milvusClient.query({
                collection_name: collection_name,
                filter: `imgUrl == "${imgUrl}"`,
                output_fields: ['id', 'vector', 'taskId', 'taskIndex', 'imgUrl', 'video'],
                // output_fields: ['id', 'vector', 'imgUrl'],
            })

            if(data.length > 0){
                continue
            }

            const vector = await getImageVector(imgUrl)

            const res = await milvusClient.insert({
                collection_name: collection_name,
                data: [{
                    id: `${task.id}_${i}`,
                    taskId: task.id,
                    vector,
                    imgUrl,
                    taskIndex: i,
                    video: '',
                }],
            })
        }
    }catch (error){
        console.log(error)
    }
}

yanliang567 commented 3 months ago

> @yanliang567 I used the visualization tool zilliz/attu to create the Collection

How did you deploy Milvus? Please also try to export and share the logs.

SunilWang commented 3 months ago

@yanliang567

[2024/06/25 03:27:21.021 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:27:21.021 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=e33a36544b85782d25206d2be5357caf] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700520451211269}]"]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450612404343019873] [indexName=] [timestamp=0]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545230] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545314] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450612404343020293] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450612404343019873] [indexName=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450697823444118235] [indexName=] [timestamp=0]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118802] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118430] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450697823444118235] [indexName=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450612404344463315] [indexName=] [timestamp=0]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404344463315] [indexID=450612404344463454] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450612404344463315] [indexName=]
[2024/06/25 03:27:23.177 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [traceID=abea55c8e6f7dc601c5f379ade63f0c5] [collectionID=450697823444118235] [partitionID=450697823444118236] [channelName=by-dev-rootcoord-dml_2_450697823444118235v0] [count=1] ["segment level"=Legacy]
[2024/06/25 03:27:23.177 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:27:23.177 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=abea55c8e6f7dc601c5f379ade63f0c5] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700521014820870}]"]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450612404343019873] [indexName=] [timestamp=0]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450612404343020293] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545230] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545314] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450612404343019873] [indexName=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450612404344463315] [indexName=] [timestamp=0]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404344463315] [indexID=450612404344463454] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450612404344463315] [indexName=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450697823444118235] [indexName=] [timestamp=0]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118430] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118802] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450697823444118235] [indexName=]
[2024/06/25 03:27:25.432 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [traceID=7ff69fb866f2be9e53a802c978dc51ee] [collectionID=450697823444118235] [partitionID=450697823444118236] [channelName=by-dev-rootcoord-dml_2_450697823444118235v0] [count=1] ["segment level"=Legacy]
[2024/06/25 03:27:25.433 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:27:25.433 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=7ff69fb866f2be9e53a802c978dc51ee] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700521604907013}]"]
[2024/06/25 03:27:25.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=7e46127d7b578fa4d7954505cc328808] [collectionID=450612404344463315]
[2024/06/25 03:27:25.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=7ce0a70d1d9bb19cd2b49a8e24a364cc] [collectionID=450612404344463315]
[2024/06/25 03:27:25.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=89fd0cd6710bae455952a5f77f2c83f6] [collectionID=450697823444118235]
[2024/06/25 03:27:25.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=1cd35e0c15d761d84d74fb05f388d323] [collectionID=450612404343019873]
[2024/06/25 03:27:25.924 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=6942d0e9237537ab322c870bdcca769e] [collectionID=450612404344463315] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286035924679317]
[2024/06/25 03:27:25.924 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286025925894218] [newVersion=1719286035924679317] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [observers/target_observer.go:493] ["observer trigger update current target"] [collectionID=450612404344463315]
[2024/06/25 03:27:25.925 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=d3d725d6ab3d842e9286c95808cd2457] [collectionID=450612404343019873] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286035924743885]
[2024/06/25 03:27:25.925 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286025925844734] [newVersion=1719286035924743885] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=640f7187fefe819280a49eec08ea8f5c] [collectionID=450612404344463315] [partitionIDs="[]"]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450612404344463315] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [posTs=450700517082660866] [posTime=2024/06/25 03:27:10.131 +00:00]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=640f7187fefe819280a49eec08ea8f5c] [collectionID=450612404344463315] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=718809d78e1a97f8580458949f7c53c8] [collectionID=450697823444118235] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286035924849768]
[2024/06/25 03:27:25.925 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286025925918709] [newVersion=1719286035924849768] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=58469440cfb6716568516a0a1cf71a52] [collectionID=450612404343019873] [partitionIDs="[]"]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450612404343019873] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [posTs=450700517082660866] [posTime=2024/06/25 03:27:10.131 +00:00]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=58469440cfb6716568516a0a1cf71a52] [collectionID=450612404343019873] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=b98a71d4f3b8cb61dddce3c4dd7f37ef] [collectionID=450697823444118235]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=c26d22bc65eba4708c1b859df1ee3e36] [collectionID=450697823444118235] [partitionIDs="[]"]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450697823444118235] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:27:25.926 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [posTs=450700484668817410] [posTime=2024/06/25 03:25:06.482 +00:00]
[2024/06/25 03:27:25.926 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=c26d22bc65eba4708c1b859df1ee3e36] [collectionID=450697823444118235] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:27:25.927 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=0778caed8f23fea228e6759891e72fb0] [collectionID=450612404343019873]
[2024/06/25 03:27:27.723 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [traceID=3f8ed8088194cae1be8a957f88f50e93] [collectionID=450697823444118235] [partitionID=450697823444118236] [channelName=by-dev-rootcoord-dml_2_450697823444118235v0] [count=1] ["segment level"=Legacy]
[2024/06/25 03:27:27.724 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:27:27.724 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=3f8ed8088194cae1be8a957f88f50e93] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700522207576069}]"]

SunilWang commented 3 months ago

[2024/06/25 03:29:04.703 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:29:04.703 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=cb73a5b4728f0032095060028d6d5778] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700547635806213}]"]
[2024/06/25 03:29:05.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=5413d7cc64f959bc496467442fc9f85c] [collectionID=450612404344463315]
[2024/06/25 03:29:05.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=2525a6ac820def73cdc3e336a10933d8] [collectionID=450612404344463315]
[2024/06/25 03:29:05.925 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=094b5eaffc38e26fbc1c4c9f19b89fc6] [collectionID=450612404344463315] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286135925850912]
[2024/06/25 03:29:05.925 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286125926009796] [newVersion=1719286135925850912] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:29:05.925 +00:00] [INFO] [observers/target_observer.go:493] ["observer trigger update current target"] [collectionID=450612404344463315]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=ccc5c6bb41d15870190ce77b6e1930ca] [collectionID=450697823444118235]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=10b08654249cc3493eb1136b6b16ad9f] [collectionID=450612404343019873]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=448e9aa90174d59f4949f19701b46d05] [collectionID=450612404344463315] [partitionIDs="[]"]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450612404344463315] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [posTs=450700532837515266] [posTime=2024/06/25 03:28:10.231 +00:00]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=448e9aa90174d59f4949f19701b46d05] [collectionID=450612404344463315] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=96010468c384314b47a3b627c39590ef] [collectionID=450697823444118235] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286135925395773]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=0b23b9e2e46066465620c92406651691] [collectionID=450697823444118235]
[2024/06/25 03:29:05.926 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286125926088090] [newVersion=1719286135925395773] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=7dd0718ff12534065923ef3648403371] [collectionID=450612404343019873] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286135925790361]
[2024/06/25 03:29:05.926 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286125925845758] [newVersion=1719286135925790361] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=1aa3ec0ad2c5fcec5fe69448f89ebb9d] [collectionID=450612404343019873] [partitionIDs="[]"]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450612404343019873] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [posTs=450700532837515266] [posTime=2024/06/25 03:28:10.231 +00:00]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=1f2d93777c8f72df2d699d78b3ff1ea6] [collectionID=450697823444118235] [partitionIDs="[]"]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=1aa3ec0ad2c5fcec5fe69448f89ebb9d] [collectionID=450612404343019873] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450697823444118235] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [posTs=450700484668817410] [posTime=2024/06/25 03:25:06.482 +00:00]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=1f2d93777c8f72df2d699d78b3ff1ea6] [collectionID=450697823444118235] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:29:05.927 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=630c603f6a4b0cbe80753c2edcdcc5ff] [collectionID=450612404343019873]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450612404343019873] [indexName=] [timestamp=0]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450612404343020293] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545230] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545314] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450612404343019873] [indexName=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450697823444118235] [indexName=] [timestamp=0]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118430] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118802] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450697823444118235] [indexName=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450612404344463315] [indexName=] [timestamp=0]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404344463315] [indexID=450612404344463454] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450612404344463315] [indexName=]
[2024/06/25 03:29:06.949 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [traceID=adf7ff55d3cde56f2acc1624bcd7724f] [collectionID=450697823444118235] [partitionID=450697823444118236] [channelName=by-dev-rootcoord-dml_2_450697823444118235v0] [count=1] ["segment level"=Legacy]
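
Excerpts like the ones above can be checked mechanically. The following is a minimal sketch, not part of the original report, that counts "handle assign segment request" entries per collectionID. It assumes the bracketed log format shown above; the function name and the shortened sample lines are illustrative only. In the excerpts, such entries for collection 450697823444118235 appear every two to three seconds. If these requests are issued on inserts, which is an assumption here, a steady stream of them while the client is stopped would point to another writer.

```typescript
// Count "handle assign segment request" entries per collection in a Milvus log.
// Assumed line shape: [time] [LEVEL] [file:line] ["message"] [key=value] ...
function countAssignRequests(logText: string): Map<string, number> {
  // collectionID is kept as a string: the values exceed Number.MAX_SAFE_INTEGER.
  const counts = new Map<string, number>();
  for (const line of logText.split("\n")) {
    if (line.indexOf('["handle assign segment request"]') === -1) continue;
    const match = /\[collectionID=(\d+)\]/.exec(line);
    if (!match) continue;
    const id = match[1];
    counts.set(id, (counts.get(id) ?? 0) + 1);
  }
  return counts;
}

// Shortened sample lines in the same shape as the excerpts above.
const sampleLog = [
  '[2024/06/25 03:27:23.177 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [collectionID=450697823444118235] [count=1]',
  '[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [collectionID=450697823444118235] [indexName=]',
  '[2024/06/25 03:27:25.432 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [collectionID=450697823444118235] [count=1]',
].join("\n");

countAssignRequests(sampleLog).forEach((n, id) => {
  console.log(`collection ${id}: ${n} assign request(s)`);
});
```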

SunilWang commented 3 months ago

Could my consistency level have something to do with it? I am using the default: Bounded.
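
On the consistency question: with Bounded consistency a query may not yet see rows written very recently, so the query-then-insert check in handleOne is not a reliable uniqueness guard on its own. Below is a minimal sketch of an additional in-process guard. `SeenGuard` and `insertOnce` are hypothetical names, and `insertFn` is a stand-in for whatever call actually writes to Milvus. The guard only helps within a single process, and it would not explain rows whose IDs differ from the ones the program generates.

```typescript
// In-process guard: remember keys already written so that a stale read
// cannot lead this process to insert the same key twice.
class SeenGuard {
  private seen = new Set<string>();

  // Returns true the first time a key is claimed, false afterwards.
  claim(key: string): boolean {
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }

  // Forget a key again, e.g. when the write failed and should be retried.
  release(key: string): void {
    this.seen.delete(key);
  }
}

// insertFn is a placeholder for the real write call (hypothetical signature).
async function insertOnce(
  guard: SeenGuard,
  key: string,
  insertFn: (key: string) => Promise<void>,
): Promise<boolean> {
  if (!guard.claim(key)) return false; // already written by this process
  try {
    await insertFn(key);
    return true;
  } catch (error) {
    guard.release(key); // allow a later retry
    throw error;
  }
}
```

Using imgUrl as the key would mirror the filter that handleOne already applies before inserting.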

SunilWang commented 3 months ago

In the duplicate rows, the vector values differ from one another.

The field type is Float Vector (512).

Vector value: [-0.11212158203125,-0.408447265625,-0.257568359375,0.2296142578125,0.04913330078125,0.2054443359375,0.60546875,0.1519775390625,0.42236328125,-0.1748046875,0.418212890625,-0.325439453125,0.125732421875, ...]

Why does a single insert produce multiple rows with different vectors?

SunilWang commented 3 months ago

image

I can confirm that I inserted only one piece of data

SunilWang commented 3 months ago

image

yanliang567 commented 3 months ago

@SunilWang could you please set the log level to debug, reproduce the issue, and export the full Milvus logs as described above?

SunilWang commented 3 months ago

@yanliang567 How can I send the logs to you?

yanliang567 commented 3 months ago

If the file is large, you could share it via a cloud drive, or email it to me (yanliang.qiao@zilliz.com).

SunilWang commented 3 months ago

# Download the installation script
$ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh

# Start the Docker container
$ bash standalone_embed.sh start

I deployed with this script. How do I change the log level?

yanliang567 commented 3 months ago

> # Download the installation script
> $ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh
>
> # Start the Docker container
> $ bash standalone_embed.sh start
>
> I am deploying by script, how do I change the log level?

Use docker logs and export the output to a file.

SunilWang commented 3 months ago

> use docker logs export to a file

The log has been sent to your email address. Please check.

LoveEachDay commented 3 months ago

You can use the following RESTful request to change the log level to info:

curl -X PUT -H "Content-Type: application/json" localhost:9091/log/level -d '{"level": "info"}'

then verify the log level changes:

curl -i http://localhost:9091/log/level

SunilWang commented 3 months ago

> You can use the following restful request to change the log level to info:
>
> curl -X PUT -H "Content-Type: application/json" localhost:9091/log/level -d '{"level": "info"}'
>
> then verify the log level changes:
>
> curl -i http://localhost:9091/log/level

The log level is already info by default. Do I need to change it?

xiaofan-luan commented 3 months ago

> image
>
> I can confirm that I inserted only one piece of data

Both the IDs and the vectors are different. I do believe you need to check your deployment.

xiaofan-luan commented 3 months ago

Unless there are more clues, I don't think this is a Milvus issue; please investigate on your side. You can enable the access log, where you should see multiple inserts. The monitoring system should also show that.

SunilWang commented 3 months ago

> both ids and vectors are different.. I do believe you will need to check your deployment.

# Download the installation script
$ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh

# Start the Docker container
$ bash standalone_embed.sh start

I deployed with this script, so there can only be one node. I also connect by IP and port: 10.253.xxx.xxx:19530

xiaofan-luan commented 3 months ago

Again, this is not related to Milvus or to the way Milvus is deployed. Please read your code carefully and check how you write to Milvus, especially how many times you call the handleOne function.

SunilWang commented 3 months ago

> Again, this is not related to Milvus or to the way Milvus is deployed. Please read your code carefully and check how you write to Milvus, especially how many times you call the handleOne function.

I added console logging, and I can confirm that handleOne saved each vector only once.

xiaofan-luan commented 3 months ago

You can add a log statement on each line and check:

  1. How many images do you have?
  2. How many times did you call milvusClient.insert?
  3. If you stop all the tests, do you still see any insertions into Milvus? Does the entity count still increase?

I guess you just need some patience and debugging. There is no magic here.
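
The checks suggested above can be made concrete by wrapping the insert call in a counter. This is a minimal sketch under stated assumptions: `InsertClient`, `InsertStats`, and `withInsertCounter` are hypothetical names and not part of the Milvus SDK. The request shape (collection_name, data) simply mirrors the insert call in the code posted earlier.

```typescript
// Minimal shape of the client being wrapped; a stand-in, not the real SDK type.
interface InsertClient {
  insert(req: { collection_name: string; data: object[] }): Promise<unknown>;
}

interface InsertStats {
  calls: number; // how many times insert() was invoked
  rows: number;  // total rows passed across all calls
}

// Returns a client that forwards to `inner` and records call and row counts.
function withInsertCounter(inner: InsertClient, stats: InsertStats): InsertClient {
  return {
    async insert(req) {
      stats.calls += 1;
      stats.rows += req.data.length;
      console.log(
        `insert #${stats.calls}: ${req.data.length} row(s) into ${req.collection_name}`,
      );
      return inner.insert(req);
    },
  };
}
```

If `stats.calls` stays at the expected number while the entity count in Milvus keeps growing, the extra rows are coming from somewhere other than this code path.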

xiaofan-luan commented 3 months ago

256 rows is a very small amount of data; it does not even trigger compaction.

SunilWang commented 3 months ago

> You can add a log statement on each line and check:
>
>   1. How many images do you have?
>   2. How many times did you call milvusClient.insert?
>   3. If you stop all the tests, do you still see any insertions into Milvus? Does the entity count still increase?
>
> I guess you just need some patience and debugging. There is no magic here.

milvusClient.insert is indeed called only 256 times. I have sent the complete log to yanliang.qiao@zilliz.com.

yanliang567 commented 3 months ago

@SunilWang let's double-check a few things:

  1. Do you mean that after you stop the client insert scripts, the number of entities in the collection is still growing? How did you observe the increase?
  2. Which collections have this issue? Please share their names.
  3. When did you start the insert script, and when did you stop it? Please share a rough timeline.

SunilWang commented 3 months ago

> @SunilWang let's double-check a few things:
>
>   1. Do you mean that after you stop the client insert scripts, the number of entities in the collection is still growing? How did you observe the increase?
>   2. Which collections have this issue? Please share their names.
>   3. When did you start the insert script, and when did you stop it? Please share a rough timeline.

  1. We can see the data growing in the visualization tool Attu.
  2. Database: testMaLiangVault. Collection: testMaLiangVault.
  3. I don't remember the exact time; it was around June 25, 2024, from 2 p.m. to 6 p.m.

image

yanliang567 commented 3 months ago

I did not find the collection name testMaLiangVault, and I am not able to reproduce the issue in-house.

SunilWang commented 3 months ago

> I did not find the collection name testMaLiangVault, and i am not able to reproduce the issue in house.

Thank you very much for your technical support during this time.

My workaround is to clean up duplicate data with a scheduled task, which solves the problem for now.

You can close the current issue.
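
The scheduled cleanup mentioned above is not shown in the thread. One way such a job could choose rows to delete is sketched below, assuming the rows have already been fetched with their id and imgUrl fields (both field names come from the code posted earlier). `findDuplicateIds` is a hypothetical helper, and keeping the first row seen per imgUrl is an arbitrary choice made for illustration.

```typescript
interface Row {
  id: string;
  imgUrl: string;
}

// Keep the first row seen for each imgUrl and return the ids of the rest.
function findDuplicateIds(rows: Row[]): string[] {
  const seen = new Set<string>();
  const duplicates: string[] = [];
  for (const row of rows) {
    if (seen.has(row.imgUrl)) {
      duplicates.push(row.id);
    } else {
      seen.add(row.imgUrl);
    }
  }
  return duplicates;
}

// Illustrative rows only; the ids and URLs are made up.
const sampleRows: Row[] = [
  { id: "7_0", imgUrl: "a.png" },
  { id: "7_1", imgUrl: "b.png" },
  { id: "x_9", imgUrl: "a.png" },
];

// Ids that a cleanup job would delete.
console.log(findDuplicateIds(sampleRows));
```

The returned ids could then be handed to a delete request that filters on the primary key. Such a job treats the symptom only; it does not identify where the extra rows come from.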