milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: Data race in Clustering Compaction integration test #34523

Closed by XuanYang-cn 1 week ago

XuanYang-cn commented 2 months ago

Is there an existing issue for this?

Environment

- Milvus version:
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

==================
WARNING: DATA RACE
Write at 0x00c001aa5b60 by goroutine 433:
  github.com/milvus-io/milvus/internal/datacoord.(*clusteringCompactionTask).updateAndSaveTaskMeta()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction_task_clustering.go:486 +0x224
  github.com/milvus-io/milvus/internal/datacoord.(*clusteringCompactionTask).SetNodeID()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction_task_clustering.go:551 +0x89
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).assignNodeIDs()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:600 +0x37a
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).checkCompaction()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:624 +0x235
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).loopCheck()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:358 +0x304
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).start.func2()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:266 +0x33

Previous read at 0x00c001aa5b60 by goroutine 432:
  github.com/milvus-io/milvus/internal/datacoord.(*clusteringCompactionTask).GetPlanID()
      <autogenerated>:1 +0x31
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).doSchedule()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:322 +0x1db
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).loopSchedule()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:341 +0x16d
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).start.func1()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:265 +0x33

Goroutine 433 (running) created at:
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).start()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:266 +0x125
  github.com/milvus-io/milvus/internal/datacoord.(*Server).startDataCoord()
      /home/yangxuan/Github/milvus/internal/datacoord/server.go:402 +0xa1
  github.com/milvus-io/milvus/internal/datacoord.(*Server).Start()
      /home/yangxuan/Github/milvus/internal/datacoord/server.go:392 +0x46
  github.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).start()
      /home/yangxuan/Github/milvus/internal/distributed/datacoord/service.go:214 +0x1f0
  github.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).Run()
      /home/yangxuan/Github/milvus/internal/distributed/datacoord/service.go:261 +0x55
  github.com/milvus-io/milvus/tests/integration.(*MiniClusterV2).Start()
      /home/yangxuan/Github/milvus/tests/integration/minicluster_v2.go:325 +0xad
  github.com/milvus-io/milvus/tests/integration.(*MiniClusterSuite).SetupTest()
      /home/yangxuan/Github/milvus/tests/integration/suite.go:126 +0x4d9
  github.com/milvus-io/milvus/tests/integration/compaction.(*ClusteringCompactionSuite).SetupTest()
      <autogenerated>:1 +0x31
  github.com/stretchr/testify/suite.Run.func1()
      /home/yangxuan/go/pkg/mod/github.com/stretchr/testify@v1.9.0/suite/suite.go:192 +0x2d8
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x261
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Goroutine 432 (running) created at:
  github.com/milvus-io/milvus/internal/datacoord.(*compactionPlanHandler).start()
      /home/yangxuan/Github/milvus/internal/datacoord/compaction.go:265 +0xbc
  github.com/milvus-io/milvus/internal/datacoord.(*Server).startDataCoord()
      /home/yangxuan/Github/milvus/internal/datacoord/server.go:402 +0xa1
  github.com/milvus-io/milvus/internal/datacoord.(*Server).Start()
      /home/yangxuan/Github/milvus/internal/datacoord/server.go:392 +0x46
  github.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).start()
      /home/yangxuan/Github/milvus/internal/distributed/datacoord/service.go:214 +0x1f0
  github.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).Run()
      /home/yangxuan/Github/milvus/internal/distributed/datacoord/service.go:261 +0x55
  github.com/milvus-io/milvus/tests/integration.(*MiniClusterV2).Start()
      /home/yangxuan/Github/milvus/tests/integration/minicluster_v2.go:325 +0xad
  github.com/milvus-io/milvus/tests/integration.(*MiniClusterSuite).SetupTest()
      /home/yangxuan/Github/milvus/tests/integration/suite.go:126 +0x4d9
  github.com/milvus-io/milvus/tests/integration/compaction.(*ClusteringCompactionSuite).SetupTest()
      <autogenerated>:1 +0x31
  github.com/stretchr/testify/suite.Run.func1()
      /home/yangxuan/go/pkg/mod/github.com/stretchr/testify@v1.9.0/suite/suite.go:192 +0x2d8
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x261
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44
==================
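
The report above boils down to two goroutines started by `compactionPlanHandler.start()`. One (`loopCheck` → `assignNodeIDs` → `SetNodeID` → `updateAndSaveTaskMeta`) writes task state, while the other (`loopSchedule` → `doSchedule` → `GetPlanID`) reads the same address without synchronization. The `<autogenerated>` frame for `GetPlanID` suggests a method promoted from an embedded field, so the write is probably a replacement of that embedded value. The Go sketch below is a minimal model of that pattern with one possible fix, a `sync.RWMutex`. `taskMeta`, `task`, and their fields are hypothetical stand-ins, not the actual Milvus types.

```go
package main

import (
	"fmt"
	"sync"
)

// taskMeta is a hypothetical stand-in for the task's metadata.
type taskMeta struct {
	PlanID int64
	NodeID int64
}

// task is a hypothetical stand-in for a compaction task whose metadata
// pointer is replaced on update while another goroutine reads it.
type task struct {
	mu   sync.RWMutex
	meta *taskMeta
}

// GetPlanID reads the metadata under a read lock.
func (t *task) GetPlanID() int64 {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.meta.PlanID
}

// GetNodeID reads the metadata under a read lock.
func (t *task) GetNodeID() int64 {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.meta.NodeID
}

// SetNodeID copies the metadata, updates the copy, and swaps the pointer
// under the write lock, so the swap never overlaps a reader.
func (t *task) SetNodeID(id int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	updated := *t.meta
	updated.NodeID = id
	t.meta = &updated
}

func main() {
	tk := &task{meta: &taskMeta{PlanID: 42}}

	var wg sync.WaitGroup
	wg.Add(2)
	// Models the loopCheck goroutine, which assigns node IDs.
	go func() {
		defer wg.Done()
		for i := int64(1); i <= 1000; i++ {
			tk.SetNodeID(i)
		}
	}()
	// Models the loopSchedule goroutine, which reads the plan ID.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = tk.GetPlanID()
		}
	}()
	wg.Wait()

	fmt.Println(tk.GetPlanID(), tk.GetNodeID()) // prints: 42 1000
}
```

Running this model with `go run -race` should report nothing. Dropping the lock calls from `GetPlanID` would be expected to produce a report with the same write/read shape as the one above.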

Expected Behavior

No data race should be reported when the clustering compaction integration test runs under the race detector.
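
One lock-free way to get a race-free task, sketched here under the assumption that the racing field is a pointer that gets replaced wholesale, is to publish immutable snapshots through `atomic.Pointer` (Go 1.19+). `snapshot`, `atomicTask`, and `newAtomicTask` are hypothetical names used only for illustration, not Milvus code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// snapshot is a hypothetical immutable view of the task metadata.
type snapshot struct {
	PlanID int64
	NodeID int64
}

// atomicTask is a hypothetical task that publishes metadata snapshots
// through an atomic pointer instead of guarding them with a mutex.
type atomicTask struct {
	meta atomic.Pointer[snapshot]
}

// newAtomicTask stores an initial snapshot so Load never returns nil.
func newAtomicTask(planID int64) *atomicTask {
	at := &atomicTask{}
	at.meta.Store(&snapshot{PlanID: planID})
	return at
}

// GetPlanID loads the current snapshot atomically and reads from it.
func (at *atomicTask) GetPlanID() int64 { return at.meta.Load().PlanID }

// GetNodeID loads the current snapshot atomically and reads from it.
func (at *atomicTask) GetNodeID() int64 { return at.meta.Load().NodeID }

// SetNodeID copies the current snapshot, updates the copy, and installs it
// with compare-and-swap, retrying if another writer got there first.
func (at *atomicTask) SetNodeID(id int64) {
	for {
		old := at.meta.Load()
		next := *old
		next.NodeID = id
		if at.meta.CompareAndSwap(old, &next) {
			return
		}
	}
}

func main() {
	at := newAtomicTask(7)

	var wg sync.WaitGroup
	wg.Add(2)
	// Writer goroutine, analogous to the node ID assignment loop.
	go func() {
		defer wg.Done()
		for i := int64(1); i <= 1000; i++ {
			at.SetNodeID(i)
		}
	}()
	// Reader goroutine, analogous to the scheduling loop.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = at.GetPlanID()
		}
	}()
	wg.Wait()

	fmt.Println(at.GetPlanID(), at.GetNodeID()) // prints: 7 1000
}
```

Compared with a mutex, this keeps readers wait-free, but every update allocates a new snapshot, so it fits best when writes are rare relative to reads.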

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

yanliang567 commented 2 months ago

/assign @XuanYang-cn
/unassign

XuanYang-cn commented 1 month ago

/assign @czs007
/unassign

stale[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.