vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0

Flaky vreplication tests due to race in resetting the global query channel. #13092

Open rohit-nayak-ps opened 1 year ago

rohit-nayak-ps commented 1 year ago

Overview of the Issue

The noblob PR https://github.com/vitessio/vitess/pull/12905 introduced a data race in the vreplication unit tests. The race comes from re-initializing globalDBQueries, which we do because we run all tests twice: once for the full image and once for noblob. The post-copy vcopier task runs in its own goroutine and is sometimes not scheduled until we do the reinitialization, so its read of globalDBQueries in ExecuteFetch() races with the write in setup().
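To illustrate the pattern, here is a minimal sketch of one way to make this kind of reset safe. It is not the actual test framework code: `queryLog`, `reset`, `record`, `pending`, and `runPass` are hypothetical names standing in for globalDBQueries, setup(), and the DB client's ExecuteFetch(). It assumes two things: the channel variable is guarded by a mutex, and each pass waits for its background goroutine before the next reset.

```go
package main

import (
	"fmt"
	"sync"
)

// queryLog is a hypothetical stand-in for the global query channel.
// The mutex guards the channel variable itself, which is what the
// race detector flags when it is reassigned during a concurrent read.
type queryLog struct {
	mu sync.RWMutex
	ch chan string
}

// reset swaps in a fresh channel, as a setup function would do
// between test passes.
func (q *queryLog) reset(size int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.ch = make(chan string, size)
}

// record logs a query without blocking. It returns false if the
// channel is nil or its buffer is full, in which case the query is dropped.
func (q *queryLog) record(query string) bool {
	q.mu.RLock()
	defer q.mu.RUnlock()
	select {
	case q.ch <- query:
		return true
	default:
		return false
	}
}

// pending reports how many queries are buffered in the current channel.
func (q *queryLog) pending() int {
	q.mu.RLock()
	defer q.mu.RUnlock()
	return len(q.ch)
}

// runPass simulates one test pass: reset the log, start a background
// task that records a query, and wait for that task before returning
// so the next reset cannot overlap with it.
func runPass(q *queryLog, name string) int {
	q.reset(8)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		q.record("post-copy action for " + name)
	}()
	wg.Wait()
	return q.pending()
}

func main() {
	q := &queryLog{}
	for _, mode := range []string{"full", "noblob"} {
		fmt.Printf("%s: %d query recorded\n", mode, runPass(q, mode))
	}
}
```

Either measure alone would silence the race detector in this sketch. Waiting for the goroutine also keeps a late query from one pass out of the next pass's channel, which the mutex on its own does not prevent.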

https://github.com/vitessio/vitess/actions/runs/4989944133/jobs/8937133568

WARNING: DATA RACE
Write at 0x000003c22910 by main goroutine:
  vitess.io/vitess/go/vt/vttablet/tabletmanager/vreplication.setup()
      /home/runner/work/vitess/vitess/go/vt/vttablet/tabletmanager/vreplication/framework_test.go:124 +0x5a
  vitess.io/vitess/go/vt/vttablet/tabletmanager/vreplication.TestMain.func1()
      /home/runner/work/vitess/vitess/go/vt/vttablet/tabletmanager/vreplication/framework_test.go:188 +0x164
  vitess.io/vitess/go/vt/vttablet/tabletmanager/vreplication.TestMain()
      /home/runner/work/vitess/vitess/go/vt/vttablet/tabletmanager/vreplication/framework_test.go:196 +0x53
  main.main()
      _testmain.go:219 +0x324

Previous read at 0x000003c22910 by goroutine 5724:
  vitess.io/vitess/go/vt/vttablet/tabletmanager/vreplication.(*realDBClient).ExecuteFetch()
      /home/runner/work/vitess/vitess/go/vt/vttablet/tabletmanager/vreplication/framework_test.go:471 +0x1be
  vitess.io/vitess/go/vt/vttablet/tabletmanager/vreplication.(*vreplicator).execPostCopyActions.func2()
      /home/runner/work/vitess/vitess/go/vt/vttablet/tabletmanager/vreplication/vreplicator.go:834 +0x2c4
  vitess.io/vitess/go/vt/vttablet/tabletmanager/vreplication.(*vreplicator).execPostCopyActions.func3()
      /home/runner/work/vitess/vitess/go/vt/vttablet/tabletmanager/vreplication/vreplicator.go:846 +0x301

Goroutine 5724 (finished) created at:
  vitess.io/vitess/go/vt/vttablet/tabletmanager/vreplication.(*vreplicator).execPostCopyActions()
      /home/runner/work/vitess/vitess/go/vt/vttablet/tabletmanager/vreplication/vreplicator.go:840 +0x10e5
  vitess.io/vitess/go/vt/vttablet/tabletmanager/vreplication.TestCancelledDeferSecondaryKeys.func3()
      /home/runner/work/vitess/vitess/go/vt/vttablet/tabletmanager/vreplication/vreplicator_test.go:592 +0x104

Reproduction Steps

-

Binary Version

`main`

Operating System and Environment details

-

Log Fragments

-

mattlord commented 10 months ago

@rohit-nayak-ps did we fix this already?