Open yujunz opened 4 years ago
I repeated the test several times and can reproduce the errors above. I also observed one crash:
=== RUN TestAllRepairs
fatal error: concurrent map writes
goroutine 190 [running]:
runtime.throw(0x12eae35, 0x15)
/usr/local/Cellar/go/1.14.4/libexec/src/runtime/panic.go:1116 +0x72 fp=0xc0001a5df8 sp=0xc0001a5dc8 pc=0x1035092
runtime.mapassign_faststr(0x12984c0, 0xc000109680, 0x12e6813, 0x3, 0xc0001eca48)
/usr/local/Cellar/go/1.14.4/libexec/src/runtime/map_faststr.go:211 +0x3f7 fp=0xc0001a5e60 sp=0xc0001a5df8 pc=0x1014677
github.com/yujunz/roshi/farm.(*mockCluster).Insert(0xc000109650, 0xc000109830, 0x1, 0x1, 0xc00009c2a0, 0x1506b00)
/Users/yujunz/Code/github.com/soundcloud/roshi/farm/mock_cluster_test.go:109 +0x290 fp=0xc0001a5f10 sp=0xc0001a5e60 pc=0x1250580
github.com/yujunz/roshi/farm.(*Farm).Insert.func1(0x1347d40, 0xc000109650, 0xc000109830, 0x1, 0x1, 0x152b8d0, 0x0)
/Users/yujunz/Code/github.com/soundcloud/roshi/farm/farm.go:65 +0x4f fp=0xc0001a5f50 sp=0xc0001a5f10 pc=0x1259abf
github.com/yujunz/roshi/farm.(*Farm).write.func2(0xc000026600, 0x12f79f8, 0xc000109830, 0x1, 0x1, 0x1347d40, 0xc000109650)
/Users/yujunz/Code/github.com/soundcloud/roshi/farm/farm.go:125 +0x62 fp=0xc0001a5fa8 sp=0xc0001a5f50 pc=0x1259c92
runtime.goexit()
/usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc0001a5fb0 sp=0xc0001a5fa8 pc=0x1066d71
created by github.com/yujunz/roshi/farm.(*Farm).write
/Users/yujunz/Code/github.com/soundcloud/roshi/farm/farm.go:124 +0x214
goroutine 1 [chan receive]:
testing.(*T).Run(0xc00012c900, 0x12e8bd6, 0xe, 0x12f7a78, 0x1083301)
/usr/local/Cellar/go/1.14.4/libexec/src/testing/testing.go:1043 +0x37e
testing.runTests.func1(0xc00012c480)
/usr/local/Cellar/go/1.14.4/libexec/src/testing/testing.go:1284 +0x78
testing.tRunner(0xc00012c480, 0xc00013fe10)
/usr/local/Cellar/go/1.14.4/libexec/src/testing/testing.go:991 +0xdc
testing.runTests(0xc00000e380, 0x14fda80, 0x14, 0x14, 0x0)
/usr/local/Cellar/go/1.14.4/libexec/src/testing/testing.go:1282 +0x2a7
testing.(*M).Run(0xc000148000, 0x0)
/usr/local/Cellar/go/1.14.4/libexec/src/testing/testing.go:1199 +0x15f
main.main()
_testmain.go:82 +0x135
goroutine 189 [runnable]:
github.com/yujunz/roshi/farm.TestAllRepairs(0xc00012c900)
/Users/yujunz/Code/github.com/soundcloud/roshi/farm/repair_strategies_test.go:37 +0x537
testing.tRunner(0xc00012c900, 0x12f7a78)
/usr/local/Cellar/go/1.14.4/libexec/src/testing/testing.go:991 +0xdc
created by testing.(*T).Run
/usr/local/Cellar/go/1.14.4/libexec/src/testing/testing.go:1042 +0x357
Could it be a problem in the mocked cluster?
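It seems to be caused by a race condition on a map in the mock cluster: the trace shows concurrent writes from goroutines started in (*Farm).write. Fixed in 96986754212a9d15784a52a04fd7a00887b30b6a by adding a mutex.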
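For anyone hitting the same "concurrent map writes" crash, here is a minimal sketch of the general pattern of guarding a map with a mutex. It is not the code from the commit above: `safeMockCluster`, its `scores` field, and `insertConcurrently` are hypothetical names chosen for illustration, and the real mock cluster has a different shape.

```go
package main

import (
	"strconv"
	"sync"
)

// safeMockCluster is a hypothetical stand-in for a test mock that is
// written to from several goroutines. It is not roshi's mockCluster.
type safeMockCluster struct {
	mu     sync.Mutex
	scores map[string]float64 // key -> score (illustrative layout)
}

func newSafeMockCluster() *safeMockCluster {
	return &safeMockCluster{scores: make(map[string]float64)}
}

// Insert stores a score for key. Holding the mutex serializes map writes,
// which is what prevents the runtime's "concurrent map writes" fatal error.
func (c *safeMockCluster) Insert(key string, score float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.scores[key] = score
}

// Len reports how many keys are stored. Reads take the lock as well,
// because an unguarded read racing with a write is also a data race.
func (c *safeMockCluster) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.scores)
}

// insertConcurrently writes n distinct keys from n goroutines and waits
// for all of them, roughly mimicking a fan-out write to one cluster.
func insertConcurrently(c *safeMockCluster, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.Insert(strconv.Itoa(i), float64(i))
		}(i)
	}
	wg.Wait()
}
```

Calling `insertConcurrently(newSafeMockCluster(), 100)` should finish without a crash. Without the lock in `Insert`, the same call can abort with the fatal error shown in the trace, and `go test -race` would normally flag it.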
There is something more going on here. I enabled running the tests in GitHub Actions and these tests kept failing (#49). I disabled them, merged, and then realized that I hadn't turned on the race detector (which was on before). Weirdly, with the race detector on, they pass (#50)! Locally for me, they also pass without race detection. Maybe the tests are still timing-sensitive, and fail if they run too fast?
I tried to port roshi to Go 1.14, but some tests related to repair failed: https://travis-ci.com/github/yujunz/roshi/builds/176022842#L377
However, the tests pass when I run them locally. What could be the problem?