mochi-mqtt / server

The fully compliant, embeddable high-performance Go MQTT v5 server for IoT, smarthome, and pubsub
MIT License
1.3k stars 223 forks source link

Race condition when running the redis example #364

Closed donstephan closed 9 months ago

donstephan commented 10 months ago

When running the redis example with the golang -race flag it outputs a data race when it initially starts up. I haven't observed this happening in a live environment but noticed it when I was doing some other debugging.

WARNING: DATA RACE
Write at 0x00c0003280a0 by goroutine 15:
  sync/atomic.StoreInt64()
      /usr/local/go/src/runtime/race_amd64.s:237 +0xb
  sync/atomic.StoreInt64()
      <autogenerated>:1 +0x15
  github.com/mochi-mqtt/server/v2.(*Server).eventLoop()
      /server/server.go:309 +0x41c
  github.com/mochi-mqtt/server/v2.(*Server).Serve.func2()
      /server/server.go:290 +0x33

Previous read at 0x00c0003280a0 by goroutine 14:
  github.com/mochi-mqtt/server/v2/hooks/storage/redis.(*Hook).OnSysInfoTick()
      /server/hooks/storage/redis/redis.go:364 +0x124
  github.com/mochi-mqtt/server/v2.(*Hooks).OnSysInfoTick()
      /server/hooks.go:197 +0x114
  github.com/mochi-mqtt/server/v2.(*Server).publishSysTopics()
      /server/server.go:1384 +0x16f6
  github.com/mochi-mqtt/server/v2.(*Server).Serve()
      /server/server.go:292 +0x2d7
  main.main.func2()
      /server/examples/persistence/redis/main.go:58 +0x2e

Goroutine 15 (running) created at:
  github.com/mochi-mqtt/server/v2.(*Server).Serve()
      /server/server.go:290 +0x22a
  main.main.func2()
      /server/examples/persistence/redis/main.go:58 +0x2e

Goroutine 14 (finished) created at:
  main.main()
      /server/examples/persistence/redis/main.go:57 +0xa04
==================

Clone the repo, start a redis server on localhost:6379 and run:

go run -race ./examples/persistence/redis/main.go

I will provide a fix when I get the time to do so.

donstephan commented 10 months ago

Issue is here: https://github.com/mochi-mqtt/server/blob/65c78534dcb5861dc8ea2163ac58ce603b6e9881/server.go#L290

go s.eventLoop() starts and the the s.publishSysTopics() runs right after it which could race with channels in the event loop. Would be very rare (if not impossible) to hit. If I'm reading it right, you'd somehow need a pause of 1 second between those two function executions to encounter this condition. Easiest fix is to just move the s.publishSysTopics() into the eventLoop() function so it runs before it listens to the tickers. Or it could just be moved above the eventLoop() execution in the server file.

thedevop commented 10 months ago

Thanks @donstephan for reporting this!

The issue is *system.Info was passed into hooks, and in the Redis hook, it directly dereferenced the live info: https://github.com/mochi-mqtt/server/blob/65c78534dcb5861dc8ea2163ac58ce603b6e9881/hooks/storage/redis/redis.go#L364.

I have provided a fix in PR #365, please verify.

donstephan commented 10 months ago

Looks great. Tested the PR with the redis example and no more race. Thank you much!

thedevop commented 9 months ago

@donstephan , this is incorporated into v2.4.6.