awnumar / memguard

Secure software enclave for storage of sensitive information in memory.
Apache License 2.0
2.49k stars 124 forks source link

bufferList Data Race Can Occur in Panic #125

Closed jpaskhay closed 4 years ago

jpaskhay commented 4 years ago

Describe the bug Running into a transient data race in our benchmark tests where the bufferList is being written to while another goroutine tries to read it in the Panic function. Looking at the code, the Panic function is the only read performed on the bufferList.list without a read lock.

To Reproduce It seems to be a timing issue in our benchmark, so it's tricky to reliably reproduce. In our specific failure, It seems to occur as a result of an mlock request failing (resulting in Panic being called) at the same time as a write to bufferList from a Buffer.Destroy call. Here are the relevant snippets of the data race stacks:

WARNING: DATA RACE
Read at 0x00c000098798 by goroutine 51:
  github.com/awnumar/memguard/core.Panic()
      /go/pkg/mod/github.com/awnumar/memguard@v0.19.1/core/exit.go:65 +0x133
  github.com/awnumar/memguard/core.NewBuffer()
      /go/pkg/mod/github.com/awnumar/memguard@v0.19.1/core/buffer.go:75 +0x775
  github.com/awnumar/memguard.NewBuffer()
      /go/pkg/mod/github.com/awnumar/memguard@v0.19.1/buffer.go:47 +0x3c
  github.com/awnumar/memguard.NewBufferRandom()
      /go/pkg/mod/github.com/awnumar/memguard@v0.19.1/buffer.go:215 +0x3c

...

Previous write at 0x00c000098798 by goroutine 25:
  github.com/awnumar/memguard/core.(*bufferList).remove()
      /go/pkg/mod/github.com/awnumar/memguard@v0.19.1/core/buffer.go:236 +0x21b
  github.com/awnumar/memguard/core.(*Buffer).Destroy()
      /go/pkg/mod/github.com/awnumar/memguard@v0.19.1/core/buffer.go:178 +0x238

Expected behaviour No data race warning. This may be relatively minor since even after a fix it would still panic, but could still be preferred to avoid the data race.

Screenshots N/A

System (please complete the following information):

Additional context Will submit PR for possible fix.

awnumar commented 4 years ago

Thanks for reporting this.

In the panic function I specifically left out the acquisition of locks since I assumed when panic is called then we have no guarantees about the state of the application and don't want a condition where we deadlock if a dead procedure has the lock.

Looking back at the code, I think this can only happen when the caller has acquired the lock to either the buffer list or the coffer and panics instead of returning. Also we should probably acquire the coffer lock so that we don't have a condition where panic is called during a re-key operation.

Going to attempt to write a unit test to trigger this race.

awnumar commented 4 years ago

Pushed a patch in #127, would welcome your thoughts on it. It should fix the race condition above.

jpaskhay commented 4 years ago

Pushed a patch in #127, would welcome your thoughts on it. It should fix the race condition above.

Cool. Should be able to circle back in the next day or so.