awnumar / memguard

Secure software enclave for storage of sensitive information in memory.
Apache License 2.0
2.49k stars 124 forks source link

Removes drop based finalizer #157

Closed EthanHeilman closed 1 month ago

EthanHeilman commented 2 months ago

This PR addresses an issue in memguard where the go garbage collector (GC) will trigger the finalizer on a LockedBuffer, zeroing out and freeing this buffer while the running code may still have a pointer to this buffer. This can result in the code being run against partially or fully zero'd out memory.

This issue was originally privately disclosed to memguard and it was agreed that due to the minor security impact, a public PR should be opened to track one possible solution to this problem.

The Fix

We fix this issue by removing the finalizer from the LockedBuffer. To wipe the LockedBuffer and free the associated memory the developer must call Destroy(). This is a reasonable fix because memguard provides additional security at the cost of requiring the developer managing this memory. The finalizer on the LockedBuffer mixes the approach of placing the responsibility on the developer with sometimes having the go GC also handle this responsibility.

This fix simplifies this so that the developer is always responsible for calling Destroy() on their LockedBuffer.

The Issue

Consider the following code:

  dataToLock := []byte(`abcdefghijklmnopqrstuvwxyz`)
  lb := memguard.NewBufferFromBytes(dataToLock)
  lbBytes := lb.Bytes()
  for i := 1; i < 30000; i++ {
      fmt.Printf("i=%d, len(lbBytes)=%d, lbBytes=%d\n", i, len(lbBytes), lbBytes)
      if lbBytes[0] != byte(97) {
          fmt.Printf("error: i=%d, len(lbBytes)=%d, lbBytes=%d\n", i, len(lbBytes), lbBytes)
      }
   }

At some point in the for loop lbBytes will eventually no longer equal 'a' (97) but will have its value changed to 0x0. Soon afterwards attempts to write to lbBytes will result in a fault.

i=9837, len(lbBytes)=26, lbBytes=[97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122] unexpected fault address 0x29af8971fe6 fatal error: fault [signal 0xc0000005 code=0x0 addr=0x29af8971fe6 pc=0x28fb0e]

Cause

When memguard creates a LockedBuffer, it associates a 16 byte value named drop with the LockedBuffer. This drop value acts as a standin for the data in the LockedBuffer so that when the go garbage collector (GC) detects there are zero references to the drop it calls the custom finalizer on the drop value and then the finalizer called Destroy on the LockedBuffer which in turn deletes the data in the LockedBuffer and then frees the page.

https://github.com/awnumar/memguard/blob/master/buffer.go#L23C1-L35C2

In the above code, once you enter the for loop, the LockedBuffer is out of a scope and this means the drop is out of scope as well. When the Garbage Collector does a sweep, it will call the custom finalizer on the drop, which will in turn delete the buffer, this will zero out lbBytes and then deallocate the memory for lbBytes.

Since the timing of this depends on the GC (garbage collector) this can result in: (a). no error (most of the time) : (b). a memory fault error (sometimes) , (c). the buffer being fully or partially overwritten with zeros (rarely).

(a) and (b) are likely not at all security critical, but the randomness and rareness of this occurring makes it a tricky problem to debug. It is case (c) that could represent a security issue.

(c) could be understood as working as intended since by zeroing out the buffer when the LockedBuffer goes out of scope memguard achieves the feature: "Accidental memory leaks are mitigated against by harnessing the garbage-collector to automatically destroy containers that have become unreachable". However it can have a limited security impact (see below).

Security impact

If memguard LockedBuffers are used for keying material then encrypting a message with a secret key which is partially or fully changed to zero may result in a loss of security. It may not be clear to an implementer the security difference between:

// This is insecure, very rarely you might get an all zero key
key := lb.Bytes()
ct := Encrypt(key, msg)
// This is secure
ct := Encrypt(lb.Bytes(), msg)

I consider the security impact here to be minimal because:

  1. Most projects create a LockedBuffer that lives in a struct that exists for the lifetime of the process so GC is never called until the process ends. Looking at all the projects that use memguard on github I was unable to find a single one that this behavior would impact.
  2. If they are working with ephemeral secrets then they should be seting defer lb.Destroy() this prevents the GC from eating the buffer. The defer keeps a reference to lb and GC isn't called until it leaves the function. I experimentally tested this.
  3. If they are not working with ephemeral secrets, but instead using the LockedBuffer to manage secrets needed through out the lifetime of the process, then the LockedBuffer shouldn't be garbage collected at all.

Broken Tests

The github action tests are failing, however the unittests work when I run go test ./... locally. Looking at the project commit history these tests have been failing for several months. I don't think these test failure are related to my PR.