evilsong / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

64bit NoBarrier_Store and NoBarrier_Load on x86 Windows are incorrect #451

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
r148 added implementations for these two functions on x86 Windows, but they 
aren't doing the right thing.  They are currently only getting used as part of 
generating a pseudo-random number for the spinlock implementation, which is 
probably why it wasn't caught. (Note:  The random number will actually be the 
same every time.)

inline Atomic64 NoBarrier_Load(volatile const Atomic64* ptrValue)
{
  Atomic64 value;
  __asm {
    movq mm0, ptrValue; // Use mmx reg for 64-bit atomic moves
    movq value, mm0;
    emms; // Empty mmx state to enable FP registers
  }
  return value;
}

In the code above, the first instruction is moving the value of ptrValue into 
mm0.  Even putting [ptrValue] will not do the right thing here and will emit 
the identical instruction.

This needs to be a double indirection.  Here is the correct code.
inline Atomic64 NoBarrier_Load(volatile const Atomic64* ptrValue)
{
  Atomic64 value;
  __asm {
    mov eax, ptrValue;
    movq mm0, [eax]; // Use mmx reg for 64-bit atomic moves
    movq value, mm0;
    emms; // Empty mmx state to enable FP registers
  }
  return value;
}

Similarly, NoBarrier_Store should be the following.

inline void NoBarrier_Store(volatile Atomic64* ptrValue, Atomic64 value) {
 __asm {
    movq mm0, value;  // Use mmx reg for 64-bit atomic moves
    mov eax, ptrValue;
    movq [eax], mm0;
    emms;            // Empty mmx state to enable FP registers
  }
}

Original issue reported on code.google.com by jimsprin...@gmail.com on 25 Jul 2012 at 1:06

GoogleCodeExporter commented 9 years ago

Original comment by chapp...@gmail.com on 3 Nov 2012 at 4:55

GoogleCodeExporter commented 9 years ago
r181 | chappedm@gmail.com | 2012-11-04 18:08:17 -0500 (Sun, 04 Nov 2012) | 2 
lines

issue-451: Fixed incorrect assembly for 64-bit barrier load and store on 
windows platforms.

Original comment by chapp...@gmail.com on 4 Nov 2012 at 11:08