is00hcw / gperftools

Automatically exported from code.google.com/p/gperftools

Exception in CentralFreeList::RemoveRange->FetchFromSpans() #576

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The exception reproduces in a test application that is statically linked against tcmalloc 
from gperftools 2.1. 

I am testing tcmalloc on Windows 7 x64, compiling with Intel Compiler 14.
The exception occurs in both x86 and x64 release builds with the /O2 compiler switch. With 
/Od everything works.
I suspect a race condition.

Here is the information about the exception:
int CentralFreeList::RemoveRange(void **start, void **end, int N) {
...
  if (tail != NULL) {
    SLL_SetNext(tail, NULL);
    head = tail;
    result = 1;
    while (result < N) {
      void *t = FetchFromSpans(); <== Exception is here !
      if (!t) break;
      SLL_Push(&head, t);
      result++;
    }
  }
  lock_.Unlock();
  *start = head;
  *end = tail;
  return result;
}

      void *t = FetchFromSpans();
013EFA7D  lea         edx,[edi+20h]  
013EFA80  cmp         edx,dword ptr [edx+8]  
013EFA83  je          tcmalloc::CentralFreeList::RemoveRange+15Ah (13EFA4Ah)  
013EFA85  mov         dword ptr [esp+4],esi  
013EFA89  mov         dword ptr [esp],ecx  
013EFA8C  mov         edx,dword ptr [edi+28h]  
013EFA8F  mov         ecx,dword ptr [edx+10h]  
013EFA92  inc         word ptr [edx+14h]  
013EFA96  mov         ebp,dword ptr [ecx]  <== Exception here, ecx is 0 !
013EFA98  test        ebp,ebp  
013EFA9A  mov         dword ptr [edx+10h],ebp  
013EFA9D  je          tcmalloc::CentralFreeList::RemoveRange+1CFh (13EFABFh)  
013EFA9F  mov         ebp,dword ptr [edi+3Ch]  
013EFAA2  dec         ebp  
013EFAA3  mov         dword ptr [edi+3Ch],ebp  

Inlined function:
void* CentralFreeList::FetchFromSpans() {
  if (tcmalloc::DLL_IsEmpty(&nonempty_)) return NULL;
  Span* span = nonempty_.next;

  ASSERT(span->objects != NULL);
  span->refcount++;
  void* result = span->objects;
  span->objects = *(reinterpret_cast<void**>(result)); <== Exception is here, result=0 !
  if (span->objects == NULL) {
    // Move to empty list
    tcmalloc::DLL_Remove(span);
    tcmalloc::DLL_Prepend(&empty_, span);
    Event(span, 'E', 0);
  }
  counter_--;
  return result;
}

Stack:
Test.exe!RemoveRange - central_freelist.cc:269
Test.exe!FetchFromCentralCache - thread_cache.cc:159
Test.exe!do_malloc_no_errno - tcmalloc.cc:1095
Test.exe!do_realloc_with_callback - tcmalloc.cc:1271
Test.exe!tc_realloc - tcmalloc.cc:1597

realloc() is called to decrease the size of a previously allocated buffer.

Original issue reported on code.google.com by zndmi...@gmail.com on 26 Sep 2013 at 10:20

GoogleCodeExporter commented 9 years ago
Please attach a test program. Otherwise it's very hard to help you help us.

Original comment by alkondratenko on 29 Sep 2013 at 2:43

GoogleCodeExporter commented 9 years ago
Unfortunately, I cannot attach the test program, because it is proprietary software.

However, I did some debugging, and I can now say that the bug is related to the 
combination of Intel compiler optimization and tcmalloc: the problem occurs in a 
multithreaded program, and the exception does not happen when the "volatile" qualifier 
is added to the "next" pointer within the Span struct.

struct Span {
...
  Span* volatile        next;           // Used when in link list
...
};

I hope this helps.

Original comment by zndmi...@gmail.com on 2 Oct 2013 at 9:58

GoogleCodeExporter commented 9 years ago
That's somewhat helpful, but not by much.

Are you seeing it only with the Intel compiler?

The volatile workaround hints that the compiler is moving the load of the next pointer to 
before the lock is taken. You're on Windows, where we use MSVC compiler intrinsics for 
atomic ops, and _maybe_ (I'll have to double-check that later) some of those do 
not imply a compiler barrier.

If you're sufficiently familiar with the subject, I invite you to take a look at the 
Spinlock::Lock method and its invocation of Acquire_CompareAndSwap (as 
implemented on Windows).
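
To make the barrier concern concrete: acquiring a lock has to prevent loads of the protected data (such as a list's next pointer) from being moved ahead of the acquisition. The following is a minimal sketch in portable C++11 using std::atomic_flag, not the Windows intrinsics that gperftools uses. ToySpinLock and CountWithLock are hypothetical names made up for illustration and are not part of gperftools.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical spinlock, for illustration only (not gperftools' SpinLock).
class ToySpinLock {
 public:
  void Lock() {
    // Acquire ordering: memory accesses that follow Lock() in program order
    // may not be moved before this operation by the compiler or the CPU.
    while (flag_.test_and_set(std::memory_order_acquire)) {
      std::this_thread::yield();  // lock is held by another thread, retry
    }
  }

  void Unlock() {
    // Release ordering: memory accesses that precede Unlock() may not be
    // moved after this operation.
    flag_.clear(std::memory_order_release);
  }

 private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Increments a plain (non-atomic) counter from several threads under the
// lock. If Lock()/Unlock() did not act as barriers, updates could be lost.
long CountWithLock(int threads, int iterations) {
  ToySpinLock lock;
  long counter = 0;
  std::vector<std::thread> workers;
  for (int i = 0; i < threads; ++i) {
    workers.emplace_back([&lock, &counter, iterations] {
      for (int j = 0; j < iterations; ++j) {
        lock.Lock();
        ++counter;
        lock.Unlock();
      }
    });
  }
  for (auto& w : workers) w.join();
  return counter;
}
```

If a lock primitive constrained only the hardware and not the compiler, an optimizer could in principle hoist the read of the protected data above Lock(), which is the scenario the volatile workaround would mask.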

Another potential issue is the Intel compiler's stronger aliasing assumptions: 
the SLL_ methods cast pointers in a way that appears to be unsafe under strict 
aliasing.
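
For illustration, one common way to store a link in the first word of a free block without casting the block to another pointer type is to copy the pointer bytes with memcpy. This is only a sketch of that general technique, not a proposed patch; the Toy* helpers below are hypothetical names and are not the SLL_ functions from gperftools.

```cpp
#include <cstring>

// Hypothetical helpers, for illustration only. Each free block keeps the
// address of the next free block in its first sizeof(void*) bytes.

// Reads the link stored at the start of `block`.
inline void* ToyNext(void* block) {
  void* next;
  std::memcpy(&next, block, sizeof(next));  // byte copy, no type-punned load
  return next;
}

// Stores `next` as the link at the start of `block`.
inline void ToySetNext(void* block, void* next) {
  std::memcpy(block, &next, sizeof(next));
}

// Pushes `block` onto the front of the list headed by `*head`.
inline void ToyPush(void** head, void* block) {
  ToySetNext(block, *head);
  *head = block;
}

// Pops the first block from a non-empty list headed by `*head`.
inline void* ToyPop(void** head) {
  void* block = *head;
  *head = ToyNext(block);
  return block;
}
```

Compilers typically turn a fixed-size memcpy like this into a single load or store, so the sketch should cost about the same as the cast-based form.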

Original comment by alkondratenko on 2 Oct 2013 at 10:34

GoogleCodeExporter commented 9 years ago
I see it only with the Intel compiler.

Actually, after additional debugging I realized that it is a bug in Intel 
Compiler 14.
The compiler incorrectly inlines the FetchFromSpans() function into the loop.

Original comment by zndmi...@gmail.com on 12 Oct 2013 at 2:59

GoogleCodeExporter commented 9 years ago
Closing then.

I think there's still a bit of mystery around the barrier-ness of those compiler 
intrinsics, i.e. whether the Intel folks actually believe they're free to move code 
across those calls, and whether that belief is fair.

Thanks for raising this again.

Original comment by alkondratenko on 12 Oct 2013 at 9:31

GoogleCodeExporter commented 9 years ago
The problem in this case was not the barrier-ness of the compiler intrinsics. 
The bug was that the Intel Compiler 14 optimizer moved this check in FetchFromSpans()

if (tcmalloc::DLL_IsEmpty(&nonempty_)) return NULL;

out of this loop

   while (result < N) {
      void *t = FetchFromSpans(); <== Exception is here !
      if (!t) break;
      SLL_Push(&head, t);
      result++;
    }

while inlining FetchFromSpans() into the loop.
As a result, the check runs only once, before the first call to FetchFromSpans().
The exception occurs when FetchFromSpans() is called a second time from the loop after the 
nonempty list has already become empty: the empty state is not detected because 
DLL_IsEmpty(&nonempty_) is not called again.
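
The effect described above can be modelled with a small, self-contained sketch. It is not tcmalloc code and not the code the compiler generated: ToyFreeList, RemoveRangeCorrect and CountUncheckedFetches are hypothetical names, a vector of ints stands in for the span list, and the "miscompiled" variant counts the unchecked fetch instead of crashing.

```cpp
#include <vector>

// Hypothetical stand-in for the central free list, for illustration only.
struct ToyFreeList {
  std::vector<int> nonempty;  // plays the role of nonempty_

  bool IsEmpty() const { return nonempty.empty(); }

  // Intended semantics: the emptiness check runs on every call.
  // Returns -1 (standing in for NULL) when nothing is left.
  int Fetch() {
    if (IsEmpty()) return -1;
    int obj = nonempty.back();
    nonempty.pop_back();
    return obj;
  }
};

// Correct loop: Fetch() re-checks emptiness on each iteration.
int RemoveRangeCorrect(ToyFreeList& list, int n) {
  int result = 0;
  while (result < n) {
    int t = list.Fetch();
    if (t < 0) break;
    ++result;
  }
  return result;
}

// Models the loop as described in the report: the emptiness check is
// evaluated once, before the loop. Returns how many fetches would have
// touched an empty list (the real code dereferences NULL at that point).
int CountUncheckedFetches(ToyFreeList& list, int n) {
  if (list.IsEmpty()) return 0;  // hoisted check: runs only once
  int unchecked = 0;
  int result = 0;
  while (result < n) {
    if (list.nonempty.empty()) {
      // Instrumentation added for this sketch only; the miscompiled loop
      // has no such test and faults here instead.
      ++unchecked;
      break;
    }
    list.nonempty.pop_back();
    ++result;
  }
  return unchecked;
}
```

With two objects available and five requested, RemoveRangeCorrect stops cleanly after two fetches, while the hoisted variant reaches a third fetch on an empty list. When the list holds at least N objects, the two variants behave the same, which may be why the fault only shows up intermittently.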

I hope this clears up some of the mystery around this case.

Original comment by zndmi...@gmail.com on 21 Oct 2013 at 11:27