Intrepid / upc-specification

Automatically exported from code.google.com/p/upc-specification
0 stars 1 forks source link

upc_lock_free: the effect of free-ing a UPC lock held by another thread is undefined #57

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Currently, in version 1.2 of the specification (7.2.4.4), upc_lock_free is 
defined as follows (excerpts):

2 The upc_lock_free function frees all resources associated with the 
dynamically allocated upc_lock_t pointed to by ptr. If ptr is a null pointer, 
no action occurs. Otherwise, if the argument does not match a pointer earlier 
returned by the upc_global_lock alloc or upc_all_lock alloc function, or if the 
lock has been deallocated by a previous call to upc_lock_free, the behavior is 
undefined.

3 upc_lock_free succeeds regardless of whether the referenced lock is currently 
unlocked or currently locked (by any thread).

4 Any subsequent calls to locking functions from any thread using ptr have
undefined effects. This also applies to any thread currently calling upc_lock.

Berkeley has a lock test: upc-tests/bugzilla/locktest.upc, which reads (in 
part) as follows:

  upc_barrier;
  MSG0("*** testing all_alloc/lock/free pounding test...\n");
  upc_barrier;
  for (int i = 0; i < iters; i++) { /* one thread acquires, different one frees
 */
    upc_lock_t *alock = upc_all_lock_alloc();
    if (MYTHREAD == (i%THREADS)) upc_lock(alock);
    upc_barrier;
    if (MYTHREAD == ((i+1)%THREADS)) upc_lock_free(alock);
  }
[...]
upc_barrier;

Free-ing an acquired lock on a different thread than the current lock holder is 
problematic for some implementations (namely GUPC) in that the lock holder 
allocated a lock queue entry (ala MCS locks) when it acquired the lock and 
there is no easy method to deterministically release that resource.  As this 
test proceeds more and more locks will be created and locked, free'd on a 
different thread and those event queue entries won't be reclaimed.  This 
implementation likely violates both the "frees all resources" and 
"upc_lock_free succeeds regardless" parts of the specification above.

The following two proposed change alternatives are described below.

1) Mandate that the caller of upc_lock_free() must be the current holder 
(locker) of the lock.  This has the property of ensuring that the lock is 
free'd only when other threads aren't holding it or waiting on it and of course 
should make it much easier to free all resources associated with the lock.

2) Make the behavior of free-ing a lock currently held by another thread 
undefined.

I happen to prefer #1, because it doesn't leave the behavior as undefined, yet 
avoids the complications of attempting to free a lock potentially held by 
another thread or where other threads may be waiting for the lock to become 
available.

NOTE: in an implementation where upc_lock_free() *recycles* lock pointers, it 
is interesting to consider what the phrase "Any subsequent calls to locking 
functions from any thread using ptr have undefined effects" actually means.  In 
the test above, if a subsequent call to upc_all_alloc() returns the same value 
of ptr as an earlier call and that pointer value is used in a call to 
upc_lock(), is the intended behavior undefined?  Likely not.  (Hopefully unique 
lock ID's will not have to be introduced to resolve this terminology issue.)

Original issue reported on code.google.com by gary.funck on 29 Jun 2012 at 10:46

GoogleCodeExporter commented 9 years ago
After hitting "send", I realized that this can't be guaranteed: "This has the 
property of ensuring that the lock is free'd only when other threads aren't 
[...] waiting on it [...].

Therefore, I'll drop back to the proposed alternative that agrees with the 
subject line.

2) Make the behavior of free-ing a lock currently held by another thread 
undefined.

This reduces to the following defined behavior.

Either the lock must be held by the caller of upc_lock_free() or the lock must 
be in the unlocked state.

Original comment by gary.funck on 29 Jun 2012 at 10:54

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Off-list, Paul noted that upc_all_lock_free() needs to guarantee that all 
threads have called the function before the lock is free'd.  Hopefully, the 
third time's a charm. Here's the revised sketch of a naive implementation:

upc_all_lock_free(upc_lock_t *ptr)
{
  upc_barrier;
  if (upc_threadof(ptr) == MYTHREAD)
     upc_lock_free (ptr);
}

This implementation uses undefined behavior: applying upc_threadof() to a 
pointer to a UPC lock, but for the sake of brevity illustrates one possible 
implementation within a UPC runtime.

Per the proposal, when upc_lock_free() is called, it must be either in the 
unlocked state or owned by the calling thread, which in this example is the 
thread that has affinity to the storage allocated for the lock.  As mentioned, 
my preference is that we require that the lock is in the unlocked state when 
upc_lock_free() is called.  A weaker alternative requirement is that the lock 
may also be currently locked by the calling thread.

Original comment by gary.funck on 2 Jul 2012 at 3:37

GoogleCodeExporter commented 9 years ago
> Free-ing an acquired lock on a different thread than the current lock holder 
is 
> problematic for some implementations (namely GUPC) in that the lock holder 
allocated 
> a lock queue entry (ala MCS locks) when it acquired the lock and there is no 
easy 
> method to deterministically release that resource.  

This sounds like a bug in the GUPC implementation that should be fixed rather 
than changing the spec to accommodate the implementations. There are many 
possible strategies to achieve the specified semantics and ensure the correct 
release of resources with a few additional bits of storage or a level of 
indirection. Note the restrictions already ensure the only possible states of 
the lock being freed are "unlocked, nobody trying to lock" and "locked by one 
thread, who will not release it, nobody else trying". These two states should 
be already be easily discernable from the lock data structure, so I don't 
understand your source of non-determinism.

If the thread performing the lock free is unable to arrange to directly release 
the internal resources associated with the thread holding the lock (why?), then 
it could alternately write a bit somewhere to ensure the thread holding the 
lock eventually releases the storage during some future housekeeping period.

Perhaps you should provide more details of your implementation showing why this 
is not feasible?

Original comment by danbonachea on 3 Aug 2012 at 12:26

GoogleCodeExporter commented 9 years ago
"If the thread performing the lock free is unable to arrange to directly 
release the internal resources associated with the thread holding the lock 
(why?), then it could alternately write a bit somewhere to ensure the thread 
holding the lock eventually releases the storage during some future 
housekeeping period."

The GUPC runtime implementation has two properties relevant to this discussion.
1) There are no active message (RPC) capabilities.  Thus thread T2 cannot 
directly cause code to be run in thread T1 to (for example) to clean up 
resources associated with a locked lock.
2) In the current implementation of MCS locks, unused wait queue entries are 
locally cached in a free list that is chained via conventional "C" pointers.  
Each free list entry contains a PTS that refers to the UPC shared storage that 
has been allocated for the wait queue entry.  For simplicity, the free list 
entries have a configuration defined maximum number of wait queue entries 
(1024).  This configuration limit places an upper bound on the number of locks 
that a given UPC thread can hold locked at a given time.  A upc_unlock() call 
will release the allocated lock wait queue entry back to the local free list.

From my reading of the spec., the only "dangling resource" situation currently 
defined is exactly that exhibited by the test case: thread T2 frees a lock L 
that is currently locked by thread T1 (where T1 != T2).

There may be a slight ambiguity in the spec. where it states the following 
about upc_lock_free:

"Any subsequent calls to locking functions from any thread using ptr have
undefined effects. This also applies to any thread currently calling upc_lock."

I suggest that this be amended to read:

"Any subsequent *or concurrent* calls to locking functions from any thread 
using ptr have undefined effects. This also applies to any thread currently 
calling upc_lock."

*...* marks the additional clarification/restriction.

I propose that following restriction be added:

"An attempt to free a lock currently locked by a thread different from the 
thread calling upc_lock_free has undefined effects."

Original comment by gary.funck on 14 Aug 2012 at 4:52

GoogleCodeExporter commented 9 years ago
""Any subsequent *or concurrent* calls to locking functions from any thread 
using ptr have undefined effects. This also applies to any thread currently 
calling upc_lock.""

This suggestion is already covered/subsumed by Issue 49.

"I propose that following restriction be added:

"An attempt to free a lock currently locked by a thread different from the 
thread calling upc_lock_free has undefined effects.""

This is a significant reversal of the intended behavior, as demonstrated by 
semantic 3, which directly countermands the suggested language:

"upc_lock_free succeeds regardless of whether the referenced lock is currently 
unlocked or currently locked (by any thread)."

I highly recommend you look for other ways to solve this resource leak in GUPC 
- I'm not at all convinced your bug represents any fundamental difficulty that 
motivates a "fix" via restricting user behavior. Why can't the thread freeing 
the lock write a bit to the wait queue entry indicating the lock has been 
freed, which can then be checked and reaped by the appropriate thread next time 
you need a wait queue entry? In other words, the freeing thread which doesn't 
hold the lock enqueues itself as a a special kind of "waiter" which is "waiting 
to free", and later on when the thread that holds the lock is low on resources 
it can scan for and find the "waiting to free" entry which tell it the lock 
should immediately be freed.

In any case, the restriction requested in comment 6 would break backwards 
compatibility, so that defers it to 1.4 or later.

Original comment by danbonachea on 14 Aug 2012 at 5:21

GoogleCodeExporter commented 9 years ago
I agree with Gary that this change would be a good thing, but I agree with Dan 
that Gary's suggested change is too strong to make in UPC 1.3 given that it 
would directly violate Semantic 3 of upc_lock_free().

I consider it to be a general design flaw of UPC that it permits resources 
acquired by one thread to be released by another thread.  For example, in 
addition to changing upc_lock_free(), I would like to see language for 
upc_free() that says that it cannot be used to free memory that a different 
thread has allocated using upc_alloc().

Original comment by johnson....@gmail.com on 14 Aug 2012 at 6:01