Intrepid / upc-specification

Automatically exported from code.google.com/p/upc-specification
0 stars 1 forks source link

Either deprecate or strengthen upc_addrfield() #107

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Steve's comment, split off from issue 106:

Consider the following code:

shared int *S1;
shared [] int *S2;

S2 = S1;

if ( upc_addrfield(S1) == upc_addrfield(S2) ) {
    printf("Match\n");
}

With the proposed change (as well as UPC 1.2 apparently), it is undefined 
whether or not anything is printed.  We don't require anywhere that 
pointers-to-shared with different types that point to the same object produce 
the same result when passed to upc_addrfield(), merely that such pointers shall 
compare equal.  Since I don't think this is intended, we probably need a 
stronger statement somewhere to make upc_addrfield() useful.

Original issue reported on code.google.com by danbonachea on 27 Feb 2013 at 11:06

GoogleCodeExporter commented 9 years ago
The current semantics of upc_addrfield() (unchanged since UPC 1.0, Feb 2001):

  The upc_addrfield function returns an implementation-defined value reflecting
  the “local address” of the object pointed to by the pointer-to-shared
  argument.
  footnote: This function is used in defining the semantics of pointer-to-shared    
            arithmetic in Section 6.4.2

This has been in place unchanged for 12 years now without complaint. In my 
experience the function itself has very limited utility in real programs, 
outside of printing pointer values for debugging (which should probably be 
handled in a better way). Strengthening the requirement in any way has the 
potential to render current implementations non-compliant with very little 
pay-off, so I don't think we should try to "squeeze" in a change in the next 
three days to make 1.3. 

Placing it on the 1.4 milestone.

Original comment by danbonachea on 27 Feb 2013 at 11:32

GoogleCodeExporter commented 9 years ago
The code fragment in comment #0 calls upc_addrfield on an uninitialized pointer 
value. Lets assume they are statically allocated and not automatic variables, 
as the latter would be a case of garbage-in-garbage-out. S1 and S2 are thus 
initialized to null pointers which are then passed to the function. 
(Incidentally the assignment is also missing a required cast).

Nothing in the specification defines the behavior of upc_addrfield(NULL). We 
could consider strengthening this, but I don't really see a problem with 
leaving it implementation-defined, since it's a rather silly thing for the user 
to do in the first place.

The example I think Steve MEANT to provide was one where S1 and S2 point at an 
actual object, as below. I've also modified one of the pointers types to 
include phase information to make it more interesting.

Example A:
----------
#include <upc.h>
#include <stdio.h>

shared int x;
int main() {
  shared int *S1 = &x;
  shared [10] int *S2 = (shared [10] int *)S1;

  if ( upc_addrfield(S1) == upc_addrfield(S2) ) printf("Match\n");

  if ( upc_addrfield(S1) == upc_addrfield(S1) ) printf("Match\n");

  return 0;
}

So the question at hand is whether upc_addrfield is guaranteed to return the 
same result when passed a pointer to the same location. For the moment lets 
ignore the fact the user could/should just use a direct pointer equality 
operator to test PTS equality.
I don't think the type of the actual argument matters at all in this 
discussion, because standard call conversion will automatically convert any 
argument to (shared void *) before the library function ever sees it. The same 
behavioral question arises when you pass the exact same argument twice, as in 
my second test above.

I believe that for "free-standing" objects that are not part of an array (as 
above), the return value is currently completely implementation-defined (which 
by the way differs from "undefined", in that implementations are required to 
document their behavior). This means a perverse implementation could 
potentially return a different value on each call with the same pointer to a 
free-standing object. We could consider adding stronger requirements here, but 
I don't really see a motivation to do so. There are better ways to test pointer 
equality, and the addrfield of the pointer to such an object is essentially a 
useless piece of information.

Now lets consider upc_addrfield() called on pointers to shared array elements. 
Assume the shared array contains at least two elements with local affinity, 
otherwise it's essentially equivalent to the "free-standing object" case above. 

Example B:
----------
#include <upc.h>
#include <stdio.h>

shared int A[2*THREADS];
int main() {
  shared int *S1 = &A[MYTHREAD];         // an element with affinity to MYTHREAD
  shared int *S2 = &A[MYTHREAD+THREADS]; // any other element with affinity to MYTHREAD
  int *P1 = (int *)S1;
  int *P2 = (int *)S2;

  ptrdiff_t S1addr = (ptrdiff_t) upc_addrfield(S1);
  ptrdiff_t S2addr = (ptrdiff_t) upc_addrfield(S2);

  if ((S2addr - S1addr) == (P2 - P1)*sizeof(int)) printf("Correct.\n"); // directly required by 6.4.2

  if (S2addr == S1addr + (P2 - P1)*sizeof(int)) printf("Correct.\n"); // same equation, mathematically transformed

  if ((ptrdiff_t) upc_addrfield(S2) == S1addr + (P2 - P1)*sizeof(int)) printf("Correct.\n"); // duplicate call to upc_addrfield

  return 0;
}

This example is closely modeled on 6.4.2 to clarify my argument. 
The first test is exactly the requirement imposed by 6.4.2, and must pass on 
all compliant implementations. 
The second test is mathematically identical, I've just added S1addr to both 
sides of the equation and applied standard algebraic simplification. Thus it 
also must succeed in all compliant implementations.
The third test introduces a second "duplicate" call to upc_addrfield(S2), to 
test the case Steve is worried about that might return a different answer. 
However by the equation in the second test, the required return value is 
already completely constrained by the requirements of 6.4.2. Returning a 
different value on the second call to upc_addrfield(S2) would thus be 
non-compliant.

Original comment by danbonachea on 27 Feb 2013 at 12:15

GoogleCodeExporter commented 9 years ago
> The example I think Steve MEANT to provide was one where S1 and S2 point at 
an actual object, as below. I've also modified one of the pointers types to 
include phase information to make it more interesting.

My code fragment was meant to mimic the code fragment in 6.4.2 5.

> I don't think the type of the actual argument matters at all in this 
discussion, because standard call conversion will automatically convert any 
argument to (shared void *) before the library function ever sees it. The same 
behavioral question arises when you pass the exact same argument twice, as in 
my second test above.

Not necessarily.  Nothing prohibits encoding the type in the representation of 
the shared pointer itself (perhaps to have a scaled offset internally and/or 
much better run-time warnings/errors) and preserving the type info when casting 
to a generic pointer-to-shared.

> This means a perverse implementation could potentially return a different 
value on each call with the same pointer to a free-standing object. We could 
consider adding stronger requirements here, but I don't really see a motivation 
to do so. There are better ways to test pointer equality, and the addrfield of 
the pointer to such an object is essentially a useless piece of information.

The strongest motivation I see is that upc_addrfield()+upc_threadof() is the 
only way to print the address of a shared object.  Requiring that all pointers 
to the same object get the same result from upc_addrfield() means that users 
can easily and usefully print the upc_addrfield()+upc_threadof() combination 
when debugging.

> This example is closely modeled on 6.4.2 to clarify my argument. 

And therein lies the problem.  We've ONLY define upc_addrfield() for pointers 
whose referenced type has block size 1 (block size 0 with the proposed fix for 
issue 106).  Consider the following slightly different code:

#include <upc.h>
#include <stdio.h>

shared [2] int A[2*THREADS];
int main() {
  shared int *S1 = &A[2*MYTHREAD];   // an element with affinity to MYTHREAD
  shared int *S2 = &A[2*MYTHREAD+1]; // any other element with affinity to MYTHREAD

  shared [2] int *SB1 = &A[2*MYTHREAD];
  shared [2] int *SB2 = &A[2*MYTHREAD+1];

  int *P1 = (int *)S1;
  int *P2 = (int *)S2;

  int *PB2 = (int *)S1_1;
  int *PB2 = (int *)S2_1;

  ptrdiff_t S1addr = (ptrdiff_t) upc_addrfield(S1);
  ptrdiff_t S2addr = (ptrdiff_t) upc_addrfield(S2);

  ptrdiff_t SB1addr = (ptrdiff_t) upc_addrfield(SB1);
  ptrdiff_t SB2addr = (ptrdiff_t) upc_addrfield(SB2);

  if ((S2addr - S1addr) == (P2 - P1)*sizeof(int)) printf("Correct.\n"); // directly required by 6.4.2

  if (S2addr == S1addr + (P2 - P1)*sizeof(int)) printf("Correct.\n"); // same equation, mathematically transformed

  if ((ptrdiff_t) upc_addrfield(S2) == S1addr + (P2 - P1)*sizeof(int)) printf("Correct.\n"); // duplicate call to upc_addrfield

  if ((S2addr - S1addr) == (P2 - P1)*sizeof(int)) printf("Correct?\n"); // directly required by 6.4.2? NO

  if (S2addr == S1addr + (P2 - P1)*sizeof(int)) printf("Correct?\n"); // same equation, mathematically transformed

  if ((ptrdiff_t) upc_addrfield(S2) == S1addr + (P2 - P1)*sizeof(int)) printf("Correct?\n"); // duplicate call to upc_addrfield

  return 0;
}

Because we use an example with a fixed block size in 6.4.2 5-6, as opposed to 
something similar to 6.4.2 3-4, the results of upc_addrfield() are only 
constrained for pointers whose referenced type has that single block size.  In 
the code above, clearly the S1addr and S2addr are constrained by 6.4.2 4.  But 
what about SB1addr and SB2addr?  The spec does not currently place ANY 
constraints on their values.

Original comment by sdvor...@cray.com on 27 Feb 2013 at 2:37

GoogleCodeExporter commented 9 years ago
>The strongest motivation I see is that upc_addrfield()+upc_threadof() is the 
only way 
> to print the address of a shared object. ..print the 
upc_addrfield()+upc_threadof() 
>combination when debugging.

Yes, this is the debugging motivation I was referring to in comment #1. However 
upc_addrfield is rather poorly designed to suit that purpose, so if that's the 
only usage case we can come up with to motivate the existence of this library 
function, I'd vote to deprecate it and replace it with a function that is 
better designed for debugging pointer values and formatting them for human 
consumption, akin to C99's printf("%p"). 
For a concrete example, see 
http://upc.lbl.gov/docs/user/index.shtml#bupc_dump_shared

> Because we use an example with a fixed block size in 6.4.2 5-6, as opposed to 
something similar to 6.4.2 3-4, the results of 
> upc_addrfield() are only constrained for pointers whose referenced type has 
that single block size.  

upc_addrfield() is a library function, not a built-in operator. Its actual 
argument is converted by C99 function calling rules to a (shared void *) before 
it ever reaches the function. Despite being implicit, this is an ACTUAL type 
conversion - (shared void *) is an incomplete type, but it is an actual static 
type, with NO associated blocking factor. We have some magic clauses in UPC to 
ensure the phase field is preserved when you cast back and forth to (shared 
void *), but otherwise the blocksize really and truly is gone from the type. A 
more explicit (but unnecessarily redundant) way to write the call in 6.4.2 
would be upc_addrfield((shared void *)S1), but whether you write it explicitly 
or not, that's what the call MEANS and the language requires that conversion to 
happen prior to function entry. The function receives an argument value whose 
referenced type is incomplete and has NO blocksize.

Saying the behavior of the function could change based on the static referent 
type of the pointer operand before argument conversion is like saying the 
behavior of C99 free() might change based on whether you pass it a (char *) or 
an (int *). In both cases this information was lost upon implicit conversion to 
(void *) and is simply is not available to the callee as far as the language 
spec is concerned. Therefore, specifying a behavioral property using a 
particular PTS blocking factor in 6.4.2 automatically constrains the behavior 
for any PTS blocking factor.

The fact that some hypothetical implementation might include static type 
information in the pointer representation does not change the fact that 
SEMANTICALLY that information was lost upon type conversion.

Original comment by danbonachea on 27 Feb 2013 at 3:54

GoogleCodeExporter commented 9 years ago
> Yes, this is the debugging motivation I was referring to in comment #1. 
However upc_addrfield is rather poorly designed to suit that purpose, so if 
that's the only usage case we can come up with to motivate the existence of 
this library function, I'd vote to deprecate it and replace it with a function 
that is better designed for debugging pointer values and formatting them for 
human consumption, akin to C99's printf("%p").

I agree that upc_addrfield() is poorly designed.  As far as I can tell, it 
can't be usefully used for anything but debugging.

> The fact that some hypothetical implementation might include static type 
information in the pointer representation does not change the fact that 
SEMANTICALLY that information was lost upon type conversion.

True, the type information was lost on conversion.  But that doesn't mean that 
two pointers-to-shared with different referenced types that point to the same 
object produce the same value when converted to a generic pointer-to-shared.  
The only constraint is that the resulting values shall compare equal.  As you 
noted, the phase may differ.  I argue that nothing prohibits the results of 
upc_addrfield() from differing as well.

As currently defined, upc_addrfield() is useless to UPC programers.  It's 
return value is implementation defined and practically unconstrained.  It's not 
guaranteed to produce the same result on all threads when passed the same 
pointer.  The single constraint on it is defined only for a single block size.  
So why is it in the spec?

Original comment by sdvor...@cray.com on 27 Feb 2013 at 4:18

GoogleCodeExporter commented 9 years ago
> So why is it in the spec?

It predates my involvement, but I suspect it was introduced for symmetry with 
upc_threadof() and upc_phaseof() to "expose" the third logical component of a 
PTS. The original UPC implementation on the T3e used a PTS representation where 
the addrfield was equal to an actual virtual address, so upc_addrfield was 
interchangeable with cast-to-local. Subsequent revisions of the language made 
the pointer representation more opaque to allow for alternate implementation 
strategies, and the function was never removed.

Perhaps Bill or Kathy can provide further insight..

> As currently defined, upc_addrfield() is useless to UPC programers.  It's 
return value is 
> implementation defined and practically unconstrained.  

I think we agree it's useless for production codes and a poorly-designed tool 
for debugging. So instead of quibbling about the degree to which the result is 
or should be implementation-defined, I propose we talk about deprecating it 
entirely and possibly replacing it with a better debugging aid.

As a starting point for discussion, here is the pseudo-spec for the BUPC 
vendor-specific replacement: 
(http://upc.lbl.gov/docs/user/index.shtml#bupc_dump_shared)

----------------------------------------------
The 'bupc_dump_shared' function
Shared pointers in UPC are logically composed of three fields: the address of 
the data that the shared pointer currently points to, the UPC thread on which 
that address is valid, and the 'phase' of the shared pointer (see the official 
UPC language specification for an explanation of shared pointer phase). Our 
version of UPC provides a 'bupc_dump_shared' function that will write a 
description of these fields into a character buffer that the user provides:

    int bupc_dump_shared(shared const void *ptr, char *buf, int maxlen);

Any pointer to a shared type may be passed to this function. The 'maxlen' 
parameter gives the length of the buffer pointed to by 'buf', and this length 
must be at least BUPC_DUMP_MIN_LENGTH, or else -1 is returned, and errno set to 
EINVAL. On success, the function returns 0, The buffer will contain either 
"<NULL>" if the pointer to shared == NULL, or a string of the form

    "<address=0x1234 (addrfield=0x1234), thread=4, phase=1>" 

The 'address' field provides the virtual address for the pointer, while the 
'addrfield' contains the actual contents of the shared pointer's address bits. 
On some configurations these values may be the same (if the full address of the 
pointer can be fit into the address bits), while on others they may be quite 
different (if the address bits store an offset from a base initial address that 
may differ from thread to thread).

Both bupc_dump_shared() and BUPC_DUMP_MIN_LENGTH are visible when any of the 
standard UPC headers (upc.h, upc_relaxed.h, or upc_strict.h) are #included. 

Original comment by danbonachea on 27 Feb 2013 at 4:47

GoogleCodeExporter commented 9 years ago
> I think we agree it's useless for production codes and a poorly-designed tool 
for debugging. So instead of quibbling about the degree to which the result is 
or should be implementation-defined, I propose we talk about deprecating it 
entirely and possibly replacing it with a better debugging aid.

Sounds reasonable.  As part of the deprecation, I'd recommend removing it from 
6.4.2 6.  I suspect it was originally put there to try to imply the contiguity 
property that is the subject of issue 106.  With the new language in that 
comment, I don't know that there's a need for it any more, given the lack of 
any other constraints on upc_addrfield().

Original comment by sdvor...@cray.com on 27 Feb 2013 at 4:54

GoogleCodeExporter commented 9 years ago
> So why is it in the spec?

upc_addrfield is useful primarily for debugging and writing test cases for 
pointer arithmetic.  The test case that Steve uploaded for the pointer to array 
of shared issue would have been very difficult to write without upc_addrfield, 
but as written the test may not be portable given the underspecification of 
upc_addrfield.

I'll leave it to UPC historians to explain exactly why upc_addrfield exists, 
but I suspect when it was introduced no one wanted the alternatives -- to add 
something to the UPC spec that required implementers to modify their libc 
printf to support a %P (or whatever letter) for a pointer-to-shared or to 
provide a upc_printf wrapper around printf.  I also suspect it felt odd to have 
upc_threadof and upc_phaseof but not have any mechanism to access the value of 
the remaining field; however the address is implementation-specific information 
whereas the thread or phase are not.

We (too) frequently have users attempt this code:

    shared int* ptr;
    ...
    printf( "%p", ptr );

which does not do what they expect (on our current systems, but used to work on 
older Crays).  The conforming alternative, given that upc_threadof and 
upc_phaseof return a size_t for which there is no format specifier, is the 
verbose:

    printf( "%lu %lu %p", (unsigned long)upc_threadof( ptr ), (unsigned long)upc_phaseof( ptr ),
            upc_addrfield( ptr ) );

I'm sure they'd much rather be able to do this:

    upc_printf( "%P", ptr );

and I think that would cover all user code that I've seen for upc_addrfield, 
but the output format would need to be implementation-defined and the printed 
address value would be just as unspecified as the current upc_addrfield return 
value.  I'm not sure it would be an improvement for implementers given that 
there would be a whole family of printf functions to wrap, but it may be an 
improvement for users.

Original comment by johnson....@gmail.com on 27 Feb 2013 at 5:00

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Responding to this item from comment #8:
> given that upc_threadof and upc_phaseof return a size_t for which there is no 
format specifier,

Sure there is: the "z" length modifier (e.g. "%zd", "%zu", "%zx", etc) which 
was added in C99.

And Yes, I agree that extending the printf formats to have one for PTS would be 
nice, but it may be hard for some vendors.  Berkeley, for instance, is at the 
mercy of somebody else's libc. Only when using glibc do we have 
register_printf_function().

Original comment by phhargr...@lbl.gov on 27 Feb 2013 at 6:54

GoogleCodeExporter commented 9 years ago
upc_addrfield was devised when the expectation was that a UPC implementation of 
a pointer-to-shared actually would be a structure with three fields for thread, 
phase, and addr (and possibly other fields).  Thus, while addr was assumed not 
to be an actual address, it was assumed to act like an address and, more 
important, be used in a manner similar to an address in the actual 
implementation.  The behavior of upc_addrfield was not so much a description of 
the values returned by a function, but rather a set of directives for how to 
implement pointers-to-shared and how to do pointer arithmetic, behavior that 
was verified by extracting the values out of the structure using the 
decomposition functions.

When the conception of a pointer-to-shared was broadened and the definitions of 
the pointer decomposition functions loosened to allow them to be calculated 
rather than extractions of actual portions of the representation of the pointer 
value, upc_addrfield lost its usefulness.

Original comment by brian.wibecan on 27 Feb 2013 at 7:03

GoogleCodeExporter commented 9 years ago
While I never tried it, I have always imagined that if I was going to hash a 
PTS value, that I would use upc_addrfield().  If we remove this function, what 
is the recommended way to generate hashes from a PTS?

We can't cast to int (or long, uintptr_t, etc) because we agreed in issue #11 
that this has implementation-defined behavior (might always be 0xdeadbeef), and 
casts to local pointers are only defined when performed by the thread with 
affinity.

So, unless somebody can point out an alternative way to get "bits" suitable for 
generating hash values from a PTS, I vote to KEEP upc_addrfield (adding any 
necessary constraints such as a value that is independent of calling thread).

Original comment by phhargr...@lbl.gov on 27 Feb 2013 at 7:04

GoogleCodeExporter commented 9 years ago
Ah, I missed the 'z' option in the man page.  Thanks for pointing that out.

Hashing without upc_addrfield is not a problem.  Many hash functions operate by 
churning through each byte of data in turn, so one could apply that idea to a 
PTS via:

    shared int* p;
    long my_hash( char* bytes, size_t nbytes );

    long hash_value = my_hash( (char*)&p, sizeof( p ) );

Original comment by johnson....@gmail.com on 27 Feb 2013 at 7:07

GoogleCodeExporter commented 9 years ago
"Hashing without upc_addrfield is not a problem.  Many hash functions operate 
by churning through each byte of data in turn, so one could apply that idea to 
a PTS"

That assumes that two different threads or two different contexts have the same 
data representation of a pointer value.  I don't believe that's required.  I 
think Paul is suggesting a hash mechanism that uses information guaranteed to 
be the same for the same pointer value in all contexts.

Original comment by brian.wibecan on 27 Feb 2013 at 7:26

GoogleCodeExporter commented 9 years ago
Troy wrote in comment #14:
> Hashing without upc_addrfield is not a problem.  Many hash functions operate 
by
> churning through each byte of data in turn[....]

We all agreed (issue #73) that phase is ignored in pointer comparison. So, 
hashing on the bytes that comprise the PTS representation might not yield the 
same hash for two pointers to the same object (which is one of the properties I 
would probably require).

ADDITIONALLY, if the PTS representation has any "padding" that could vary 
between two pointers to the same object this byte-wise hashing would still fail 
the uniqueness constraint.  I believe that possibility eliminates the 
possibility of using upc_resetphase() to get around the objection I have in the 
previous paragraph.  The end-user would have no mechanism to "mask" the padding 
since they (should) know nothing of the internal representation.

Original comment by phhargr...@lbl.gov on 27 Feb 2013 at 7:27

GoogleCodeExporter commented 9 years ago
While it is on my mind, here my thoughts on upc_addrfield() if it is to be 
useful for something like generating hashes.  I believe it also provides the 
properties desirable for debugging (no "false positives" from properly applied 
tests for equality) though that property is debatable (noted below)/

Properties I *do* expect:
+ Caller-independent
  For a given PTS p0, the value of upc_addrfield(p0) should not depend
  on the calling thread
+ Object-specific
  Given two PTS p1 and p2 that compare equal, their upc_addrfield() must
  also be equal.  Using "->" as "implies":
     (p1 == p2) -> (upc_addrfield(p1) == upc_addrfield(p2))
+ Byte-oriented within an object
  Given two PTS, p3 and p4, to portions of a SINGLE object with affinity
  to the SAME thread, the following is true on the thread with affinity:
    (upc_addrfield(p3) - upc_addrfield(p4)) == ((uint8_t*)(p3) - (uint8_t*)(p4))
  [except that size_t vs. ptrdiff_t might need a cast?]
+ Per-thread uniqueness (debatable: constrains per-thread heap to SIZE_MAX)
  Given PTS p5 and p6 with affinity to the SAME thread, the upc_addfield()
  values are equal ONLY IF p5 and p6 compare equal:
    (upc_addrfield(p5) == upc_addrfield(p6)) &&
            (upc_threadof(p5) == upc_threadof(p6))  ->  (p5 == p6)

These are "natural" for all reasonable PTS representations, including those 
that may encode "segment:offset" as bitfields in the addrfield (same object 
must have same segment, but distinct objects could have distinct segments) in 
which there is no single "flat" or "linear" space for the shared heap of a 
given thread.

Properties I do NOT expect:
+ Global uniqueness
  Equality of upc_addrfield() does NOT imply anything when their threads differ
+ Constraint on PTS to distinct objects
  Other than per-thread uniqueness, there is no constraint on the upc_addrfield()
  values for two PTS with the same (or different) affinity.  This is just the
  same as C pointer subtraction using pointers to different objects - undefined.
  This is a requirement to permit the "segment:offset" sort of encoding.

A note on the "debatable" point:
The "per-thread uniqueness" property is a desirable property for debugging, but 
I don't think it is essential to other applications such as hashing.  This 
property would not normally be a problem, except that upc_addrfield() returns 
size_t.  In the C99 spec it is clear that size_t is intended to be large enough 
to encode the size of a single OBJECT, but nowhere is it stated that the size 
of the entire "heap" must fit in size_t.  IN FACT, NEC (or is that Cray?) SX-6 
had 64-bit pointers but 32-bit size_t.  So, for the RARE platform that had a 
"narrow" size_t the per-thread uniqueness of upc_addrfield() would constrain 
the per-thread shared heap size to SIZE_MAX.  I really doubt that is a problem 
for anyone (and vaguely recall some other issue in the tracker dismissing such 
a concern), but wanted to note this for completeness.

Original comment by phhargr...@lbl.gov on 27 Feb 2013 at 10:21

GoogleCodeExporter commented 9 years ago
I am changing the description to reflect the current lack of consensus.

Until/unless somebody can offer a portable hash function on PTS (without use of 
addrfield) that has at least the first 2 properties in my previous comment 
(caller-independent and object-specific), I am going to remain in "strengthen" 
camp.

Original comment by phhargr...@lbl.gov on 27 Feb 2013 at 11:04

GoogleCodeExporter commented 9 years ago
For the hash function, could you cast the PTS to a shared [] char* and then 
subtract a NULL pointer from it to get an address value?  I have not tried this 
yet but just thought of it.

Original comment by johnson....@gmail.com on 28 Feb 2013 at 2:32

GoogleCodeExporter commented 9 years ago
Troy suggests:
> could you cast the PTS to a shared [] char* and then subtract
> a NULL pointer from it to get an address value? 

My initial response is that this suggestion runs afoul of the undefined nature 
of PTS subtraction when the pointers are not part of the same object.  However, 
it is the best alternative I've heard so far.

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 2:35

GoogleCodeExporter commented 9 years ago
> My initial response is that this suggestion runs afoul of the undefined
> nature of PTS subtraction when the pointers are not part of the same object.

Right, C pointer subtraction rules would make it undefined, so the UPC spec 
would need to define it if we wanted to make that strategy portable.  I tried 
it with Cray UPC and these two printfs produce the same output:

#include <stdio.h>
#include "upc.h"

shared int x;

int main( int argc, char* argv[] )
{
    shared int* p = &x;

    printf( "%zu\n", upc_addrfield( p ) );

    printf( "%tu\n", (shared [] char*)p - (shared [] char*)NULL );

    return 0;
}

If someone has an implementation for which these printfs would produce 
different output, I'd be interested in hearing why.

Original comment by johnson....@gmail.com on 28 Feb 2013 at 6:36

GoogleCodeExporter commented 9 years ago
>    printf( "%tu\n", (shared [] char*)p - (shared [] char*)NULL );
>
> If someone has an implementation for which these printfs would produce 
different 
> output, I'd be interested in hearing why.

In addition to violating C99's rule about pointer subtraction between pointers 
to separate objects, this also runs afoul of the UPC PTS subtraction semantics:

  the result is undefined unless there exists an integer x, representable
  as a ptrdiff t, such that (pts1 + x) == pts2 AND upc phaseof(pts1 +
  x) == upc phaseof(pts2). In this case (pts2 - pts1) evaluates to x.

When *p has affinity to a non-zero thread, there may not be a value of x that 
you can add to NULL to get p. An implementation would be well within its rights 
to detect this as a fatal error. Barring that, the result of the subtraction 
would depend on the implementation-specific details of PTS NULL and the pointer 
representation.

Original comment by danbonachea on 28 Feb 2013 at 8:47

GoogleCodeExporter commented 9 years ago
Dan, everything you wrote in Comment #22 is true, but I'm asking if any 
implementation actually does treat this as a fatal error, not if they could in 
theory.  Paul's looking for some way to write a hash function without 
upc_addrfield().  If we were to deprecate upc_addrfield() but define PTS - NULL 
for indefinite block size, that would leave a mechanism for writing his hash 
function.  If any current implementation treats PTS - NULL as a fatal error, 
then it's a bit harder for us to make a change that defines PTS - NULL.  I'm 
simply trying to gauge how such a change would affect current implementations.

Defining addition pointer subtraction semantics is not without precedent -- 
C++03 defines a null pointer minus a null pointer to be 0 for example, whereas 
C leaves it undefined.  UPC could similarly define PTS - NULL.  (We could also 
define a null PTS minus a null PTS to be 0 too, if we had reason.)

Original comment by johnson....@gmail.com on 28 Feb 2013 at 3:53

GoogleCodeExporter commented 9 years ago
> I'm asking if any implementation actually does treat this as a fatal error, 
not if they could in theory. 

BUPC DOES detect such a subtraction as a fatal error at runtime in debug mode, 
when the pointer argument has non-zero affinity. There are two possible errors 
based on how exactly the test is written and constant-folded in the frontend:

UPC Runtime error: Illegal: subtraction of indefinite pointers with affinity to 
different UPC threads (7 and 0)
 at: issue107e_obj.c:78

UPC Runtime error: Assertion failure at _upcr_pshared_to_local() at 
/usr/common/ftg/upc/2.16.0/bupc-2.16.0-icc-12.1.3/runtime/inst/dbg/include/upcr_
sptr.h:689: upcr_isnull_pshared(sptr) || upcr_threadof_pshared(sptr) == 
upcr_mythread()
 at: issue107d_obj.c:76

Paul's usage case of PTS hashing has changed my mind on this issue - I think 
generating an appropriate hash for a PTS is sufficiently problematic and 
dependent on implementation-details that it should probably be encapsulated as 
a library function to allow portable hashing. Strengthening upc_addrfield() to 
fill that need (as suggested in comment #17) seems reasonable and in keeping 
with the "spirit" of the original (ie "an implementation-defined value 
reflecting the “local address” of the object"), and also compatible with 
the other known usage case of printing PTS values for debugging.

Original comment by danbonachea on 28 Feb 2013 at 4:34

GoogleCodeExporter commented 9 years ago
Thanks Dan, the BUPC example answers my question.

A library function for generating a PTS hash may be problematic for two 
reasons.  Such a function would need to accept a shared void*, so if one wanted 
to hash based on all three parts of a PTS, not just the thread and addrfield, 
it's not possible because the phase is thrown away.  Second, and probably more 
importantly, hash functions are chosen carefully for performance reasons 
depending on what a specific application needs.  Trying to provide a 
one-size-fits-all hash function for everyone via a upc_hash( shared void* ) 
function may be pointless because everyone will ignore it and write their own.

Strengthening upc_addrfield() enough to allow people to write their own hash 
function sounds better to me.

Original comment by johnson....@gmail.com on 28 Feb 2013 at 5:14

GoogleCodeExporter commented 9 years ago
Ignore my first reason in Comment #25.  Steve corrected me that shared void* 
preserves phase; I had forgotten that.

Original comment by johnson....@gmail.com on 28 Feb 2013 at 5:31

GoogleCodeExporter commented 9 years ago
>  Trying to provide a one-size-fits-all hash function for everyone via a 
upc_hash( 
> shared void* ) function may be pointless because everyone will ignore it and 
write 
> their own.

Apologies, my comment #24 was a bit unclear. What I meant was:
 "Generating an appropriate (value to be used in computing) a hash for a PTS is sufficiently problematic and dependent on implementation-details that it should probably be encapsulated as a library function to allow portable hashing."

I think Paul was not suggesting to write the actual hashing function (ie the 
one that maps a data value to a bucket) but merely the utility that provides an 
appropriately unique integral value representing the object address, with the 
guarantees he suggests. There is currently no portable way to get such a value 
from a PTS, which implies there is no portable way to hash a PTS. The user is 
the one who hashes the return value to compute an actual bucket, and can choose 
to fold in additional information such as the result of 
upc_phaseof()/upc_threadof() as seems fit for his application.

> Strengthening upc_addrfield() enough to allow people to write their own hash 
function sounds better to me.

I'm pretty sure that's what Paul is proposing in comment #17.

Original comment by danbonachea on 28 Feb 2013 at 5:40

GoogleCodeExporter commented 9 years ago
>> Strengthening upc_addrfield() enough to allow people to write their
>> own hash function sounds better to me.
> 
> I'm pretty sure that's what Paul is proposing in comment #17.

Exactly. I don't want a upc_hash_pts(), I just want upc_addrfield() to become 
well-defined enough that one can do something like the following:

  h = my_hash(upc_addrfield(p)) // good if p has known affinity
  h = my_hash(upc_addrfield(p) ^ upc_threadof(p))  // better when affinity may vary

Additionally, upc_addrfield() would retain its usefulness for debugging (and 
writing of spec conformance tests).

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 6:37

GoogleCodeExporter commented 9 years ago
I propose the following change.  I believe it strengthens upc_addrfield() 
sufficiently to satisfy all of Paul's concerns.

Index: lang/upc-lib-core.tex
===================================================================
--- lang/upc-lib-core.tex       (revision 205)
+++ lang/upc-lib-core.tex       (working copy)
@@ -302,11 +302,27 @@

 \np The {\tt upc\_addrfield} function returns an
    implementation-defined value reflecting the ``local address'' of the
-   object pointed to by the pointer-to-shared argument.\footnote{%
-   This function is used in defining the semantics of pointer-to-shared
-   arithmetic in Section \ref{pointer-arithmetic}}
+   object pointed to by the pointer-to-shared argument.
+
+\np \xadded[id=SV]{107}{The return value of {\tt upc\_addrfield} shall be
+   independent of the calling thread.}

-
+\np \xadded[id=SV]{107}{If two pointers-to-shared compare equal, then they
+   shall produce the same value when passed to {\tt upc\_addrfield}.  If two
+   pointers-to-shared do not compare equal, they may still produce the same
+   value when passed to {\tt upc\_addrfield}, but only if they point to shared
+   objects with affinity to different threads.}
+
+\np \xadded[id=SV]{107}{If two pointers-to-shared {\tt S1} and {\tt S2} point
+   to distinct bytes of the same shared object which have affinity to the same
+   thread, then the expression\\*
+   {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t) upc\_addrfield(S1))}\\*
+   shall evaluate on all threads to the same value that\\*
+   {\tt ((char *)S2 - (char *)S1)}\\*
+   evalutes to on the thread with which the pointed-to bytes have affinity.}
+
+\xchangenote[id=SV]{107}{Constraints moved from 6.4.2 and cross-reference 
footnote removed.}
+
 \paragraph{The {\tt upc\_affinitysize} function}

 {\bf Synopsis}
Index: lang/upc-language.tex
===================================================================
--- lang/upc-language.tex       (revision 205)
+++ lang/upc-language.tex       (working copy)
@@ -303,9 +303,8 @@
 \begin{itemize}
 \item S1 and P1 shall point to the same object.
 \item S2 and P2 shall point to the same object.
-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt 
(ptrdiff\_t) upc\_addrfield(S1))} shall
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).
 \end{itemize}
+\xchangenote[id=SV]{107}{Constraint on {\tt upc\_addrfield} strengthened and 
moved to Section~\ref{upc_addrfield}.}

 \np Two compatible pointers-to-shared which point to the same
     object (i.e. having the same address and thread components) shall

Original comment by sdvor...@cray.com on 28 Feb 2013 at 7:25

GoogleCodeExporter commented 9 years ago
Comment on Paul's list in #17:

> Properties I do NOT expect:
> + Global uniqueness
>   Equality of upc_addrfield() does NOT imply anything when their threads 
differ

Huh.  I would have sworn it was required that, for a declaration like

  shared T A[THREADS];

that upc_addrfield(&A[i]) would yield the same value for all 0 <= i < THREADS.  
But no, it's not there.  That makes upc_addrfield even less useful.  It was 
certainly the original expectation.

Original comment by brian.wibecan on 28 Feb 2013 at 7:39

GoogleCodeExporter commented 9 years ago
Brian,

While your stated/desired property in comment #30 is true for MANY 
implementations of statically declared variables it is STILL incorrect to infer 
that (upc_addrfield(p) == upc_addrfield(q)) implies that p and q point to the 
same array.  For instance, if p and q were dynamically allocated by 
upc_alloc().  Thus even if your stated property was true, my statement about no 
"global uniqueness" would remain true.

Currently, nothing constrains implementations to have a "symmetric heap" 
property for static allocations.  This property WAS the case for the T3E, and 
is likely to still be true of most implementations because it is just simpler 
to deal with.  George or Yili may correct me if I have this wrong, but I 
believe the BG/L implementation of UPC lacked a symmetric heap and communicated 
"on the wire" in terms of some sort of GUID (or handle) and (since that was not 
a RDMA network) the code at the target node would perform 
translation/table-lookup, performing lazy allocation if the handle had not been 
seen before.

Note that an implementation could have a symmetric heap and still not satisfy 
your expectation if the heaps are offset across nodes and upc_addrfield() 
returns the full virtual address (though that could be "fixed" by returning the 
offset-within-heap).

So, YES, upc_addrfield() is less useful than some may have thought.

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 7:59

GoogleCodeExporter commented 9 years ago
Brian, the requirement that upc_addrfield( &A[i] ) yield the same value for all 
threads for shared T A[THREADS] would be too strong.  If you have UPC purely 
inside a shared memory environment, the entire array could be laid out in the 
same virtual memory space such that every A[i] is at a different address and 
the implementation could be using those locations for the result of 
upc_addrfield().  There are also non shared memory systems unfortunate enough 
to have had address layout randomization enabled such that true symmetric 
addressing is made quite painful.

Original comment by johnson....@gmail.com on 28 Feb 2013 at 8:04

GoogleCodeExporter commented 9 years ago
I endorse the intent and scope of Steve's proposed change in comment #29, but 
anticipate we'll want to refine the wording slightly.  I think Dan is best 
qualified to take a shot at that.

Since Steve mostly reworded my "prose", I am "too close" to the text to be an 
impartial editor.  However, I'll make a few comments:

+ The constraint beginning "If two pointers-to-shared compare equal" should 
probably split in two, making one point at a time to make each one stronger.

+ Assuming the split suggested in my previous bullet, the "If two 
pointers-to-shared do not compare equal" half probably needs rewording - it 
just feels "clumsy" to me as written now.  I would avoid phrases like "but only 
if" which can be hard to understand, when we have the formal "implies" at our 
disposal.  The following is a "forward" version:
   ((p != q) && (upc_addrfield(p) == upc_addrfield(q)))
                         -> (upc_threadof(p) != upc_threadof(q))    

+ "distinct bytes" doesn't seem necessary in the last constraint, as the 
expressions both evaluate to zero if the PTS reference the same byte.

I am "neutral" on the question of 1.3 vs. 1.4 except the the extent that I 
think resolving issues 106 and 107 in the SAME spec seems likely to produce the 
best results.

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 8:19

GoogleCodeExporter commented 9 years ago
I just noticed the change to 6.4.2 made by the diff in comment #29.
Such a change leaves constraint paragraphs 4 and 5 of 6.4.2 essentially 
pointless.
They become all "setup" with no "punchline", because the remaining statements 
that taken pairwise (P1,S1) and (P2,S2) address the same objects is not of any 
consequence in defining address arithmetic.

Steve, is it correct to assume you "yield" to Dan's change at the same location 
made by comment #61 of issue 106 (below)?

-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt 
(ptrdiff\_t) upc\_addrfield(S1))} shall 
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).  
+\xadded[id=DB]{106}{
+\item The expression {\tt P1 + (S2 - S1) == P2} shall evaluate to 1.%
+\truefootnote{This implies there is no padding inserted between blocks of 
shared array elements with affinity to a thread.}
+}

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 8:39

GoogleCodeExporter commented 9 years ago
> Steve, is it correct to assume you "yield" to Dan's change at the same 
location made by comment #61 of issue 106 (below)?

Yes.  I actually like Dan's change--I just think that it is incomplete without 
also fixing upc_addrfield().

An updated proposal taking into account Paul's comments in comment 33.

Index: lang/upc-lib-core.tex
===================================================================
--- lang/upc-lib-core.tex       (revision 205)
+++ lang/upc-lib-core.tex       (working copy)
@@ -302,11 +302,30 @@

 \np The {\tt upc\_addrfield} function returns an
    implementation-defined value reflecting the ``local address'' of the
-   object pointed to by the pointer-to-shared argument.\footnote{%
-   This function is used in defining the semantics of pointer-to-shared
-   arithmetic in Section \ref{pointer-arithmetic}}
+   object pointed to by the pointer-to-shared argument.
+
+\np \xadded[id=SV]{107}{The return value of {\tt upc\_addrfield} shall be
+   independent of the calling thread.}

-
+\np \xadded[id=SV]{107}{If two pointers-to-shared point to bytes with affinity
+   to the same thread, then the values resulting from passing them to
+   {\tt upc\_addrfield} shall compare equal if and only if the pointers
+   themselves compare equal.}
+
+\np \xadded[id=SV]{107}{If two pointers-to-shared {\tt S1} and {\tt S2} point
+   to bytes with affinity to the same thread that are part of a single shared
+   object, then the expression\\*
+   {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t) upc\_addrfield(S1))}\\*
+   shall evaluate on all threads to the same value that\\*
+   {\tt ((char *)S2 - (char *)S1)}\\*
+   evalutes to on the thread with which the pointed-to bytes have affinity.}
+
+\np \xadded[id=SV]{107}{If two pointers-to-shared point to bytes with affinity
+   to different threads, then the result of comparing the values resulting
+   from passing them to {\tt upc\_addrfield} is undefined.}
+
+\xchangenote[id=SV]{107}{Constraints moved from 6.4.2 and cross-reference 
footnote removed.}
+
 \paragraph{The {\tt upc\_affinitysize} function}

 {\bf Synopsis}
Index: lang/upc-language.tex
===================================================================
--- lang/upc-language.tex       (revision 205)
+++ lang/upc-language.tex       (working copy)
@@ -303,9 +303,8 @@
 \begin{itemize}
 \item S1 and P1 shall point to the same object.
 \item S2 and P2 shall point to the same object.
-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt 
(ptrdiff\_t) upc\_addrfield(S1))} shall
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).
 \end{itemize}
+\xchangenote[id=SV]{107}{Constraint on {\tt upc\_addrfield} strengthened and 
moved to Section~\ref{upc_addrfield}.}

 \np Two compatible pointers-to-shared which point to the same
     object (i.e. having the same address and thread components) shall

Original comment by sdvor...@cray.com on 28 Feb 2013 at 8:44

GoogleCodeExporter commented 9 years ago
+\np \xadded[id=SV]{107}{If two pointers-to-shared compare equal, then they
+   shall produce the same value when passed to {\tt upc\_addrfield}.  If two
+   pointers-to-shared do not compare equal, they may still produce the same
+   value when passed to {\tt upc\_addrfield}, but only if they point to shared
+   objects with affinity to different threads.}

The last clause of this paragraph specifies the "debatable" property from 
Paul's comment #17. I find that constraint unacceptable for the same reasons 
Paul listed - specifically, C99 does not guarantee size_t is large enough to 
contain a local address value (nor is it ever used for that purpose in C99 - 
size_t is for object sizes, (u)intptr_t are the integral types for containing 
address values). The function return type of size_t is not sufficient to 
guarantee every pointer-to-shared can be uniquely represented on every system, 
even when only considering pointers to objects with affinity a single thread. 
Consequently such a library specification could undesirably constrain the 
available shared memory on some systems.

As Paul mentioned, the hashing usage case does not require this property. 
However if we want the property for other usage cases, then I think we must 
change the return type of upc_addrfield to ensure the result type is "wide 
enough" to uniquely represent any local pointer. Note that ptrdiff_t is also 
not guaranteed to be wide enough for this purpose (in fact C99 allows it to be 
smaller than size_t!). The most natural choice for a "wide-enough" return type 
would be uintptr_t or intptr_t - unfortunately these are designated as an 
OPTIONAL type in C99 (and also C11). To my knowledge these types are available 
on all systems of interest, as the architectural hardware to support them is 
usually the same as that used to support pointer arithmetic. UPC already 
requires a great many additional types and features relative to C99, so making 
that optional type mandatory for a compliant UPC implementation does not seem 
overly burdensome to implementers. A different concern is the effect on 
existing clients of upc_addrfield of changing the return value - although if we 
truly believe that's currently a small set then perhaps that minor break in 
backwards compatibility is acceptable.

Of course the other option to changing the return type is to weaken the 
constraint and allow "false duplicates" in the return value.

Original comment by danbonachea on 28 Feb 2013 at 8:46

GoogleCodeExporter commented 9 years ago
Our comments passed in the ether. My comment #36 applies equally to the new 
proposed paragraph:

+\np \xadded[id=SV]{107}{If two pointers-to-shared point to bytes with affinity
+   to the same thread, then the values resulting from passing them to
+   {\tt upc\_addrfield} shall compare equal if and only if the pointers
+   themselves compare equal.}

Original comment by danbonachea on 28 Feb 2013 at 8:49

GoogleCodeExporter commented 9 years ago
I also object to this proposed text:

+\np \xadded[id=SV]{107}{If two pointers-to-shared point to bytes with affinity
+   to different threads, then the result of comparing the values resulting
+   from passing them to {\tt upc\_addrfield} is undefined.}

The return value of the function is already "implementation-defined" by default 
in paragraph 1. Subsequent paragraphs in the proposal further constrain this 
value. But the constraint above essentially implies that a comparison operator 
using two values returned by the function can have undefined behavior (ie crash 
the program, launch the nukes), which is way too strong. I believe it's 
sufficient to delete this paragraph and leave the "implementation-defined" 
default from paragraph 1 in force for that case. If we feel it's necessary to 
amplify this point, it should be reformulated as "unspecified behavior", and 
written in a generic way that is not specific to a particular arithmetic 
operator.

Original comment by danbonachea on 28 Feb 2013 at 9:00

GoogleCodeExporter commented 9 years ago
How about the following, borrowing from the C99 definition of [u]intptr_t:

Index: lang/upc-lib-core.tex
===================================================================
--- lang/upc-lib-core.tex       (revision 205)
+++ lang/upc-lib-core.tex       (working copy)
@@ -302,11 +302,37 @@

 \np The {\tt upc\_addrfield} function returns an
    implementation-defined value reflecting the ``local address'' of the
-   object pointed to by the pointer-to-shared argument.\footnote{%
-   This function is used in defining the semantics of pointer-to-shared
-   arithmetic in Section \ref{pointer-arithmetic}}
+   object pointed to by the pointer-to-shared argument.
+
+\np \xadded[id=SV]{107}{The return value of {\tt upc\_addrfield} shall be
+   independent of the calling thread.}

-
+\np \xadded[id=SV]{107}{If two pointers-to-shared point to bytes with affinity
+   to the same thread, then the values resulting from passing them to
+   {\tt upc\_addrfield} shall compare equal if the pointers themselves compare
+   equal.}
+
+\np \xadded[id=SV]{107}{If {\tt size\_t} has the property that any valid
+   pointer to {\tt void} can be converted to it, then converted back and
+   compare equal to the original pointer, then the results of passing
+   two pointers-to-shared that point to bytes with affinity to the same thread
+   to {\tt upc\_addrfield} shall compare equal only if the pointers themselves
+   compare equal.}
+
+\np \xadded[id=SV]{107}{If two pointers-to-shared {\tt S1} and {\tt S2} point
+   to bytes with affinity to the same thread that are part of a single shared
+   object, then the expression\\*
+   {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t) upc\_addrfield(S1))}\\*
+   shall evaluate on all threads to the same value that\\*
+   {\tt ((char *)S2 - (char *)S1)}\\*
+   evalutes to on the thread with which the pointed-to bytes have affinity.}
+
+\np \xadded[id=SV]{107}{If two pointers-to-shared point to bytes with affinity
+   to different threads, then the result of comparing the values resulting
+   from passing them to {\tt upc\_addrfield} is undefined.}
+
+\xchangenote[id=SV]{107}{Constraints moved from 6.4.2 and cross-reference 
footnote removed.}
+
 \paragraph{The {\tt upc\_affinitysize} function}

 {\bf Synopsis}
Index: lang/upc-language.tex
===================================================================
--- lang/upc-language.tex       (revision 205)
+++ lang/upc-language.tex       (working copy)
@@ -303,9 +303,8 @@
 \begin{itemize}
 \item S1 and P1 shall point to the same object.
 \item S2 and P2 shall point to the same object.
-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt 
(ptrdiff\_t) upc\_addrfield(S1))} shall
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).
 \end{itemize}
+\xchangenote[id=SV]{107}{Constraint on {\tt upc\_addrfield} strengthened and 
moved to Section~\ref{upc_addrfield}.}

 \np Two compatible pointers-to-shared which point to the same
     object (i.e. having the same address and thread components) shall

Original comment by sdvor...@cray.com on 28 Feb 2013 at 9:03

GoogleCodeExporter commented 9 years ago
Re: Dan's argument against per-thread uniqueness.

Since Steve's agreement to the current proposal for issue #106 is conditional 
on resolving this issue (#107), I'd like to propose a two-stage approach:

1) To publish 1.3 without a lengthy debate on the "per-thread uniqueness" 
property, I suggest that Steve adjust the proposed text to EXCLUDE this 
property.

2) That for 1.4 we engage in the debate that will include the possibility of 
widening the return type to allow this property to be satisfied (FWIW, I don't 
see how one could implement a UPC runtime without having a [u]intptr_t in the 
base C compiler).

Unless I am missing something, adding this constraint in 1.4 cannot break 
anything EXCEPT issues related (directly or indirectly) to the return type.  If 
that is incorrect, then we'll have to hash this out sooner rather than later.

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 9:04

GoogleCodeExporter commented 9 years ago
> The return value of the function is already "implementation-defined" by 
default in paragraph 1. Subsequent paragraphs in the proposal further constrain 
this value. But the constraint above essentially implies that a comparison 
operator using two values returned by the function can have undefined behavior 
(ie crash the program, launch the nukes), which is way too strong. I believe 
it's sufficient to delete this paragraph and leave the "implementation-defined" 
default from paragraph 1 in force for that case. If we feel it's necessary to 
amplify this point, it should be reformulated as "unspecified behavior", and 
written in a generic way that is not specific to a particular arithmetic 
operator.

This is true.  I'll remove it in the next diff.

Original comment by sdvor...@cray.com on 28 Feb 2013 at 9:07

GoogleCodeExporter commented 9 years ago
Removing the paragraph about undefined results:

Index: lang/upc-lib-core.tex
===================================================================
--- lang/upc-lib-core.tex       (revision 205)
+++ lang/upc-lib-core.tex       (working copy)
@@ -302,11 +302,33 @@

 \np The {\tt upc\_addrfield} function returns an
    implementation-defined value reflecting the ``local address'' of the
-   object pointed to by the pointer-to-shared argument.\footnote{%
-   This function is used in defining the semantics of pointer-to-shared
-   arithmetic in Section \ref{pointer-arithmetic}}
+   object pointed to by the pointer-to-shared argument.
+
+\np \xadded[id=SV]{107}{The return value of {\tt upc\_addrfield} shall be
+   independent of the calling thread.}

-
+\np \xadded[id=SV]{107}{If two pointers-to-shared point to bytes with affinity
+   to the same thread, then the values resulting from passing them to
+   {\tt upc\_addrfield} shall compare equal if the pointers themselves compare
+   equal.}
+
+\np \xadded[id=SV]{107}{If {\tt size\_t} has the property that any valid
+   pointer to {\tt void} can be converted to it, then converted back and
+   compare equal to the original pointer, then the results of passing
+   two pointers-to-shared that point to bytes with affinity to the same thread
+   to {\tt upc\_addrfield} shall compare equal only if the pointers themselves
+   compare equal.}
+
+\np \xadded[id=SV]{107}{If two pointers-to-shared {\tt S1} and {\tt S2} point
+   to bytes with affinity to the same thread that are part of a single shared
+   object, then the expression\\*
+   {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t) upc\_addrfield(S1))}\\*
+   shall evaluate on all threads to the same value that\\*
+   {\tt ((char *)S2 - (char *)S1)}\\*
+   evalutes to on the thread with which the pointed-to bytes have affinity.}
+
+\xchangenote[id=SV]{107}{Constraints moved from 6.4.2 and cross-reference 
footnote removed.}
+
 \paragraph{The {\tt upc\_affinitysize} function}

 {\bf Synopsis}
Index: lang/upc-language.tex
===================================================================
--- lang/upc-language.tex       (revision 205)
+++ lang/upc-language.tex       (working copy)
@@ -303,9 +303,8 @@
 \begin{itemize}
 \item S1 and P1 shall point to the same object.
 \item S2 and P2 shall point to the same object.
-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt 
(ptrdiff\_t) upc\_addrfield(S1))} shall
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).
 \end{itemize}
+\xchangenote[id=SV]{107}{Constraint on {\tt upc\_addrfield} strengthened and 
moved to Section~\ref{upc_addrfield}.}

 \np Two compatible pointers-to-shared which point to the same
     object (i.e. having the same address and thread components) shall

Original comment by sdvor...@cray.com on 28 Feb 2013 at 9:08

GoogleCodeExporter commented 9 years ago
Regarding the following from comment #39
> If two pointers-to-shared point to bytes with affinity
> to the same thread, then the values resulting from passing them to
> {\tt upc\_addrfield} shall compare equal if the pointers themselves
> compare equal.

PTS equality IMPLIES affinity to the same thread, making the first clause 
redundant.
How about:

If two pointers-to-shared compare equal, then the values resulting from passing 
them to {\tt upc\_addrfield} shall compare equal.

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 9:10

GoogleCodeExporter commented 9 years ago
> Our comments passed in the ether. My comment #36 applies equally to the new 
proposed paragraph
> ...

Does the new language address these concerns?  I think it still provides the 
uniqueness guarantee for all platforms where size_t is "wide enough", while 
permitting implementations on platforms where that is not true.

Original comment by sdvor...@cray.com on 28 Feb 2013 at 9:13

GoogleCodeExporter commented 9 years ago
Updated with Paul's comment from comment 43.

Index: lang/upc-lib-core.tex
===================================================================
--- lang/upc-lib-core.tex       (revision 205)
+++ lang/upc-lib-core.tex       (working copy)
@@ -302,11 +302,32 @@

 \np The {\tt upc\_addrfield} function returns an
    implementation-defined value reflecting the ``local address'' of the
-   object pointed to by the pointer-to-shared argument.\footnote{%
-   This function is used in defining the semantics of pointer-to-shared
-   arithmetic in Section \ref{pointer-arithmetic}}
+   object pointed to by the pointer-to-shared argument.
+
+\np \xadded[id=SV]{107}{The return value of {\tt upc\_addrfield} shall be
+   independent of the calling thread.}

-
+\np \xadded[id=SV]{107}{If two pointers-to-shared compare equal, then
+   the values resulting from passing them to {\tt upc\_addrfield} shall
+   compare equal.}
+
+\np \xadded[id=SV]{107}{If {\tt size\_t} has the property that any valid
+   pointer to {\tt void} can be converted to it, then converted back and
+   compare equal to the original pointer, then the results of passing
+   two pointers-to-shared that point to bytes with affinity to the same thread
+   to {\tt upc\_addrfield} shall compare equal only if the pointers themselves
+   compare equal.}
+
+\np \xadded[id=SV]{107}{If two pointers-to-shared {\tt S1} and {\tt S2} point
+   to bytes with affinity to the same thread that are part of a single shared
+   object, then the expression\\*
+   {\tt ((ptrdiff\_t) upc\_addrfield(S2) - (ptrdiff\_t) upc\_addrfield(S1))}\\*
+   shall evaluate on all threads to the same value that\\*
+   {\tt ((char *)S2 - (char *)S1)}\\*
+   evalutes to on the thread with which the pointed-to bytes have affinity.}
+
+\xchangenote[id=SV]{107}{Constraints moved from 6.4.2 and cross-reference 
footnote removed.}
+
 \paragraph{The {\tt upc\_affinitysize} function}

 {\bf Synopsis}
Index: lang/upc-language.tex
===================================================================
--- lang/upc-language.tex       (revision 205)
+++ lang/upc-language.tex       (working copy)
@@ -303,9 +303,8 @@
 \begin{itemize}
 \item S1 and P1 shall point to the same object.
 \item S2 and P2 shall point to the same object.
-\item The expression (({\tt (ptrdiff\_t) upc\_addrfield} (S2) -  {\tt 
(ptrdiff\_t) upc\_addrfield(S1))} shall
-   evaluate to the same value as ((P2 - P1) * sizeof(T)).
 \end{itemize}
+\xchangenote[id=SV]{107}{Constraint on {\tt upc\_addrfield} strengthened and 
moved to Section~\ref{upc_addrfield}.}

 \np Two compatible pointers-to-shared which point to the same
     object (i.e. having the same address and thread components) shall

Original comment by sdvor...@cray.com on 28 Feb 2013 at 9:15

GoogleCodeExporter commented 9 years ago

Original comment by sdvor...@cray.com on 28 Feb 2013 at 9:18

GoogleCodeExporter commented 9 years ago
More crossing-in-the-ether.

My comment #40 requests deferral of the debate over per-thread uniqueness.
HOWEVER, I see that Steve's comment #39 makes the property conditional on the 
width of size_t.  I find that the be an ideal response to the objection that 
Dan raised (that I had already alluded to in comment #17).

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 9:23

GoogleCodeExporter commented 9 years ago
Another issue not yet mentioned that we need to be careful with here is what we 
guarantee about PTS arguments to upc_addrfield that reference objects with a 
limited lifetime (dynamically allocated shared data), and what we guarantee 
about the return value. I think there are two parts to this:

1) Add some requirement that the argument to upc_addrfield points to a (valid) 
shared object during the call. Paragraph 1 weakly implies this, but if we're 
strengthening constraints then I believe it should be directly stated. 
Something like "If ptr does not point to a shared object, behavior is 
undefined."

2) Be careful not to imply that upc_addrfield values "outlive" the object used 
to generate them. The current proposal text makes extensive use of the term 
"byte", but note that a "byte" may correspond to unallocated space containing 
no valid objects. For example if you did something like:

shared void *p1 = upc_alloc(4096);
size_t r1 = upc_addrfield(p1);
upc_free(p1);
shared void *p2 = upc_alloc(4096);
size_t r2 = upc_addrfield(p2);

It may happen to be the case that after this code (p1 == p2), but as they 
reference different objects I'm don't think its a good idea to require (r1 == 
r2). Conversely, if it happens to be the case that (p1 != p2), it seems like a 
bad idea to prohibit (r1 != r2). 

Original comment by danbonachea on 28 Feb 2013 at 9:26

GoogleCodeExporter commented 9 years ago
/s/prohibit (r1 != r2)/require (r1 != r2)/

Original comment by danbonachea on 28 Feb 2013 at 9:27

GoogleCodeExporter commented 9 years ago
Getting down to the nitty-gritty now:

Re: "If two pointers-to-shared {\tt S1} and {\tt S2} point to bytes with 
affinity to the same thread that are part of a single shared object"

I don't like "point to bytes", because in mind S1 and S2 don't "point to bytes" 
unless their type is a pointer to char or array of char (where I mean "char" in 
all its signed and unsigned variations including typedefs like uint8_t).

Alas, I don't have a good alternative suggestion off the top of my head.
I'll look at C99 to see if I can find an analogous passage for guidance.
Perhaps I'll find that "bytes" is the right term after all.

Original comment by phhargr...@lbl.gov on 28 Feb 2013 at 9:31