Intrepid / upc-specification

Automatically exported from code.google.com/p/upc-specification

Library: collectives 2.0 #42

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
This is to log the UPC collectives 2.0 extensions.

For reference, the UPC Collectives 2.0 proposal can be found here:
http://upc.lbl.gov/publications/UPC-Collectives-PGAS11.pdf

Original issue reported on code.google.com by yzh...@lbl.gov on 22 May 2012 at 11:45

GoogleCodeExporter commented 9 years ago
Feedback from "users" to be resolved:

    Some initial comments and questions regarding "Collectives 2.0":

    Things that are good:
      --Teams
      --The greater variety of collectives
      --Support for asynchronous calls

    On the other hand, it is very complicated, the function calls have too many parameters, and it isn't integrated well into the rest of UPC.

    Some specific comments:

    * It invents its own handles and synchronization functions for asynchronous operations, rather than using the ones from the existing "non blocking" proposals.

    * I understand why they would want to have initialization and finalization functions while prototyping their implementation, but I don't think they belong in the actual language. (In particular, such functions make it much more difficult to write library functions using these collectives.)

    * They start by condemning the synchronization flags, but then they keep most of them.  May I suggest getting rid of the flags altogether (aside from a single bit indicating whether the operation is asynchronous, which could be passed another way)?  Anyone who wants ALL_SYNC behavior can just put in explicit barriers.

    * Allowing every thread to specify its own buffer independently could be useful, but I'm concerned about what constraints it might place on the implementation.

    * The functions themselves have far too many arguments (many of which seem to be redundant). For example, broadcast requires 10(!) arguments.

    * Some specific questions:
          +Why do all of the functions need type variables?  For something like broadcast, do we care about anything but the size of the data being broadcast?
          +Why does broadcast (for example) have type and size values for both sending and receiving?  What is it supposed to mean if these aren't equal?

    * They put the source parameter before the destination parameter, in defiance of the standard convention in both C and UPC.  Isn't this confusing enough without being inconsistent, too?

    * The list of official type constants seems too long.  What is the difference between BYTE and UCHAR in terms of how the collectives behave?  And when would someone actually use esoteric types such as UPC_FLOAT_INT (given that it apparently has sizeof(float)+sizeof(int) rather than sizeof(struct { float x; int y; }))?

    * By their own admission, this rigid type-based design means that their collective functions can't be used with user-defined structs.  Since this is the main way we use things like broadcast, this is a serious limitation.

Original comment by yzh...@lbl.gov on 14 Jun 2012 at 6:04

GoogleCodeExporter commented 9 years ago
To pick up just one issue from the many valid questions raised: we discussed 
the issue of handles and so on. I would have liked to use upc_fence to 
guarantee completion of a non-blocking collective. Problem is, this requires 
changes in the "existing" compiler, rather than just adding a library. We 
considered that a no-go at the time.

OTOH if there are other examples where handles are used by other libraries, 
maybe we can put a *little bit* of pressure on the language specification to 
accept handles and guarantee completion based on upc_fence?

Original comment by ga10...@gmail.com on 15 Jun 2012 at 2:50

GoogleCodeExporter commented 9 years ago
Having upc_fence guarantee completion of a non-blocking operation is already 
part of the Cray proposal for non-blocking memcpy functions.  Yes, it may 
require changes to existing fence implementations, but that is necessary for 
the fence to remain a true fence in the presence of these new non-blocking 
alternatives.

Original comment by johnson....@gmail.com on 15 Jun 2012 at 2:59

GoogleCodeExporter commented 9 years ago
I have "conceded" the fence-syncs-nb-memcpy argument in my latest 
counter-proposal offered in issue #41, but have preserved the explicit handles 
and the requirement to "sync" them.  I think if we can come to an agreement on 
that proposal (which seems closer to resolution than Collectives-2.0), then we 
should attempt to do something as analogous as possible here.

Original comment by phhargr...@lbl.gov on 16 Jun 2012 at 12:35

GoogleCodeExporter commented 9 years ago

Original comment by gary.funck on 3 Jul 2012 at 6:07

GoogleCodeExporter commented 9 years ago

Original comment by gary.funck on 3 Jul 2012 at 6:09

GoogleCodeExporter commented 9 years ago
All "brand new" library proposals are targeted for starting in the "Optional" 
library document. Promotion to the "Required" document comes later, after at 
least 6 months' residence in the ratified Optional document and once the other 
conditions described in the Appendix A spec process are met.

Original comment by danbonachea on 17 Aug 2012 at 5:53