Open GoogleCodeExporter opened 9 years ago
Tagging Usability and Performance. Will defer to Owner on Milestone version,
but suggest 2.0.
Cray UPC handles this issue by having upc_fence poll and explaining to the user
that they may want to fence periodically during a long-running computation if
they are doing things that aren't truly one-sided in our implementation (e.g.,
upc_free() of data with affinity to a different PE).
Original comment by johnson....@gmail.com
on 15 Jun 2012 at 6:13
IBM and Berkeley[1] say: spin on upc_poll()
Cray says: spin on upc_fence()
In the Berkeley case, upc_fence() would also work, but we provide upc_poll() to
"make progress" without also having the fencing property (becomes a no-op in
the "pure pthreads" case where there is no network to poll).
So, I am in favor of DISCUSSING whether we want upc_poll() in the language spec.
There should be no problem with
#define upc_poll upc_fence
as a trivially correct implementation.
To me the crux of the discussion is whether the inclusion of upc_poll() is
useful, or just a horrible substitute for a true progress guarantee.
My initial thought is that if we believe that MPI's experience "proves" that
explicit polling is good enough, then upc_poll() would just be an optimization
as George suggests. HOWEVER, I don't think the current UPC specification does
anything that precludes writing code that is CORRECT if one assumes true
asynchronous progress is made, yet {dead,live}locks on an implementation that
does not provide it. So, I would argue that as the spec
currently stands, any implementation (my own included) which requires insertion
of poll/fence calls to ensure progress is BROKEN. Therefore, I would argue
that upc_poll() does NOT belong in the specification (as a mechanism to avoid
implementation limitations).
[1] I've mentioned before that Berkeley avoids placing our extensions into the
upc_* namespace. The case of upc_poll() predates our realization of how doing
this can lead to later headaches.
Original comment by phhargr...@lbl.gov
on 18 Jun 2012 at 10:32
We need to identify what parts of UPC require polling for progress in current
implementations. That information is useful to this discussion whether or not
explicit polling is added to the UPC spec. (If polling is added, then users
need guidance on when they should poll. If polling is not added, then we need
the information to figure out how we can live without polling.)
For example, Cray does not need polling to handle Get, Put, or AMO operations,
but we need it to handle the following:
1) upc_global_{lock_}alloc - One thread calls upc_global_{lock_}alloc and all
threads must perform an allocation.
2) upc_free - One thread calls upc_free to deallocate memory that has affinity
to a different thread.
3) upc_global_exit - One thread terminates all threads.
In all of these cases, one thread does something that requires action by other
threads. For (1), we discourage users from calling the function and advise
them to use upc_all_{lock_}alloc for better performance. For (2), we see it in
test cases for upc_free, but have not seen it in a real application. For (3),
generally it is called after an error is detected and the function's
description makes no guarantee about how quickly it must terminate the
application, so performance is not a concern provided that the original thread
continues to respond to other threads until they have exited.
I'll note that we used to require polling for upc_memcpy for the case where
neither the source nor the destination had affinity to the calling thread
because we used to cause a direct transfer from the source to the destination,
but it turned out that if a user actually writes such code, then they expect
the calling thread to perform the copy itself via temporary buffering.
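
A rough sketch of the "calling thread copies via temporary buffering" approach, in plain C. This is only a model: copy_via_bounce_buffer() and BOUNCE_BYTES are hypothetical names, the two memcpy calls stand in for a remote get and a remote put, and the buffer is deliberately tiny so that chunking is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_BYTES 4 /* deliberately small; a real buffer would be larger */

/* Copy n bytes from src to dst through a buffer owned by the caller.
   Returns the number of chunks transferred. */
static size_t copy_via_bounce_buffer(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    unsigned char tmp[BOUNCE_BYTES];
    const unsigned char *s = src;
    unsigned char *d = dst;
    size_t chunks = 0;

    while (n > 0) {
        size_t len = n < BOUNCE_BYTES ? n : BOUNCE_BYTES;
        memcpy(tmp, s, len); /* "get": source -> caller's buffer */
        memcpy(d, tmp, len); /* "put": caller's buffer -> destination */
        s += len;
        d += len;
        n -= len;
        ++chunks;
    }
    return chunks;
}
```

Because the caller drives both halves of every chunk, neither the source thread nor the destination thread has to do anything, so no polling is needed on their side.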
Original comment by johnson....@gmail.com
on 19 Jun 2012 at 4:26
Tagged for the version 1.4 specification milestone.
Although we may be able to reach consensus on the progress guarantees and the
need (or lack thereof) for polling, I doubt that there is sufficient time to
rework implementations in the near term if, for example, the decision is made
to remove user-level polling requirements across the board.
Original comment by gary.funck
on 2 Jul 2012 at 4:07
Original issue reported on code.google.com by
ga10...@gmail.com
on 23 May 2012 at 1:47