Intrepid / upc-specification

Automatically exported from code.google.com/p/upc-specification
0 stars 1 forks source link

Request for a plain English description of memory consistency + a portable battery of consistency tests #46

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I would like to see a less formal treatment of relaxed memory consistency in 
UPC; maybe in the style of the MPI specification's "advice for implementors".

Original issue reported on code.google.com by ga10...@gmail.com on 23 May 2012 at 12:59

GoogleCodeExporter commented 9 years ago
Strict consistency
===================

I believe that there is general agreement on what strict consistency means. I 
associate it with sequential consistency; that is, given two write to shared 
memory the order in which those writes are "seen" by any thread is always the 
same.

The crucial sequential consistency test, IIRC from the Shasha+Snir paper a long 
time ago, is this:

-------------------------------
Given 

shared int a;
shared int b;

with initial values of 0, and being overwritten (somewhat simultaneously) with 
new values 1 by two different threads, there *cannot* be two UPC threads that 
read the following value combinations:

Thread X reads a==0 *and* b==1
Thread Y reads a==1 *and* b==0
----------------------------------

I somehow doubt that all UPC implementations will stand up to this test, 
especially if the thread identities of the "owners" of the shared variables, 
the writers and the readers are all different.

Original comment by ga10...@gmail.com on 23 May 2012 at 1:00

GoogleCodeExporter commented 9 years ago
UPC spec section 5.1.2.3 already contains a "plain english" description 
detailing the high-level consequences of the formal memory model and providing 
some rules-of-thumb for regular users who are not versed in memory consistency 
models. It is copied below for convenience. As stated in the text this 
executive summary is NOT intended to replace the formal memory model in the 
appendix, which is the final authority on detailed behavior which is intended 
to resolve arcane consistency violation questions that can arise when 
implementing optimizations, and I feel it should remain for that purpose.

If anyone feels the current "summary" text is insufficient, please comment upon 
what further clarifications are required - otherwise I move this issue be 
closed as "INVALID".

=============================
In a shared memory model such as UPC, operational descriptions of semantics are 
insuf-
cient to completely and denitively describe a memory consistency model.
Therefore Appendix B presents the formal memory semantics of UPC. The
information presented in this section is consistent with the formal semantic
description, but not complete. Therefore, implementations of UPC based on
this section alone may be non-compliant.

2 All shared accesses are classified as being either strict or relaxed, as 
described
in sections 6.5.1.1 and 6.7.1. Accesses to shared objects via pointers-to-local
behave as relaxed shared accesses with respect to memory consistency. Most
synchronization-related language operations and library functions (notably
upc fence, upc notify, upc wait, and upc lock/upc unlock) imply the consistency
eects of a strict access.

3 In general, any sequence of purely relaxed shared accesses issued by a given
thread in an execution may appear to be arbitrarily reordered relative to
program order by the implementation, and different threads need not agree
upon the order in which such accesses appeared to have taken place. The
only exception to the previous statement is that two relaxed accesses issued
by a given thread to the same memory location where at least one is a
write will always appear to all threads to have executed in program order.
Consequently, relaxed shared accesses should never be used to perform 
deterministic
inter-thread synchronization - synchronization should be performed
using language/library operations whenever possible, or otherwise using only
strict shared reads and strict shared writes.

4 Strict accesses always appear (to all threads) to have executed in program
order with respect to other strict accesses, and in a given execution all 
threads
observe the effects of strict accesses in a manner consistent with a single,
global total order over the strict operations. Consequently, an execution of
a program whose only accesses to shared objects are strict is guaranteed to
behave in a sequentially consistent [Lam79] manner.

5 When a thread's program order dictates a set of relaxed operations followed
by a strict operation, all threads will observe the effects of the prior 
relaxed
operations made by the issuing thread (in some order) before observing the
strict operation. Similarly, when a thread's program order dictates a strict
access followed by a set of relaxed accesses, the strict access will be observed
by all threads before any of the subsequent relaxed accesses by the
issuing thread. Consequently, strict operations can be used to synchronize
the execution of dierent threads, and to prevent the apparent reordering of
surrounding relaxed operations across a strict operation.

6 NOTE: It is anticipated that most programs will use the strict synchronization
facilities provided by the language and library (e.g. barriers, locks,
etc) to synchronize threads and prevent non-determinism arising from data
races. A data race may occur whenever two or more relaxed operations from
dierent threads access the same location with no intervening strict 
synchronization,
and at least one such access is a write. Programs which produce
executions that are always free of data races (as formally dened in Appendix
B), are guaranteed to behave in a sequentially consistent manner.

Original comment by danbonachea on 31 May 2012 at 6:32

GoogleCodeExporter commented 9 years ago
Dan wrote:
> If anyone feels the current "summary" text is insufficient, please comment 
upon
> what further clarifications are required - otherwise I move this issue be 
closed
> as "INVALID".

I just reread that text and despite already knowing the memory model (or at 
least believing that I do), I didn't come away from the reading feeling I could 
write non-trivial programs based on it.  In other words, I got to the end and 
felt that the model was too complex to consider any approach other than that in 
"6 NOTE:..." of ensuring that my programs never contain a data race as 
described.  Perhaps that is all the reader NEEDS to some away with.

I don't have any improvements to offer and agree with Dan that unless somebody 
has a concrete suggestion (or more that one), this is a "WONTFIX" or "INVALID" 
issue.

Original comment by phhargr...@lbl.gov on 1 Jun 2012 at 5:18

GoogleCodeExporter commented 9 years ago
The high-level implication appears at the end of each paragraph. So to 
summarize the "rules of thumb" already presented in the current spec:

"Relaxed shared accesses should never be used to perform deterministic
inter-thread synchronization - synchronization should be performed
using language/library operations whenever possible, or otherwise using only
strict shared reads and strict shared writes."

"An execution of a program whose only accesses to shared objects are strict is 
guaranteed to behave in a sequentially consistent [Lam79] manner."

"Strict operations can be used to synchronize the execution of different 
threads, and to prevent the apparent reordering of surrounding relaxed 
operations across a strict operation."

Relaxed memory models are inherently complicated - I don't know how to make the 
description any simpler or more straightforward without sacrificing 
correctness, but I'm open to suggestions.

Note this section is not intended as a manual for rolling your own 
synchronization operations. It provides a high-level starting point for 
understanding the details of the memory model, where the most important points 
are basically "warning: relaxed operations can be reordered" and "use the 
provided sync ops to synchronize (whenever possible), otherwise you need to use 
strict ops".

Hearing no suggestions to improve the existing spec text, I'm closing this 
issue for now.

Original comment by danbonachea on 14 Jun 2012 at 11:10

GoogleCodeExporter commented 9 years ago
Mass update to closed issue status

Original comment by danbonachea on 15 Jun 2012 at 6:39

GoogleCodeExporter commented 9 years ago

Original comment by gary.funck on 6 Jul 2012 at 9:04