x10-lang / x10

Core implementation of X10 programming language including compiler, runtime, class libraries, sample programs and test suite
Eclipse Public License 1.0

Resilient Team Issues #3

Closed: shamouda closed this 8 years ago

shamouda commented 8 years ago

Fixes segmentation faults and hangs in Team when places fail. sleepUntil and waitForParentToReceive now return the team status, and callers use that return value to avoid moving data to a failed place and to terminate the collective when needed.
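To illustrate the pattern described above, here is a minimal sketch in Java rather than X10. It is not the actual x10.util.Team code: the `TeamStatus` enum, the signature given to `sleepUntil`, and the `sendToChild` helper are all hypothetical names chosen for illustration. The idea shown is only that the wait reports whether the team is still alive, and the caller checks that result before touching the peer's buffer.

```java
import java.util.Arrays;
import java.util.function.BooleanSupplier;

class ResilientWaitSketch {
    /** Hypothetical status type: what the wait reports back to its caller. */
    enum TeamStatus { ALIVE, DEAD }

    /**
     * Hypothetical stand-in for sleepUntil. Polls until `ready` holds, but
     * returns DEAD as soon as `teamFailed` is observed instead of waiting
     * forever on a peer that will never respond. Failure is checked first,
     * so it takes precedence over readiness.
     */
    static TeamStatus sleepUntil(BooleanSupplier ready, BooleanSupplier teamFailed) {
        while (true) {
            if (teamFailed.getAsBoolean()) return TeamStatus.DEAD;
            if (ready.getAsBoolean()) return TeamStatus.ALIVE;
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                // Assumption for this sketch: treat interruption as a failure.
                Thread.currentThread().interrupt();
                return TeamStatus.DEAD;
            }
        }
    }

    /**
     * One step of a collective: wait for the child, then copy data only if
     * the team is still alive. Returns false when the collective must stop.
     */
    static boolean sendToChild(int[] src, int[] childBuf,
                               BooleanSupplier childReady, BooleanSupplier teamFailed) {
        if (sleepUntil(childReady, teamFailed) == TeamStatus.DEAD) {
            return false; // skip the copy: the destination may be a failed place
        }
        System.arraycopy(src, 0, childBuf, 0, src.length);
        return true;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};

        // Healthy team: the child is ready, so the data is copied.
        int[] child = new int[3];
        boolean ok = sendToChild(data, child, () -> true, () -> false);
        System.out.println("healthy: completed=" + ok + ", child=" + Arrays.toString(child));

        // Failed team: the child never becomes ready; the wait reports DEAD
        // and the buffer is left untouched.
        int[] deadChild = new int[3];
        boolean ok2 = sendToChild(data, deadChild, () -> false, () -> true);
        System.out.println("failed: completed=" + ok2 + ", child=" + Arrays.toString(deadChild));
    }
}
```

In this sketch the wait can only return early if the failure flag is eventually observed; a wait whose failure condition is never raised would still block.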

The fix is not complete: sleepUntil still hangs in some cases. The new class ResilientAllreduceTest includes a sample log of a hanging scenario.

0joshuaolson1 commented 8 years ago

This has gone stale.

milthorpe commented 8 years ago

Superseded by later pull requests.