upperwal / EntangledMPI

Fault Tolerance framework for High Performance Computing [Supports ULFM, replication and checkpointing]
MIT License

NAS Parallel Benchmark's Verification is mostly UNSUCCESSFUL #32

Closed upperwal closed 6 years ago

upperwal commented 6 years ago

Possible cause: We are using PMPI_Waitany with the same buffer for all Irecv requests. Even though some requests complete successfully, the next request writes to the same buffer, which might be corrupting the data.

Solution: Try to use PMPI_Waitall again.
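
To illustrate the suspected hazard, here is a minimal sketch in plain C that models it without calling MPI at all. Everything in it (`sim_request`, `sim_deliver_all`, `count_intact_shared`, `count_intact_separate`, the payload values) is hypothetical and is not taken from the EntangledMPI source. It assumes all pending messages may land before the receiver reads any of them, which is roughly what can happen when several Irecv requests point at one buffer and are consumed one at a time.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SENDERS 4

/* Hypothetical stand-in for a pending nonblocking receive:
 * it only records where the payload will be written. */
typedef struct {
    int *dest;    /* buffer the simulated network writes into */
    int  payload; /* value the simulated sender transmits */
} sim_request;

/* Deliver every pending message in order, as could happen before
 * the receiver gets a chance to inspect any single completion. */
static void sim_deliver_all(sim_request *reqs, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        assert(reqs[i].dest != NULL);
        *reqs[i].dest = reqs[i].payload;
    }
}

/* Count how many requests still see their own payload in their buffer. */
static int count_intact(const sim_request *reqs, size_t n) {
    int intact = 0;
    for (size_t i = 0; i < n; ++i) {
        if (*reqs[i].dest == reqs[i].payload) {
            ++intact;
        }
    }
    return intact;
}

/* All receives share one buffer: later arrivals overwrite earlier ones. */
int count_intact_shared(void) {
    int shared = 0;
    sim_request reqs[NUM_SENDERS];
    for (size_t i = 0; i < NUM_SENDERS; ++i) {
        reqs[i].dest = &shared;
        reqs[i].payload = 100 + (int)i;
    }
    sim_deliver_all(reqs, NUM_SENDERS);
    return count_intact(reqs, NUM_SENDERS);
}

/* Each receive owns its own slot: no arrival can clobber another. */
int count_intact_separate(void) {
    int slots[NUM_SENDERS] = {0};
    sim_request reqs[NUM_SENDERS];
    for (size_t i = 0; i < NUM_SENDERS; ++i) {
        reqs[i].dest = &slots[i];
        reqs[i].payload = 100 + (int)i;
    }
    sim_deliver_all(reqs, NUM_SENDERS);
    return count_intact(reqs, NUM_SENDERS);
}
```

In this model the shared-buffer variant keeps only the last payload, while the per-request variant keeps all of them. In real MPI code, giving each outstanding Irecv its own buffer is the usual way to rule this class of corruption out, whichever wait call is used afterwards.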

upperwal commented 6 years ago

Yup, that was the case. Still testing.

upperwal commented 6 years ago

Now they are SUCCESSFUL