upperwal / EntangledMPI

Fault Tolerance framework for High Performance Computing [Supports ULFM, replication and checkpointing]
MIT License

Different stack starting address #22

Open upperwal opened 6 years ago

upperwal commented 6 years ago

Problem: The main function starts at a different stack address in different processes (i.e., the stack space is shifted). This causes replication and checkpointing to place the stack data at the wrong addresses, which results in a segmentation fault.

The problem is intermittent. Most of the time the stack starts at the same address in all MPI processes.

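Reason: Not sure, but it could be because mpirun adds environment variables when it launches the program. On typical Unix systems the environment strings are stored at the top of the initial stack, so environments of different sizes would shift where main's frame starts.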
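One way to test this hypothesis is to log, in every process, the address of a local variable in main together with the total size of the environment, then compare the lines across ranks. The sketch below is only a diagnostic aid, not part of EntangledMPI: the helper names `env_bytes`, `stack_shift` and `format_stack_report` are hypothetical, and the rank is passed in as a plain integer so the code does not depend on MPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Total bytes taken by the environment strings, counting each
 * terminating NUL. A NULL envp is treated as an empty environment. */
size_t env_bytes(char *const *envp)
{
    size_t total = 0;
    if (envp == NULL)
        return 0;
    for (; *envp != NULL; ++envp)
        total += strlen(*envp) + 1;
    return total;
}

/* Signed distance in bytes between two addresses (a - b).
 * Useful for comparing the frame address seen on two ranks. */
ptrdiff_t stack_shift(const void *a, const void *b)
{
    return (ptrdiff_t)((intptr_t)a - (intptr_t)b);
}

/* Write one report line into buf: the rank, the address of a local
 * variable in the caller's frame, and the environment size.
 * Returns the value returned by snprintf. */
int format_stack_report(char *buf, size_t n, int rank,
                        const void *frame, char *const *envp)
{
    assert(buf != NULL && n > 0);
    return snprintf(buf, n,
                    "rank %d: stack frame at %p, environment %zu bytes",
                    rank, frame, env_bytes(envp));
}
```

Calling `format_stack_report` at the top of main on every rank, with the address of a local variable and the process's `environ`, and printing the result would show whether the runs that crash are the ones where the environment size differs between ranks. If the addresses differ while the environment sizes match, the cause lies elsewhere.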

upperwal commented 6 years ago

Still no solution.