Looks like this problem does not exist in 4.0.2. I haven't figured out which commit corrects the issue, however.
Looks like this is the known issue #6565. Fix is in master and 4.0.2 but not in the 3.1.x branch.
This does not seem to be true.
Okay - I tried backporting the patch from #6565 because it fit much of the description, but it does not actually fix the problem for 3.1.4. I tried testing 3.1.5 but failed to build it due to the GLIBC_PRIVATE issue.
Running MPI processes under Open MPI 4.0.2, I noticed that those /dev/shm/vader_segment.* files stick around after these processes are terminated via SIGTERM.
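A minimal sketch for observing the leftovers; the benchmark binary, rank count, and timings are placeholders, not the exact commands from this report:

```bash
# Launch a short MPI job on the local node, terminate it with SIGTERM,
# then check whether its vader segments were cleaned up.
mpirun -np 2 ./osu_bw &
MPIRUN_PID=$!
sleep 5
kill -TERM "$MPIRUN_PID"
sleep 2
# Any leftover segments show up here:
ls -l /dev/shm/vader_segment.* 2>/dev/null
```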
That sounds like a memory leak waiting to cause more severe issues.
These files are supposed to be cleaned up by PMIx. Not sure why that isn't happening in this case.
FWIW: we discussed this on the weekly OMPI call today:
I examined OMPI v4.0.2 and it appears to be doing everything correctly (ditto for master). I cannot see any reason why it would be leaving those files behind. Even the terminate-by-signal path flows thru the cleanup.
No real ideas here - can anyone replicate this behavior? I can't on my VMs - it all works correctly.
@rhc54, I can confirm it's working with 4.0.2. However, I can reliably reproduce the behavior using 3.1.x.
I think it's the same underlying issue I'm running into in #7308: if another user left behind a segment file and its name conflicts with one in my current job, the run aborts with "permission denied" because the existing segment file can't be opened.
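One way to spot that collision scenario ahead of time, as an illustrative check rather than anything from #7308 itself, is to look for stale segments owned by other users:

```bash
# List vader segments in /dev/shm that belong to someone other than
# the current user; these are the ones a new job cannot open.
find /dev/shm -maxdepth 1 -name 'vader_segment.*' ! -user "$(whoami)" -ls
```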
As @jsquyres pointed out, it seems like it is an issue with PMIx 2.x. While @hjelmn is looking into possible workarounds, I'm wondering if we can use PMIx 3.x with Open MPI 3.1.5?
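For reference, pointing Open MPI at an external PMIx is done at configure time; whether 3.1.5 actually accepts a PMIx 3.x install is exactly the open question here, and the prefixes below are placeholders:

```bash
# Hypothetical build against an external PMIx installation.
./configure --with-pmix=/opt/pmix-3.x --prefix=/opt/openmpi-3.1.5
make -j 8 all && make install
```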
Sorry for the confusion: It was a bug in our setup. I can now confirm that /dev/shm/vader* files are cleaned up after SIGTERM in Open MPI 4.0.2.
@mwheinz can you check to see if https://github.com/open-mpi/ompi/pull/10040 fixes this issue for you? I noticed the same thing on master recently.
I misread this issue. It appears it only happens in the Open MPI v3 series, which is frozen. Since it is fixed in v4 and beyond, this should probably be closed.
I confirmed that the issue addressed by #10040 is a master/v5 regression; v4/v4.1 works correctly.
v5.0.x pr: https://github.com/open-mpi/ompi/pull/10046
Thank you for taking the time to submit an issue!
Background information
What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)
3.1.4
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Packaged with Intel OPA 10.10.0.0.445
Please describe the system on which you are running
Back-to-back Xeon systems running RHEL 7.6 on one and RHEL 8.0 on the other.
Details of the problem
I was using OMPI to do some stress testing of some minor changes to the OPA PSM library when I discovered that the vader transport appears to be leaking memory-mapped files.
I wrote a bash script to run the OSU micro-benchmarks in a continuous loop, alternating between the PSM2 MTL and the OFI MTL. After a 24-hour run, I ran into "resource exhausted" issues when trying to start new shells, execute shell scripts, etc.
Investigating, I found over 100k shared memory files in /dev/shm, all of the form
vader_segment.<hostname>.<hex number>.<decimal number>
It's not clear at this point that the shared memory files are the cause of the problems I had, but they certainly shouldn't be there!
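For scale, counting the leftovers is a one-liner; the cleanup below is only an illustrative manual workaround for segments owned by the current user, not a fix, and should only be run when no MPI jobs are active on the node:

```bash
# Count leftover vader segments.
ls /dev/shm/vader_segment.* 2>/dev/null | wc -l

# Manually remove segments owned by the current user (workaround only).
find /dev/shm -maxdepth 1 -name 'vader_segment.*' -user "$(whoami)" -delete
```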
Sample run lines:
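The exact run lines were not preserved in this copy of the issue; a hedged approximation based on the description above, with placeholder hostnames, rank count, and benchmark path, would be:

```bash
# PSM2 MTL:
mpirun -np 2 -H node01,node02 --mca pml cm --mca mtl psm2 ./osu_latency
# OFI MTL:
mpirun -np 2 -H node01,node02 --mca pml cm --mca mtl ofi ./osu_latency
```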
Script that was used to run the benchmarks:
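The script itself was also not preserved; a minimal sketch of the kind of loop described above, with placeholder hosts and benchmark, is:

```bash
#!/bin/bash
# Continuous stress loop alternating between the PSM2 and OFI MTLs.
# Hostnames, rank count, and benchmark path are placeholders.
HOSTS="node01,node02"
BENCH=./osu_bw

while true; do
    for mtl in psm2 ofi; do
        mpirun -np 2 -H "$HOSTS" --mca pml cm --mca mtl "$mtl" "$BENCH"
    done
done
```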