We noticed a failure when running SAT back to back under Linux. test_loop.sh
shows how to reproduce it. The failure mode is that one of the SAT
applications fails because of an allocation failure. We looked into this and
found a couple of scenarios that can keep the memory from being returned to
the OS at process exit.
1. libc refuses to return the memory to the OS on free(). memalign() isn't
guaranteed to work with free(), although glibc claims it does. posix_memalign()
is supposed to work with free(), but in our tests strace showed that glibc
did not give the virtual memory space back to the OS.
Once the above behavior occurs, either of the following can happen, which
causes the memory not to be released to the OS.
a. Someone reading certain of SAT's /proc files can take a reference on the
OS's mm structure. This can lead to a delayed free of the mm's memory.
b. There can be a thread deep in a run queue that has been signalled to die
but hasn't exited yet. The memory won't be given back to the OS until this
last reference to the mm is dropped.
My proposal is to perform explicit mmap()/munmap() calls in SAT so that
external reference counts on the mm within Linux won't cause the memory to be
tied up. A patch that does just that is attached.
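For readers without the attachment, here is a rough sketch of the general idea, not the patch itself. It assumes the test buffer is anonymous private memory. The helper names AllocateTestMemory, FreeTestMemory and RoundUpToPage are made up for illustration and are not SAT's actual functions.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration only; not taken from SAT or the patch.

// Round a byte count up to a whole number of pages.
static size_t RoundUpToPage(size_t length) {
  const long page = sysconf(_SC_PAGESIZE);
  assert(page > 0);
  const size_t p = static_cast<size_t>(page);
  return (length + p - 1) / p * p;
}

// Map anonymous private memory straight from the kernel, bypassing the libc
// allocator. Returns nullptr on failure (including a zero-length request).
void *AllocateTestMemory(size_t length) {
  void *buf = mmap(nullptr, RoundUpToPage(length), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return buf == MAP_FAILED ? nullptr : buf;
}

// Unmap the region explicitly so the pages are released at this point rather
// than when the last reference to the mm is dropped. Returns true on success.
bool FreeTestMemory(void *buf, size_t length) {
  return buf != nullptr && munmap(buf, RoundUpToPage(length)) == 0;
}
```

The point of calling munmap() before exit is that the region is torn down right then, so a lingering reference on the mm should no longer pin the test memory.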
Original issue reported on code.google.com by adur...@google.com on 16 Sep 2010 at 5:39