sate-dev / sate-core


Unclosed open files pile up #33

Closed: smirarab closed this issue 11 years ago

smirarab commented 11 years ago

Standard input pipes are opened (implicitly by subprocess) but are never closed:

When a subprocess call is made, the standard error and standard output file objects are correctly closed. However, the standard input, which is a pipe, is never closed. On Linux, these pipes are file descriptors that remain open. When SATe is run on a huge input, these open files pile up and eventually cause an exception ("too many open files").
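
For reference, the pattern can be reproduced outside of SATe with a few lines of Python. The sketch below is purely illustrative (the command, file names, and loop are not SATe's actual scheduler code): each Popen call requests a stdin pipe, the stdout and stderr file objects are closed after the job finishes, but the stdin pipe is not, so one descriptor per finished job stays open for as long as the process objects are referenced.

    import os
    import subprocess

    # Illustrative sketch of the leak pattern, not SATe's scheduler code.
    jobs = []
    for i in range(50):
        out_fo = open("job_%d.out" % i, "w")
        err_fo = open("job_%d.err" % i, "w")
        p = subprocess.Popen(["true"], stdin=subprocess.PIPE,
                             stdout=out_fo, stderr=err_fo)
        p.wait()
        out_fo.close()
        err_fo.close()
        # p.stdin.close() is missing here, so the write end of the stdin
        # pipe stays open for as long as p is referenced.
        jobs.append(p)

    # On Linux this prints roughly 50 more descriptors than a fresh interpreter.
    print(len(os.listdir("/proc/self/fd")))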

Solution: In scheduler.py, inside the wait method of DispatchableJob, there are the following lines:

                        self._stdout_fo.close()
                        self._stderr_fo.close()

We also need to close the standard input pipe at this point, so change this to:

                        self._stdout_fo.close()
                        self._stderr_fo.close()
                        self.process.stdin.close()
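
For what it is worth, the same cleanup can also be written a little more defensively. The helper below is only a hypothetical sketch (close_job_files is not a real SATe function); the None check only matters if a job is ever launched without a stdin pipe, in which case process.stdin is None:

    def close_job_files(stdout_fo, stderr_fo, process):
        # Same cleanup as the patch above, with a guard: process.stdin is
        # only a file object when the job was started with
        # stdin=subprocess.PIPE.
        stdout_fo.close()
        stderr_fo.close()
        if process.stdin is not None:
            process.stdin.close()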

To test this on Linux, run SATe and note its process ID. Then run watch 'lsof -p PROCESS_ID|wc -l' to monitor the number of open files. Note that with the current implementation the count keeps going up as iterations progress (set up your config file so that many small subsets are created). Run without wc and you will see that the extra files are pipe objects. After the fix is applied, the file count stays stable at around 30 files.
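
If lsof is not convenient, the same count can be read directly from /proc on Linux. The small script below is only an illustrative alternative (count_fds.py is hypothetical, not part of SATe):

    import os
    import sys

    def count_open_files(pid):
        # Linux-only: count the entries in the process's fd directory
        # (requires permission to read /proc/PID/fd).
        return len(os.listdir("/proc/%d/fd" % pid))

    if __name__ == "__main__":
        # Usage: python count_fds.py PROCESS_ID
        print(count_open_files(int(sys.argv[1])))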

I am not applying the change myself, because I am not set up to test my changes under Windows and am only running limited test cases at the moment. Please test the fix and apply it to the main branch if it works fine on all systems.

--Siavash

joaks1 commented 11 years ago

Nice catch, Siavash! Commit 7c310359d60dce37d851d3e3967d1b91f13ec325 should fix this issue as per your recommendation. The change has been tested successfully on Linux and Windows platforms. I have not run any really long tests, but confirmed that after 10 iterations, each with a lot of tree decomposition, the number of open files holds steady around 30. If you still see an increase in open files over very long runs, let me know.

Thanks!

Jamie

smirarab commented 11 years ago

In my runs on very large datasets this seems to be working fine. I will update you if I see further issues. Thanks for testing and applying the fix.