Closed pavlis closed 3 weeks ago
I found a relevant discussion: https://www.mongodb.com/community/forums/t/too-many-open-files-mongo-4-2/8585/3
It seems Unix systems have per-process limits that MongoDB can easily hit. MongoDB's documentation also covers this topic on ulimit settings: https://www.mongodb.com/docs/manual/reference/ulimit/. Maybe you can give that a try. Setting ulimit inside the container also seems tricky, but I found this argument to docker run that might do it: https://docs.docker.com/reference/cli/docker/container/run/#ulimit
I saw similar postings, but I found both the hard and soft limits returned by "ulimit" are reported as "unlimited". Unless docker overrides that somehow, that doesn't explain the odd behavior I am seeing.
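For reference, the limits the process actually sees can also be checked from inside Python with the standard-library resource module (a minimal sketch; RLIMIT_NOFILE is the same per-process open-file cap that "ulimit -n" reports in a shell):

```python
import resource

# RLIMIT_NOFILE is the per-process cap on open file descriptors,
# i.e. the value "ulimit -n" reports in a shell.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# resource.RLIM_INFINITY is how Python represents "unlimited"
def fmt(v):
    return "unlimited" if v == resource.RLIM_INFINITY else str(v)

print(f"soft limit: {fmt(soft)}, hard limit: {fmt(hard)}")
```

Running this inside the container would show whether docker is quietly overriding what the shell's ulimit reports.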
This surfaced again and we did a bunch of detective work on the issue. @wangyinz should add some details, but a quick summary is this:
Working on this problem we found there was no real bug in the MsPASS code base creating this issue. As the comments above suggest, what happens is that docker enforces system limits for the simple reason that docker is designed to assure a program plays nicely within a system's limits. Those limits are set in a configuration file that is system dependent. Some things, like the amount of memory and the number of cores the container can use, can be set with Docker Desktop on Windows and macOS; there seems to be no equivalent on linux. The parameter relevant to this issue (ulimit), however, does not seem to be configurable that way. There are system-dependent ways to set ulimit, and that need is what led me down the rathole I described above. For the record, however, the system-independent way to do it is to add the following to the start of the docker run line if your notebook is subject to a ulimit problem:
docker run --ulimit nofile=100000:100000 # additional docker args follow
Users should add that option if doing atomic processing of a large data set. The problem is most acute if the job is I/O bound with a lightweight set of processing tasks. @wangyinz found the problem surfaced because the default garbage collection scheduling failed to clear stale file handles. For that reason it may also be possible to fix this issue with forced calls to the garbage collection system, as suggested in the dask documentation here
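As a sketch of that workaround (hedged: force_gc_on_workers is a made-up helper name here; Client.run is the standard dask.distributed mechanism for executing a function on every worker):

```python
import gc

def force_gc_on_workers(client=None):
    """Force a garbage collection pass to release stale file handles.

    If a dask.distributed Client is given, run the collection on every
    worker via Client.run; otherwise collect in the local process.
    Returns the collect() result(s) so the effect can be inspected.
    """
    if client is not None:
        # returns a dict mapping worker address -> objects collected there
        return client.run(gc.collect)
    return gc.collect()
```

In a workflow this might be called between stages, e.g. after each large bag of intermediate results is written, to keep file handles from accumulating toward the nofile limit.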
I am somewhat mystified by what I think is a MongoDB problem running a parallel job on an ubuntu desktop system using our current container.
The notebook earlier instantiates a MsPASS Database object and indexes a set of 20 miniseed files. The notebook also earlier successfully runs a serial version of the workflow I'll show in a second (i.e. not identical, but the same logic converted to a for loop). After three variations I got this parallel version to run without crashing MongoDB:
The input is 26247 wf_miniseed documents returned by an implicit find({}) in the way I called read_distributed_data. Note that in the first attempt with npartitions left at the default and a second try with
npartitions=1000
it failed to run. Both times the mongodb log showed a long string of "too many open file" messages before completely crashing. Why it works with 100 but not the default is a mystery to me since the dask documentation says the default is "about 100". When I look at the code, the default passes npartitions as None, which is what the API says is the default. I did not verify the size with dask diagnostics, so I don't actually know what the default did. In any case, it is mysterious why 1000 crashes MongoDB. When I watch it run on the 8 core machine, it runs 8 threads, each at about 25%. The graph display shows 100 lines with three task bars: one for the reader, one for processing, and one for the writer. Why this is causing too many open files in all but the 100 case is bewildering. Any tools I can use to monitor open files? I don't think we have a problem with the writer per se, as this example writes data to gridfs and there is no equivalent to open. Any ideas?
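On the question of monitoring tools: besides lsof, on linux you can count a process's open file descriptors directly from /proc (a minimal sketch; count_open_fds is a made-up name, and this is Linux-only):

```python
import os

def count_open_fds(pid="self"):
    """Count open file descriptors by listing /proc/<pid>/fd.

    Linux-only; "self" means the calling process. Sampling this for
    the mongod and dask worker PIDs while a job runs shows whether
    descriptors are accumulating toward the nofile ulimit.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))
```

For example, polling count_open_fds(mongod_pid) in a loop while the parallel reader is active, and comparing against the soft nofile limit, would show whether handles are leaking steadily or spiking at a particular stage.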