uw-cmg / MAST

MAterials Simulation Toolkit for use with pymatgen

Limit to looped MAST input file #428

Open akacz opened 9 years ago

akacz commented 9 years ago

When parsing an input file that creates many directories, MAST will error due to too many open files. It will not be able to create the necessary metadata files.

cmgtam commented 9 years ago

Hello Zhewen, Thank you for reporting the issue. I've seen this before, and I believe it is a duplicate of this one Amy reported, #428.

One thing that I have found helps is to use a compute node. For example, if you are on ACI, starting an interactive session with srun -N 1 -n 16 -p int --pty bash and then running MAST may, in some cases, avoid the error with the same input file.

I will look at the code.

From, Tam

On 12/06/14, Zhewen Song wrote:

Hello Tam,

Sorry to bother you. I am running SiC defect calculations with MAST. 22 types of defects are considered, each with charge states ranging from -2 to 2, and there are 3+1 sizes for GGA and 1 size for HSE (~550 calculations in total). At the very beginning, MAST generated ~1000 folders (including the inducedefect folders). Then I encountered this error:

File "/home/zsong39/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/MAST-1.2.0-py2.7.egg/MAST/utility/metadata.py", line 48, in search_data
    with open(self.keywords['metafile'], 'r') as metafile:
IOError: [Errno 24] Too many open files: '/home/zsong39/MAST/SCRATCH/3c-SiC_SiC_20141206T114744/defect_2xlow_C-110-Si_q=n2_GGA/metadata.txt'

So obviously too many folders were opened in the loop at with open(self.keywords['metafile'], 'r') as metafile, and it just crashed.

I see there are a lot of with open(self.keywords['metafile'], 'r') as metafile statements elsewhere in the code. Is there a better way to search keywords than brute-force opening all the files simultaneously? This approach will crash, especially if the user has a huge number of MAST ingredients.

Thanks Zhewen

On 11/14/14, akacz wrote:

When parsing an input file that creates many directories, MAST will error due to too many open files. It will not be able to create the necessary metadata files.


cmgtam commented 9 years ago

ulimit -a will list the resource limits, including the number of open files allowed.
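The same limit can also be read from inside Python, which may be handy for checking it right before MAST is launched. This is only a minimal sketch using the standard library resource module (Unix only); the function name open_file_limits is made up for illustration and is not part of MAST.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process.

    Hypothetical helper for illustration; not part of MAST.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


soft, hard = open_file_limits()
# The soft limit is the one that triggers "[Errno 24] Too many open files".
print("soft limit:", soft, "hard limit:", hard)
```

The soft value corresponds to the "open files" entry reported by ulimit -a in the same shell.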

The problem actually appears to be not the metadata file, but the logging files - the files are opened, but the handlers are not removed and flushed. Removing and flushing the handler after write_directory appears to solve the problem.
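To illustrate the kind of cleanup described above, here is a rough sketch using only the standard logging module. It is not the actual MAST code: the helper write_log_line, the logger names, and the log file names are all hypothetical, and it assumes one FileHandler is attached per ingredient directory.

```python
import logging
import os
import tempfile


def write_log_line(log_path, name, message):
    """Log one message to log_path, then release the file handle.

    Hypothetical helper for illustration; not MAST's logging code.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)  # opens the file immediately
    logger.addHandler(handler)
    try:
        logger.info(message)
    finally:
        # Without these steps the file descriptor stays open for as long
        # as the handler remains attached to the logger.
        handler.flush()
        logger.removeHandler(handler)
        handler.close()
    return logger


# Usage: more log files than a 1024 open-file limit would allow at once.
with tempfile.TemporaryDirectory() as scratch:
    for i in range(2000):
        path = os.path.join(scratch, "ingredient_%d.log" % i)
        write_log_line(path, "ingredient_%d" % i, "directory written")
```

Because each handler is closed before the next one is created, only one log file is open at a time regardless of how many directories are written.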

This problem may also be present when checking the recipe while MAST is running, so we would need to look out for it there. However, open file limits may be more relaxed on the compute nodes (where the mast monitor runs), which would make this not a problem. That is probably the difference on ACI: the submission node has an open file limit of 1024, while the interactive compute node has a limit of 4096.

cmgtam commented 9 years ago

When checking a recipe, ingredients are initiated only if their status needs to be checked through their methods (ready, complete, etc.). So only a fraction of the ingredients will be initiated and get loggers during the mast monitor phase. However, the logs should probably be cleaned up properly...

cmgtam commented 9 years ago

I suppose having 4000 recipes running might also run into this issue.