openmopac / mopac

Molecular Orbital PACkage
http://openmopac.net
GNU Lesser General Public License v3.0

Concurrency issue in tests #64

Closed: susilehtola closed this issue 2 years ago

susilehtola commented 2 years ago

The tests appear to have concurrency issues when run on a machine with a large number of cores. For instance, in my test calculation on a machine with 64 cores, IONIZE.out contains the following output:

$ cat IONIZE.out 

           END OF FILE FOUND WHILE TRYING TO READ IN GEOMETRY DATA FROM "NEUTRAL.MOP"
          (Error occurred while trying to read over the keyword line in the data-set.)

 *******************************************************************************
 *                                                                             *
 *     Error and normal termination messages reported in this calculation      *
 *                                                                             *
 *  END OF FILE FOUND WHILE TRYING TO READ IN GEOMETRY DATA FROM "NEUTRAL.MOP" *
 * JOB ENDED NORMALLY                                                          *
 *                                                                             *
 *******************************************************************************

 TOTAL JOB TIME:             0.00 SECONDS

 == MOPAC DONE ==

I believe this may be behind the failures on Fedora build systems, which have large numbers of cores.

The likely origin of the concurrency issue is https://github.com/openmopac/mopac/blob/993a3c6f3e28f9ab14d3f83662ffe5329e8d4666/tests/run_test.py#L13, which is not thread safe.

susilehtola commented 2 years ago

My suggestion would be to run each test in a dedicated directory, and hope this solves the issues I've run into with test failures...
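For illustration, the dedicated-directory idea could look roughly like the sketch below. It is not the project's run_test.py; the helper name `run_isolated` and its arguments are hypothetical, and it assumes the test outputs are plain text files.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_isolated(command, input_files):
    """Run `command` in a fresh temporary directory (hypothetical helper).

    Each call gets its own private copies of `input_files`, so two tests
    that need the same auxiliary file never touch the same path on disk.
    """
    with tempfile.TemporaryDirectory() as workdir:
        # Copy every required input into the private working directory.
        for src in input_files:
            shutil.copy(src, workdir)

        # Run the program with the private directory as its working directory.
        result = subprocess.run(command, cwd=workdir, capture_output=True, text=True)

        # Collect the files present afterwards before the directory is deleted.
        outputs = {
            path.name: path.read_text(errors="replace")
            for path in Path(workdir).iterdir()
            if path.is_file()
        }
    return result.returncode, outputs
```

The cost of this approach is one extra copy of each input file per test; the benefit is that no test can observe a file that another test is still writing.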

godotalgorithm commented 2 years ago

I'm not sure that this explains all of the testing problems you are encountering, but Neutral.mop is a required auxiliary input file used by two tests. In the current scheme, it is copied to the testing directory by two different instances of the testing script, which may be running simultaneously on a multi-core machine. This error is likely the result of MOPAC attempting to read the file for one test while the script for another test is overwriting it. The simple and appropriate solution is to eliminate all file dependencies between tests. MOPAC should be able to run many instances simultaneously in the same directory, so there shouldn't be any inherent thread-safety issues beyond this collision of input file dependencies.

I'll go through the tests and eliminate these file dependencies by creating multiple copies of shared input files with slightly different names. You are also welcome to pre-empt me in this task.
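As a rough sketch of the renaming idea, the copy of a shared input could be given a name that is unique to each test. The helper name `private_copy` and the `<stem>_<test name>` naming pattern are assumptions made for this example only; each test's own input file would also need to refer to its renamed copy.

```python
import shutil
from pathlib import Path


def private_copy(shared_file, test_name, dest_dir):
    """Copy a shared input under a name unique to one test (hypothetical helper).

    Example: Neutral.mop copied for a test called "IONIZE" becomes
    Neutral_IONIZE.mop, so no other test writes to the same path.
    """
    shared = Path(shared_file)
    target = Path(dest_dir) / f"{shared.stem}_{test_name}{shared.suffix}"
    shutil.copy(shared, target)
    return target
```

With distinct target names, two test scripts running at the same time in the same directory no longer overwrite a file that the other one is reading.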