fujitsu / compiler-test-suite

Test suite for C/C++/Fortran compilers developed by Fujitsu
Apache License 2.0

Unexpected error caused by parallel file read/write #2

Open kawashima-fj opened 2 months ago

kawashima-fj commented 2 months ago

This is a known issue.

Problem

When tests are run in parallel, some tests may fail intermittently.

Details

The test runner lit executes multiple test programs in parallel by default. Test programs located in the same directory are executed in the same working directory (e.g. $build_dir/Fujitsu/Fortran/0060). Therefore, if multiple test programs in the same directory read or write a file with the same name, those file accesses may conflict and cause a test verification error.
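The conflict can be illustrated with a small sketch that replays one unlucky schedule deterministically. The two "tests" and the file name out.dat are hypothetical stand-ins, not programs from this suite; real failures depend on timing and so appear only occasionally.

```python
import tempfile
from pathlib import Path


def run_interleaved(workdir: Path) -> dict:
    """Replay one unlucky schedule of two tests sharing a working directory."""
    shared = workdir / "out.dat"  # hypothetical file name used by both tests

    shared.write_text("output of test A\n")  # test A writes its result
    shared.write_text("output of test B\n")  # test B overwrites it before A verifies
    seen_by_a = shared.read_text()           # test A reads the file back
    seen_by_b = shared.read_text()           # test B reads the file back

    return {
        "A": seen_by_a == "output of test A\n",  # A's verification result
        "B": seen_by_b == "output of test B\n",  # B's verification result
    }


with tempfile.TemporaryDirectory() as tmp:
    results = run_interleaved(Path(tmp))
    print(results)  # {'A': False, 'B': True}
```

Test A fails even though it is correct on its own, which is why the error disappears when the same test is rerun alone.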

This test suite contains such test programs. Fortran programs such as Fortran/0060/0060_0001.f90 are especially problematic. If an external unit specified in a WRITE statement is not connected, that is, it is neither preconnected to standard output and the like (units 0, 5, 6) nor connected by an OPEN statement, Flang (like many other Fortran runtimes) creates a file with a name like fort.1 in the current directory. This often causes a file name conflict.
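As a rough model of the naming rule described above, the following sketch maps a unit number to the file an unconnected unit would create. The function name and the `opened_units` parameter are hypothetical, and the model is simplified: it only covers the cases mentioned in this issue and is not a description of the Flang runtime's implementation.

```python
# Units treated as preconnected in this issue.
PRECONNECTED_UNITS = {0, 5, 6}


def implicit_file_name(unit, opened_units=frozenset()):
    """Return the file an unconnected unit would create, or None.

    `opened_units` is the set of units the program connected with OPEN.
    """
    if unit in PRECONNECTED_UNITS or unit in opened_units:
        return None  # already connected, no implicit file is created
    return f"fort.{unit}"


# Two unrelated programs that both WRITE to unit 1 end up with the same file.
print(implicit_file_name(1))       # fort.1
print(implicit_file_name(6))       # None
print(implicit_file_name(1, {1}))  # None
```

Because the name depends only on the unit number, any two programs in the same working directory that write to the same unconnected unit collide.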

To avoid this error, this test suite has a lit setting that prevents test programs in the same directory from running in parallel. See Fortran/0060/lit.local.cfg for an example. We have put this setting in all the suspicious directories. However, there may be other directories with this problem. In that case, you may see non-reproducible errors in the lit results.
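One way such a setting can be expressed is with lit's parallelism groups. The sketch below is an assumption about how a per-directory setting might look, not a copy of Fortran/0060/lit.local.cfg; the helper name and the group name "Fortran-0060" are hypothetical. lit provides the `config` and `lit_config` objects when it loads a lit.local.cfg.

```python
def serialize_tests_in_directory(config, lit_config, group_name):
    """Allow at most one test governed by this config to run at a time."""
    # Cap the named group at one concurrently running test.
    lit_config.parallelism_groups[group_name] = 1
    # Assign every test covered by this config to that group.
    config.parallelism_group = group_name


# In a lit.local.cfg the call would look like this (group name is hypothetical):
#     serialize_tests_in_directory(config, lit_config, "Fortran-0060")
```

Using a distinct group name per directory keeps tests in different directories running in parallel with each other while serializing the tests that share a working directory.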

Workaround

You can work around this issue by running the tests serially (lit -j1). However, this increases test execution time.

Future direction

We want to put the lit setting in all the problematic directories. However, we do not know how to find all of them reliably. Modifying all the problematic tests is not realistic.
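As a starting point for narrowing down candidates, a static scan could flag directories in which two or more Fortran source files use the same literal unit number in READ or WRITE statements. This is only a heuristic sketch under several assumptions: the function name, regular expression, and suffix list are hypothetical, it does not check whether the unit is connected by OPEN, and it misses units given as variables as well as conflicts on explicitly named files.

```python
import re
from collections import defaultdict
from pathlib import Path

PRECONNECTED_UNITS = {0, 5, 6}
# Matches WRITE(1, ...), read (unit=10, ...) and similar with a literal unit.
IO_UNIT = re.compile(
    r"\b(?:write|read)\s*\(\s*(?:unit\s*=\s*)?(\d+)\b", re.IGNORECASE
)
FORTRAN_SUFFIXES = {".f", ".f90", ".f95", ".f03", ".f08"}


def suspicious_directories(root):
    """Map each directory to the units used by two or more of its sources."""
    users = defaultdict(lambda: defaultdict(set))  # dir -> unit -> file names
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in FORTRAN_SUFFIXES:
            continue
        text = path.read_text(errors="replace")
        for match in IO_UNIT.finditer(text):
            unit = int(match.group(1))
            if unit not in PRECONNECTED_UNITS:
                users[path.parent][unit].add(path.name)
    # Keep only units shared by at least two files in the same directory.
    return {
        directory: sorted(u for u, files in units.items() if len(files) > 1)
        for directory, units in users.items()
        if any(len(files) > 1 for files in units.values())
    }
```

Any directory reported this way would still need manual confirmation, since the scan can produce both false positives and false negatives.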

If you find a problematic directory, please let us know in a comment on this issue.