'lit' should execute everything "internally"

llvmbot commented 15 years ago


Bugzilla Link	5241
Version	trunk
OS	All
Reporter	LLVM Bugzilla Contributor
CC	@modocache,@rnk

Extended Description

'lit' currently has code for executing tests "internally", but we currently execute tests using shell for performance/scalability reasons.

We should find a good way to execute things internally and still get good performance. Ideally this would also solve other related problems like improving the behavior of lit when it gets a SIGINT, implementing timeouts for tests, and maybe others.

Some possible implementation strategies are:

Use the python multiprocess module. I'm skeptical about its robustness, though.
Implement some components in C++. This is painful to get right and be cross-platform, but it would at least motivate getting nice sophisticated LLVMSystem process (and thread, and pipe) support. OTOH this would be very robust, and interact well with the GIL and signals.
Wait for Unladen Swallow to kill the GIL? :)

Notice how no option is particularly compelling...

rnk commented 8 years ago

I recently spent some time looking at uses of 'REQUIRES: shell', and the two main things that stand out are:

Environment variable use, like env LD_LIBRARY_PATH="mything:$LD_LIBRARY_PATH". I think we'll need to do some simple environment variable substitution at some point to deal with this.
Globs, like 'cat crashfile-*.cpp' when the output path is unpredictable. I think we can probably just implement this in lit.
People use REQUIRES shell to disable tests that are flaky on Windows due to weird NTFS problems. We should migrate these to can-remove-opened-file or something, even though that doesn't quite capture the issue.

rnk commented 8 years ago

Pasting some more relevant info from the duped bug: (This task was copied from https://reviews.llvm.org/diffusion/L/browse/llvm/tags/RELEASE_390/final/utils/lit/TODO. It was originally written by Daniel Dunbar.)

Currently, the ShTest format uses tests written with shell-script like syntax, and executes them in one of two ways. The first way is by converting them into a bash script and literally executing externally them using bash. The second way is through the use of an internal shell parser and shell execution code (built on the subprocess module). The external execution mode is used on most Unix systems that have bash, the internal execution mode is used on Windows.

Having two ways to do the same thing is error prone and leads to unnecessary complexity in the testing environment. Additionally, because the mode that converts scripts to bash doesn't try and validate the syntax, it is possible to write tests that use bash shell features unsupported by the internal shell. Such tests won't work on Windows but this may not be obvious to the developer writing the test.

Another limitation is that when executing the scripts externally, the ShTest format has no idea which commands fail, or what output comes from which commands, so this limits how convenient the output of ShTest failures can be and limits other features (for example, knowing what temporary files were written).

We should eliminate having two ways of executing the same tests to reduce platform differences and make it easier to develop new features in the ShTest module. This is currently blocked on:

The external execution mode is faster in some situations, because it avoids being bottlenecked on the GIL. This can hopefully be obviated simply by using --use-processes.
Some tests in LLVM/Clang are explicitly disabled with the internal shell (because they use features specific to bash). We would need to rewrite these tests, or add additional features to the internal shell handling to allow them to pass.

rnk commented 8 years ago

Bug llvm/llvm-bugzilla-archive#30667 has been marked as a duplicate of this bug.

llvmbot commented 15 years ago

Note that one reasonably viable option is a variant of 1., where we hand roll a pure Python multiprocessing based solution.

We can also make #2 substantially simpler by committing to using temporary files between test commands, instead of pipes, which isn't unreasonable since it might be nice to have them around for showing to users in the case of failures, and should have no negative performance impact.

llvm / llvm-project

'lit' should execute everything "internally" #5613

Extended Description