Stressful Application Test - userspace memory and IO test
Apache License 2.0

stressapptest

Stressful Application Test (or stressapptest, its Unix name) is a memory interface test. It tries to maximize randomized traffic to memory from the processor and I/O, with the intent of creating a realistic high-load situation in order to test the existing hardware devices in a computer. It has been used at Google for some time and is now available under the Apache 2.0 license.

(Exported from code.google.com/p/stressapptest)

Discussion group: https://groups.google.com/d/forum/stressapptest-discuss

Usage

A typical invocation looks like this:

./stressapptest -s 20 -M 256 -m 8 -W    # Test 256MB, running 8 "warm copy" threads. Exit after 20 seconds.
./stressapptest --help                  # list the available arguments.

Common arguments

Error handling

./stressapptest -s 20 -M 256 -m 8 -C 8 -W # Allocate 256MB of memory and run 8 "warm copy" threads, and 8 cpu load threads. Exit after 20 seconds.
./stressapptest -f /tmp/file1 -f /tmp/file2 # Run 2 file IO threads, and autodetect memory size and core count to select allocated memory and memory copy threads.

Installation

stressapptest is packaged by many Linux distributions and can be installed with the distribution's package manager:

sudo apt-get install stressapptest
sudo emerge stressapptest
sudo yum install stressapptest
sudo zypper install stressapptest

The build and installation process follows the GNU conventions. To download the latest source, build it, and install it:

git clone https://github.com/stressapptest/stressapptest.git
cd stressapptest
./configure
make
sudo make install

After these steps, stressapptest should be installed. The configure script was generated by autoconf and automake, so it accepts the most common configure options.

Objective

Stressful Application Test (or stressapptest) tries to maximize randomized traffic to memory from the processor and I/O, with the intent of creating a realistic high-load situation.

stressapptest may be used for various purposes:

Background

Many hardware issues reproduce infrequently, or only under corner cases. The theory here is that maximizing bus and memory traffic increases the number of transactions, and therefore the probability that at least one transaction fails during the test.
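
The reasoning above can be made concrete with a small calculation. This is only an illustrative sketch: it assumes each transaction fails independently with the same probability, which real hardware faults need not do, and the failure rate used is a made-up number, not a measurement.

```python
import math


def failure_probability(per_transaction_rate: float, transactions: int) -> float:
    """Chance of seeing at least one failure in `transactions` attempts.

    Assumes independent transactions with an identical failure rate
    (a simplification chosen for illustration).
    """
    if per_transaction_rate >= 1.0:
        return 1.0
    # 1 - (1 - p)^n, computed with log1p/expm1 so tiny rates keep precision.
    return -math.expm1(transactions * math.log1p(-per_transaction_rate))


if __name__ == "__main__":
    rate = 1e-12  # hypothetical per-transaction failure rate
    for n in (10**9, 10**11, 10**13):
        print(f"{n:.0e} transactions: {failure_probability(rate, n):.6f}")
```

Under these assumptions, a fault that is almost invisible over a billion transactions becomes very likely to appear once the transaction count is several orders of magnitude higher.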

Overview

stressapptest is a userspace test, primarily composed of threads doing memory copies and direct I/O disk reads and writes. It allocates a large block of memory (typically 85% of the total memory on the machine), and each thread chooses randomized blocks of memory to copy or to write to disk. Typically there are two threads per processor and two threads for each disk. Results are checked as the test proceeds by CRCing the data as it is copied.
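
The idea of checking data while it is being copied can be sketched as follows. This is not the tool's actual code: `copy_with_crc`, the chunk size, and the fill pattern are hypothetical names and values chosen for illustration, and `zlib.crc32` stands in for whatever checksum the real implementation uses.

```python
import zlib

CHUNK = 4096  # illustrative chunk size, not taken from stressapptest


def copy_with_crc(src: bytearray, dst: bytearray, chunk: int = CHUNK) -> int:
    """Copy src into dst chunk by chunk and return the CRC32 of the data read."""
    if len(src) != len(dst):
        raise ValueError("source and destination must be the same size")
    crc = 0
    for offset in range(0, len(src), chunk):
        piece = src[offset:offset + chunk]
        crc = zlib.crc32(piece, crc)             # fold this chunk into the running CRC
        dst[offset:offset + len(piece)] = piece  # then store it at the destination
    return crc


pattern = bytes([0x55, 0xAA]) * 8192    # hypothetical fill pattern
expected = zlib.crc32(pattern)          # checksum recorded when the block was filled

src = bytearray(pattern)
dst = bytearray(len(src))
copy_ok = copy_with_crc(src, dst) == expected
```

Because the checksum is accumulated during the copy, a corrupted source block is noticed at copy time rather than only at the end of the run.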

Detailed Design

The code is structured fairly simply:

A large amount of memory is allocated in a single block (the default is 85% of physical memory size). Memory is divided into chunks, each filled with a potentially stressful data pattern. Worker threads are spawned, which draw pages from an "empty" queue and a "valid" queue, and copy the data from one block to the other. Some threads copy the data with a memory copy. Some threads invert the data in place. Some threads write the data to disk and read it back into the new location. After the specified time has elapsed, all "valid" pages have their data compared with the original fill pattern.
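
The queue-based design described above can be modeled in a few lines. This is a toy model under stated assumptions, not the real implementation: all names and sizes are hypothetical, only memory-copy workers are shown (the invert and disk workers are omitted), and the workers run for a fixed number of copies instead of a time limit so that the result is repeatable.

```python
import queue
import threading

BLOCK_SIZE = 4096   # illustrative sizes, far smaller than a real run
NUM_BLOCKS = 16


def run_copy_test(num_workers: int = 4, copies_per_worker: int = 200):
    """Return (blocks_checked, miscompares) after a burst of block copies."""
    # Each worker holds one valid and one empty block at a time, so both
    # queues must start with at least num_workers entries to avoid stalls.
    assert num_workers <= NUM_BLOCKS // 2
    pattern = bytes([0x55, 0xAA]) * (BLOCK_SIZE // 2)
    memory = [bytearray(BLOCK_SIZE) for _ in range(NUM_BLOCKS)]
    empty, valid = queue.Queue(), queue.Queue()

    # Fill half of the blocks with the pattern; the rest start out empty.
    for index in range(NUM_BLOCKS):
        if index % 2 == 0:
            memory[index][:] = pattern
            valid.put(index)
        else:
            empty.put(index)

    def copy_worker():
        for _ in range(copies_per_worker):
            src = valid.get()
            dst = empty.get()
            memory[dst][:] = memory[src]          # copy valid data into the empty block
            memory[src][:] = bytes(BLOCK_SIZE)    # the source block is now empty
            valid.put(dst)
            empty.put(src)

    threads = [threading.Thread(target=copy_worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Final pass: every block still marked valid must match the fill pattern.
    checked = miscompares = 0
    while not valid.empty():
        index = valid.get()
        checked += 1
        if memory[index] != pattern:
            miscompares += 1
    return checked, miscompares
```

On healthy hardware a call such as `run_copy_test()` should report zero miscompares. In this model, a nonzero count would mean that a block's contents changed somewhere between being filled and the final comparison.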

Caveats

This test works by stressing system interfaces. It is good at catching memory signal integrity or setup and hold problems, memory controller and bus interface issues, and disk controller issues. It is moderately good at catching bad memory cells and cache coherency issues. It is not good at catching bad processors, bad physical media on disks, or problems that require periods of inactivity to manifest themselves. It is not a thorough test of OS internals. The test may cause marginal systems to become bricks if disk or memory errors cause hard drive corruption, or if the physical components overheat.

Security Considerations

Someone running stressapptest on a live system could cause other applications to become extremely slow or unresponsive.

Logged information

stressapptest can output a logfile of miscompares detected during its execution. stressapptest cannot yet log reboot failures, or other failures not visible to user space.