andre-richter / pcie-lat

Generic x86_64 PCIe latency measurement module for the Linux kernel
GNU General Public License v2.0

pcie-lat

A generic x86_64 PCIe latency measurement module for the Linux Kernel.

PCIe latencies for a device are measured via the execution time (via x86 Time Stamp Counter ticks) of reading a 32 Bit word from a PCIe device (aka round-trip time CPU->PCIe Device->CPU).
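As a rough illustration of how TSC ticks relate to wall-clock time, the sketch below converts a cycle count to nanoseconds. The helper name `cycles_to_ns` is hypothetical and not part of this repository; the frequency and cycle count are taken from the sample run shown in the Usage section.

```python
def cycles_to_ns(cycles, tsc_hz):
    """Convert a TSC tick count to nanoseconds for a given TSC frequency in Hz.

    Hypothetical helper for illustration; not part of pcie-lat.
    """
    if tsc_hz <= 0:
        raise ValueError("tsc_hz must be positive")
    return cycles / tsc_hz * 1e9


# Numbers from the sample run in the Usage section:
# mean of 3628.02 cycles at a TSC frequency of 2294470000.0 Hz.
mean_ns = cycles_to_ns(3628.02, 2294470000.0)
print(f"{mean_ns:.2f} ns")
```

The conversion only holds if the TSC ticks at a constant rate during the measurement.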

Users can specify the BAR to read from, the offset within that BAR, and the number of measurement loops.

Important

Disable CPU features that change the CPU clock (e.g. SpeedStep, Turbo Boost) to minimize variance in the results.

Usage

Tested on Ubuntu 14.04 and 16.04. First, become root:

sudo su

If the device you want to measure is currently bound to a driver, release it:

echo 0000:08:10.0 > /sys/bus/pci/devices/0000\:08\:10.0/driver/unbind

For the following steps, you need the vendor and device IDs of the PCIe device you want to measure. If you don't know them yet, look them up via lspci:

lspci -nn -s 08:10.0

> 08:10.0 Ethernet controller [0200]: Intel Corporation 82576 Virtual Function [8086:10ca] (rev 01)
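If you want to script this lookup, the IDs can be pulled out of the lspci line with a regular expression. This is a minimal sketch; the function name `parse_lspci_ids` is hypothetical, and it assumes the one-line `lspci -nn` format shown above, where the last bracketed `xxxx:xxxx` pair is vendor:device.

```python
import re

# A bracketed pair of 4-digit hex numbers, e.g. "[8086:10ca]".
# The class code "[0200]" has no colon and therefore does not match.
_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def parse_lspci_ids(line):
    """Return (vendor, device) as lowercase hex strings from one `lspci -nn` line."""
    matches = _ID_RE.findall(line)
    if not matches:
        raise ValueError("no [vendor:device] pair found in: " + line)
    vendor, device = matches[-1]
    return vendor.lower(), device.lower()


line = ("08:10.0 Ethernet controller [0200]: Intel Corporation "
        "82576 Virtual Function [8086:10ca] (rev 01)")
vendor, device = parse_lspci_ids(line)
print(f"ids={vendor}:{device}")
```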

Build the kernel module and insert it. Pass the IDs to insmod as vendor:device via the ids argument:

make
insmod ./pcie-lat.ko ids=8086:10ca

If you want to add additional devices later on, you can do so via sysfs:

echo "10ee 7014"  > /sys/bus/pci/drivers/pcie_lat/new_id
echo 0000:20:00.0 > /sys/bus/pci/drivers/pcie_lat/bind
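Note that insmod takes the IDs separated by a colon, while new_id takes them separated by a space. A small sketch of that conversion, in case you script the sysfs step (the helper `new_id_entry` is hypothetical; it only builds the string and does not write to sysfs):

```python
NEW_ID_PATH = "/sys/bus/pci/drivers/pcie_lat/new_id"


def new_id_entry(ids):
    """Turn "vendor:device" (insmod style) into "vendor device" (new_id style)."""
    vendor, device = ids.split(":")  # ValueError unless exactly two fields
    int(vendor, 16)                  # ValueError if a field is not hexadecimal
    int(device, 16)
    return f"{vendor} {device}"


entry = new_id_entry("10ee:7014")
# Printing instead of writing, since the actual write requires root.
print(f'echo "{entry}" > {NEW_ID_PATH}')
```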

Run the measurements with the supplied Ruby script. The only mandatory argument is the BDF of the PCIe device; offset, BAR and loop count are optional:

ruby measure.rb -p 08:10.0 -l 1000000 -b 0 -o 0x0

> TSC freq:     2294470000.0 Hz
> TSC overhead: 52 cycles
> Device:       08:10.0
> BAR:          0
> Offset:       0x0
> Loops:        1000000
>
>        | Results (1000000 samples)
> ------------------------------------------------------
> Mean   |   3628.02 cycles |   1581.20 ns
> Stdd   |     30.69 cycles |     13.37 ns
>
>
>        | 3σ Results (995274 samples, 0.005% discarded)
> ------------------------------------------------------
> Mean   |   3627.52 cycles |   1580.99 ns
> Stdd   |     27.69 cycles |     12.07 ns
>
> writing 3σ values (in ns) to file...
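The 3σ results above come from discarding samples that lie far from the mean. The sketch below shows one common way to do this: a single pass that keeps samples within three population standard deviations of the mean. It is an assumption that measure.rb works exactly like this; the function name `three_sigma_filter` and the sample data are made up for illustration.

```python
import statistics


def three_sigma_filter(samples):
    """Keep only samples within mean ± 3 standard deviations (single pass)."""
    mean = statistics.mean(samples)
    stdd = statistics.pstdev(samples)  # population standard deviation
    lo, hi = mean - 3 * stdd, mean + 3 * stdd
    return [s for s in samples if lo <= s <= hi]


# Made-up data: 19 typical readings and one large outlier (in cycles).
samples = [100] * 19 + [1000]
kept = three_sigma_filter(samples)
print(f"{len(kept)} of {len(samples)} samples kept")
```

With very few samples a single outlier inflates the standard deviation enough that it can never fall outside 3σ, so the filter is only meaningful for large loop counts.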

Visualization

The script saves the 3σ values of the measurement run to a CSV file, e.g. lat_1000000_loops_3sigma.csv. You can generate a histogram from it with the NumPy/matplotlib script hist.py:

python hist.py lat_1000000_loops_3sigma.csv
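If matplotlib is not available, a quick text histogram gives a similar overview. This is a standard-library sketch and not what hist.py does; the function names and the 10 ns bin width are arbitrary choices, and the values are passed in as a plain list because the exact CSV layout is not described here.

```python
from collections import Counter


def bin_latencies(values_ns, bin_width_ns=10.0):
    """Count samples per fixed-width bin; keys are each bin's lower edge in ns."""
    counts = Counter()
    for v in values_ns:
        counts[(v // bin_width_ns) * bin_width_ns] += 1
    return dict(sorted(counts.items()))


def print_histogram(bins, max_bar=40):
    """Print one row per bin, with bars scaled to the fullest bin."""
    peak = max(bins.values())
    for edge, count in bins.items():
        bar = "#" * max(1, round(count / peak * max_bar))
        print(f"{edge:8.0f} ns | {bar} {count}")


# Made-up latencies in ns, for illustration only.
bins = bin_latencies([1571.0, 1580.9, 1581.2, 1593.5])
print_histogram(bins)
```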

Example Output:

Screenshot

All in one Python script

Alternatively, all_in_one.py automates the whole procedure: it performs all of the configuration, measurement and visualization steps above. Note that the final output graph plots all latency distributions with a logarithmic Y axis. This script must also be run as root:

python3 all_in_one.py 00:1f.4 1000000

Remarks

Credits

License

Copyright (C) 2014 by the author(s)

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.