intel / lmbench

GNU General Public License v2.0

lmbench takes too long for large nvme disks #6

Open AnandBibhuti opened 5 years ago

AnandBibhuti commented 5 years ago

I obtained the lmbench source from https://github.com/intel/lmbench/, compiled it, and ran the resulting binary.

The following command was used to run lmbench:

lmbench

The config file content is as follows:

DISKS=""
DISK_DESC=""
OUTPUT="/dev/tty"
ENOUGH=5000
FASTMEM="NO"
FILE="/usr/tmp/XXX"
FSDIR="/usr/tmp"
INFO=INFO.myserver.com
LINE_SIZE=128
LOOP_O=0.00000030
MAIL=no
TOTAL_MEM=509856.46875
MB=407885
MHZ="1494 MHz, 0.6693 nanosec clock "
MOTHERBOARD=
NETWORKS=
OS=x86_64-Linux
PROCESSORS=40
REMOTE=
SLOWFS="NO"
SYNC_MAX="1"
LMBENCH_SCHED="DEFAULT"
TIMING_O=0
RSH=rsh
RCP=rcp
VERSION=3.0-20100921

BENCHMARK_HARDWARE=NO
BENCHMARK_OS=NO
BENCHMARK_SYSCALL=NO
BENCHMARK_SELECT=NO
BENCHMARK_SIG=NO
BENCHMARK_PROC=NO
BENCHMARK_CTX=NO
BENCHMARK_PAGEFAULT=NO
BENCHMARK_FILE=NO
BENCHMARK_MMAP=NO
BENCHMARK_PIPE=NO
BENCHMARK_UNIX=NO
BENCHMARK_UDP=NO
BENCHMARK_TCP=NO
BENCHMARK_CONNECT=NO
BENCHMARK_RPC=NO
BENCHMARK_HTTP=NO
BENCHMARK_BCOPY=NO
BENCHMARK_MEM=NO
BENCHMARK_OPS=NO
DISKS=/dev/nvme0n1p2
DISK_DESC="none"


With large NVMe disks (2 TB or more), lmbench sometimes runs for hours, occasionally for more than a day, and appears stuck at the "Calculating disk zone bw & seek times" stage of the output. This unusually long completion time occurs only with large (multi-terabyte) NVMe disks; it is not seen with non-NVMe disks.

AnandBibhuti commented 4 years ago

I pulled the latest source, which was updated recently, and ran the disk binary against a 5.8T disk.

[root@localhost ]# lsblk | grep nvme0n1
nvme0n1 259:0 0 5.8T 0 disk

[root@localhost SOURCES]# disk /dev/nvme0n1

The above command does not complete even after hours. The result file grows very large, and calculating the disk zone bandwidth takes too long.
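One plausible explanation for this symptom, consistent with the diff later in this comment, is that the sampling stride is derived from the disk size divided by a fixed number of zone points and then stored in a 32-bit int. The sketch below is not lmbench code: the value of ZONEPOINTS, the helper names, and the explicit 32-bit truncation are assumptions made for illustration, and the block-size adjustment visible in the diff context is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: number of sample points per disk; not stated in this thread. */
#define ZONEPOINTS 150

/* Clamp to at least 512 bytes, then round up to a multiple of 512. */
static uint64_t round_stride(int64_t raw)
{
    uint64_t s = (raw < 512) ? 512u : (uint64_t)raw;
    s += 511;
    s >>= 9;
    s <<= 9;
    return s;
}

/* Stride kept in 64 bits throughout. */
static uint64_t stride_wide(uint64_t disk_bytes)
{
    return round_stride((int64_t)(disk_bytes / ZONEPOINTS));
}

/*
 * Stride forced through a 32-bit int first, mimicking `int stride`.
 * Converting an out-of-range value is implementation-defined; common
 * compilers wrap it, which can even produce a negative number.
 */
static uint64_t stride_narrow(uint64_t disk_bytes)
{
    int32_t truncated = (int32_t)(disk_bytes / ZONEPOINTS);
    return round_stride(truncated);
}

/* Number of reads needed to walk the disk at a given stride. */
static uint64_t sample_points(uint64_t disk_bytes, uint64_t stride)
{
    assert(stride > 0);
    return disk_bytes / stride;
}
```

For a disk of roughly 5.8T (for example 6377167441100 bytes), `sample_points(size, stride_wide(size))` stays at or below ZONEPOINTS, while `sample_points(size, stride_narrow(size))` comes out far larger. Under these assumptions, that would account for both the long run time and the oversized result file.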

After applying the code changes in the diff below, disk works fine for all disk sizes.

[root@localhost lmbench-master]# diff -Nrup orig_disk.c src/disk.c
--- orig_disk.c 2019-10-16 02:45:06.193140852 -0400
+++ src/disk.c  2019-10-16 04:49:34.824774418 -0400
@@ -49,7 +49,7 @@ zone(char *disk, int oflag, int bsize)
        int     n;
        int     fd;
        uint64  off;
-       int     stride;
+       uint64  stride;

        if ((fd = open(disk, oflag)) == -1) {
                perror(disk);
@@ -88,8 +88,8 @@ zone(char *disk, int oflag, int bsize)
        if (bsize > stride) stride = bsize;

        off *= ZONEPOINTS;
-       debug((stdout, "stride=%d bs=%d size=%dM points=%d\n",
-           stride, bsize, (int)(off >> 20), (int)(off/stride)));
+       debug((stdout, "stride=%llu bs=%d size=%lluM points=%llu\n",
+           (unsigned long long)stride, bsize, (unsigned long long)(off >> 20), (unsigned long long)(off/stride)));

        /*
         * Read buf's worth of data every stride and time it.
@@ -142,12 +142,12 @@ seek(char *disk, int oflag)
 {
        char    *buf;
        int     fd;
-       off64_t size;
-       off64_t begin, end;
+       uint64  size;
+       uint64  begin, end;
        int     usecs;
        int     error;
        int     tot_msec = 0, tot_io = 0;
-       int     stride;
+       uint64  stride;

        if ((fd = open(disk, oflag)) == -1) {
                perror(disk);
@@ -174,8 +174,8 @@ seek(char *disk, int oflag)
        stride >>= 9;
        stride <<= 9;

-       debug((stdout, "stride=%d size=%dM points=%d\n",
-           stride, (int)(size >> 20), (int)(size/stride)));
+       debug((stdout, "stride=%llu size=%lluM points=%llu\n",
+           (unsigned long long)stride, (unsigned long long)(size >> 20), (unsigned long long)(size/stride)));

        end = size;
        begin = 0;
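
The debug calls touched by this diff print 64-bit quantities, so the printf conversion has to match the argument width; a plain `%d` or `%u` expects a 32-bit int on common platforms. The following is a minimal sketch of one portable way to do that with the `PRIu64` macros from `<inttypes.h>`. The helper name `format_seek_debug` is hypothetical and does not exist in lmbench, and it assumes lmbench's uint64 is a 64-bit unsigned type comparable to `uint64_t`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of lmbench): formats a debug line like the
 * one in seek() into buf, using conversions that match 64-bit arguments.
 * Returns the value of snprintf().
 */
static int format_seek_debug(char *buf, size_t len,
                             uint64_t stride, uint64_t size)
{
    assert(stride > 0);  /* avoid dividing by zero below */
    return snprintf(buf, len,
                    "stride=%" PRIu64 " size=%" PRIu64 "M points=%" PRIu64 "\n",
                    stride, size >> 20, size / stride);
}
```

Casting each argument to `unsigned long long` and using `%llu` is an equivalent choice when `<inttypes.h>` is not available.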