yuichiro-naito / OpenBSD-src

Public git conversion mirror of OpenBSD's official CVS src repository. Pull requests not accepted - send diffs to the tech@ mailing list.
https://www.openbsd.org

Poor System call performance of OpenBSD-6.6 #2

Open yuichiro-naito opened 4 years ago

yuichiro-naito commented 4 years ago

UnixBench-5.1.3 shows OpenBSD-6.6's syscall performance as follows.

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: openbsd.local: OpenBSD
   OS: OpenBSD -- 6.6 -- GENERIC.MP#372
   Machine: amd64 (unknown)
   Language: en_US.utf8 (charmap=, collate=)
   CPU: no details available
   5:42PM  up  1:42, 1 user, load averages: 0.32, 0.99, 0.67; runlevel 

------------------------------------------------------------------------
Benchmark Run: Tue Dec 10 2019 17:42:04 - 17:44:40
4 CPUs in system; running 1 parallel copy of tests

System Call Overhead                         287203.8 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0     287203.8    191.5
                                                                   ========
System Benchmarks Index Score (Partial Only)                          191.5

------------------------------------------------------------------------
Benchmark Run: Tue Dec 10 2019 17:44:40 - 17:47:17
4 CPUs in system; running 4 parallel copies of tests

System Call Overhead                         925148.8 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0     925148.8    616.8
                                                                   ========
System Benchmarks Index Score (Partial Only)                          616.8

I have two questions about this result.

  1. Why is parallel performance 3 times better than single-copy performance, even though the OpenBSD kernel has a giant lock that limits parallel execution in the kernel?

  2. Why is system call performance worse than before? If the cause is the mitigations for speculative execution vulnerabilities, I can speed up the syscall benchmark by removing the mitigations.

For comparison, here is the OpenBSD-6.1 result.

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: openbsd.local: OpenBSD
   OS: OpenBSD -- 6.1 -- GENERIC.MP#20
   Machine: amd64 (unknown)
   Language: en_US.utf8 (charmap=, collate=)
   CPU: no details available
   1:42PM  up 37 mins, 2 users, load averages: 1.35, 1.30, 1.15; runlevel 

------------------------------------------------------------------------
Benchmark Run: Mon Dec 16 2019 13:42:52 - 13:45:02
8 CPUs in system; running 1 parallel copy of tests

System Call Overhead                        1844933.2 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0    1844933.2   1230.0
                                                                   ========
System Benchmarks Index Score (Partial Only)                         1230.0

------------------------------------------------------------------------
Benchmark Run: Mon Dec 16 2019 13:45:02 - 13:47:13
8 CPUs in system; running 8 parallel copies of tests

System Call Overhead                        1908700.1 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0    1908700.1   1272.5
                                                                   ========
System Benchmarks Index Score (Partial Only)                         1272.5
yuichiro-naito commented 4 years ago

I removed the mitigations for speculative execution vulnerabilities in #1, and the UnixBench syscall test speeds up as follows. It is still slower than OpenBSD-6.1, though.

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: openbsd.local: OpenBSD
   OS: OpenBSD -- 6.6 -- GENERIC.MP#5
   Machine: amd64 (unknown)
   Language: en_US.utf8 (charmap=, collate=)
   CPU: no details available
   12:35PM  59 secs, 1 user, load averages: 0.05, 0.01, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Mon Dec 23 2019 12:35:40 - 12:37:52
4 CPUs in system; running 1 parallel copy of tests

System Call Overhead                        1207207.4 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0    1207207.4    804.8
                                                                   ========
System Benchmarks Index Score (Partial Only)                          804.8

------------------------------------------------------------------------
Benchmark Run: Mon Dec 23 2019 12:37:52 - 12:40:02
4 CPUs in system; running 4 parallel copies of tests

System Call Overhead                        1341079.5 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0    1341079.5    894.1
                                                                   ========
System Benchmarks Index Score (Partial Only)                          894.1
yuichiro-naito commented 4 years ago

Benchmark machine spec:

yuichiro-naito commented 4 years ago

I built a single-processor kernel from #1. The syscall benchmark now runs as fast as on OpenBSD-6.1. It seems the poor performance is caused not only by the mitigations for speculative execution vulnerabilities but also by something in the multiprocessor kernel.

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: openbsd.local: OpenBSD
   OS: OpenBSD -- 6.6 -- GENERIC#0
   Machine: amd64 (unknown)
   Language: en_US.utf8 (charmap=, collate=)
   CPU: no details available
   1:00PM  59 secs, 1 user, load averages: 0.00, 0.00, 0.00; runlevel

------------------------------------------------------------------------
Benchmark Run: Tue Dec 24 2019 13:00:38 - 13:02:49
1 CPU in system; running 1 parallel copy of tests

System Call Overhead                        1894822.4 lps   (10.0 s, 7 samples)

System Benchmarks Partial Index              BASELINE       RESULT    INDEX
System Call Overhead                          15000.0    1894822.4   1263.2
                                                                   ========
System Benchmarks Index Score (Partial Only)                         1263.2
yuichiro-naito commented 4 years ago

The syscall benchmark loops over the following code.

           while (1) {
                close(dup(0));
                getpid();
                getuid();
                umask(022);
                iter++;
           }

The number of userland instructions is very small. Most of the execution happens in the kernel, where it is serialized by the giant lock.

krytarowski commented 4 years ago

Do you insist on OpenBSD?

yasuoka commented 4 years ago

> Do you insist on OpenBSD?

Yes, I emailed the developers' mailing list about this.