golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.95k stars 17.66k forks source link

net/http: continue to reduce server CPU and allocations #5195

Closed bradfitz closed 8 years ago

bradfitz commented 11 years ago
Tracking bug for continuing to improve CPU and memory allocations in the net/http server.

I was just starting to have fun when it was time to ship Go 1.1.

Here's where we're at:

~/go/src/pkg/net/http$ hg id
8153882a36e8 tip

~/go/src/pkg/net/http$ go test -c

~/go/src/pkg/net/http$ ./http.test -test.v -test.run=none -test.bench=ServerHandler
-test.cpuprofile=prof.cpu -test.benchtime=4s
PASS
BenchmarkServerHandlerTypeLen    1000000         11098 ns/op        1878 B/op         19 allocs/op
BenchmarkServerHandlerNoLen  1000000         10249 ns/op        1843 B/op         17 allocs/op
BenchmarkServerHandlerNoType     1000000         10620 ns/op        1844 B/op         17 allocs/op
BenchmarkServerHandlerNoHeader   1000000          8674 ns/op        1051 B/op         12 allocs/op

4 CPUs:
processor   : 3
vendor_id   : GenuineIntel
cpu family  : 6
model       : 37
model name  : Intel(R) Core(TM) i5 CPU       M 560  @ 2.67GHz
stepping    : 5
microcode   : 0x2
cpu MHz     : 2667.000
cache size  : 3072 KB

(pprof) top25
Total: 3883 samples
     319   8.2%   8.2%      319   8.2% runtime.memmove
     133   3.4%  11.6%      171   4.4% time.nextStdChunk
     113   2.9%  14.6%      415  10.7% time.Time.Format
      93   2.4%  16.9%     1130  29.1% net/http.(*chunkWriter).writeHeader
      87   2.2%  19.2%      616  15.9% runtime.mallocgc
      86   2.2%  21.4%      111   2.9% sweepspan
      82   2.1%  23.5%      114   2.9% scanblock
      81   2.1%  25.6%      213   5.5% runtime.MCache_Alloc
      81   2.1%  27.7%       87   2.2% runtime.settype_flush
      78   2.0%  29.7%       78   2.0% net/textproto.CanonicalMIMEHeaderKey
      72   1.9%  31.5%       72   1.9% net/textproto.canonicalMIMEHeaderKey
      72   1.9%  33.4%       72   1.9% runtime.markallocated
      69   1.8%  35.2%       80   2.1% itab
      68   1.8%  36.9%       68   1.8% runtime.xchg
      58   1.5%  38.4%       58   1.5% runtime.memclr
      57   1.5%  39.9%      272   7.0% hash_insert
      56   1.4%  41.3%     1409  36.3% net/http.ReadRequest
      51   1.3%  42.6%       60   1.5% time.appendUint
      48   1.2%  43.9%      205   5.3% net/http.readTransfer
      47   1.2%  45.1%       47   1.2% runtime.appendstr
      42   1.1%  46.2%       42   1.1% bytes.IndexByte
      39   1.0%  47.2%      218   5.6% bufio.(*Writer).Write
      39   1.0%  48.2%       44   1.1% hash_init
      38   1.0%  49.2%       50   1.3% hash_remove
      37   1.0%  50.1%      209   5.4% runtime.copy

(pprof) top25 --cum
Total: 3883 samples
       0   0.0%   0.0%     3594  92.6% gosched0
      23   0.6%   0.6%     3593  92.5% net/http.(*conn).serve
      20   0.5%   1.1%     1593  41.0% net/http.(*conn).readRequest
      56   1.4%   2.5%     1409  36.3% net/http.ReadRequest
      13   0.3%   2.9%     1318  33.9% net/http.(*response).finishRequest
      20   0.5%   3.4%     1240  31.9% bufio.(*Writer).Flush
      10   0.3%   3.7%     1222  31.5% net/http.(*switchWriter).Write
      17   0.4%   4.1%     1207  31.1% net/http.(*chunkWriter).Write
      93   2.4%   6.5%     1130  29.1% net/http.(*chunkWriter).writeHeader
      13   0.3%   6.8%      627  16.1% net/http.serverHandler.ServeHTTP
      87   2.2%   9.1%      616  15.9% runtime.mallocgc
      11   0.3%   9.3%      607  15.6% net/http_test.func·112
       6   0.2%   9.5%      603  15.5% net/http.HandlerFunc.ServeHTTP
      27   0.7%  10.2%      514  13.2% net/textproto.(*Reader).ReadMIMEHeader
      24   0.6%  10.8%      431  11.1% net/http.(*response).Write
     113   2.9%  13.7%      415  10.7% time.Time.Format
      28   0.7%  14.4%      332   8.6% runtime.new
      16   0.4%  14.9%      331   8.5% runtime.mapassign
     319   8.2%  23.1%      319   8.2% runtime.memmove
      15   0.4%  23.5%      304   7.8% net/http.(*response).WriteHeader
      14   0.4%  23.8%      290   7.5% runtime.mapassign1
      57   1.5%  25.3%      272   7.0% hash_insert
       4   0.1%  25.4%      249   6.4% net/http_test.func·108
      34   0.9%  26.3%      243   6.3% net/http.Header.WriteSubset
       8   0.2%  26.5%      235   6.1% net/http.Header.clone

And an svg/png of CPU attached.

I'll attach a CPU one later with GOGC=off.

Attachments:

  1. http.svg (104237 bytes)
  2. http.png (709313 bytes)
bradfitz commented 11 years ago

Comment 1:

See issue #1642 (Set capacity of a slice) and https://golang.org/cl/8179043/ to
help ReadRequest.
bradfitz commented 11 years ago

Comment 2:

Patch attached to reduce some more. I'll send it out, broken up, after Go 1.1.

Attachments:

  1. less.patch (4350 bytes)
bradfitz commented 11 years ago

Comment 3:

Patch to address one of the TODOs under the "Life of a Write" comment, bypassing the
chunkWriter in some cases.  Profiling didn't show any/much improvement, though.  I
didn't finish investigating why not.  It should show a bit.

Attachments:

  1. skip-chunk-writer.diff (1571 bytes)
bradfitz commented 11 years ago

Comment 4:

For after Go 1.1: https://golang.org/cl/8478044 (part of issue #5192, but a
dependency here)
bradfitz commented 11 years ago

Comment 5:

Work-in-progress playing during bus rides is at https://golang.org/cl/8543043
Currently at:
BenchmarkServerHandlerNoHeader   1000000         10256 ns/op         791 B/op          8 allocs/op
bradfitz commented 11 years ago

Comment 6:

Labels changed: added garbage.

bradfitz commented 11 years ago

Comment 7:

This issue was updated by revision 0af302f50745b93e90a4507993a555d246acef4.

R=r
CC=gobot, golang-dev
https://golang.org/cl/8478044
bradfitz commented 11 years ago

Comment 8:

This issue was updated by revision d4cbc80d106a3f3b53631aa60b400c790b14bb5.

R=nigeltao
CC=golang-dev
https://golang.org/cl/9432046
bradfitz commented 11 years ago

Comment 9:

This issue was updated by revision 27f7427995782bf60195ca65fd9b44aa34913b7.

R=golang-dev, daniel.morsing
CC=golang-dev
https://golang.org/cl/9492044
bradfitz commented 11 years ago

Comment 10:

These are mostly all in now.
All that remains is: https://golang.org/cl/9584043 but the extra channel
operations end up not being worth the reduced GC time.
If a sync.Cache goes in (issue #4720), then 9584043 might be worth it.
bradfitz commented 11 years ago

Comment 11:

This issue was updated by revision 509a1173a33d4fd914409cd941470d440fd5eed.

R=r
CC=gobot, golang-dev
https://golang.org/cl/8478044
»»»
R=r
CC=golang-dev
https://golang.org/cl/9462049
rsc commented 10 years ago

Comment 12:

Labels changed: added go1.3maybe.

rsc commented 10 years ago

Comment 13:

Labels changed: added release-none, removed go1.3maybe.

rsc commented 10 years ago

Comment 14:

Labels changed: added repo-main.

odeke-em commented 8 years ago

@bradfitz I was just looking through this issue and decided to re-run your initial test, there was one crash

$ git log | head -n 1 && go version
commit c2ef005486fa05cbeddc74c7faac8f6a867703d4
go version devel +c2ef005 Thu Dec 10 04:43:08 2015 +0000 darwin/amd64
$ ./http.test -test.v -test.run=none -test.bench=ServerHandler -test.cpuprofile=prof.cpu -test.benchtime=4s
PASS
BenchmarkServerHandlerTypeLen-4      1000000          5549 ns/op        1858 B/op         17 allocs/op
BenchmarkServerHandlerNoLen-4        1000000          5385 ns/op        1826 B/op         14 allocs/op
BenchmarkServerHandlerNoType-4       1000000          5148 ns/op        1826 B/op         15 allocs/op
BenchmarkServerHandlerNoHeader-4     2000000          4358 ns/op        1041 B/op          9 allocs/op
$
$ # Second run because ./http.test -test.v -test.run=none -test.bench=ServerHandler -test.cpuprofile=prof.cpu -test.benchtime=4s only runs clause with `-test.v -test.run=none -test.bench=ServerHandler`
$ ./http.test -test.cpuprofile=prof.cpu -test.benchtime=4s
--- FAIL: TestTransportConcurrency (0.11s)
    transport_test.go:1216: error on req request-465: Get http://127.0.0.1:55568/?echo=request-465: write tcp 127.0.0.1:55825->127.0.0.1:55568: write: socket is not connected
FAIL

Otherwise this looks like an improvement

Old

BenchmarkServerHandlerTypeLen    1000000         11098 ns/op        1878 B/op         19 allocs/op
BenchmarkServerHandlerNoLen      1000000         10249 ns/op        1843 B/op         17 allocs/op
BenchmarkServerHandlerNoType     1000000         10620 ns/op        1844 B/op         17 allocs/op
BenchmarkServerHandlerNoHeader   1000000         8674 ns/op         1051 B/op         12 allocs/op

New

BenchmarkServerHandlerTypeLen-4      1000000          5549 ns/op        1858 B/op         17 allocs/op
BenchmarkServerHandlerNoLen-4        1000000          5385 ns/op        1826 B/op         14 allocs/op
BenchmarkServerHandlerNoType-4       1000000          5148 ns/op        1826 B/op         15 allocs/op
BenchmarkServerHandlerNoHeader-4     2000000          4358 ns/op        1041 B/op          9 allocs/op
bradfitz commented 8 years ago

@odeke-em, this bug is really old. I think it's probably just time to close it. We can open new specific bugs as needed.

odeke-em commented 8 years ago

Roger that, thanks!