golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

runtime: reduce scheduling contention for large $GOMAXPROCS #2933

Closed by gopherbot 9 years ago

gopherbot commented 12 years ago

by apachephp:

I am running a benchmark against MySQL in which multiple goroutines
connect to MySQL and run short queries.

In this scenario the Go application uses most of the CPU,
and profiling suggests that this is related to mutex usage in net Read.

Profiling:
(pprof) top20 -cum
Total: 528260 samples
       0   0.0%   0.0%   412464  78.1% runtime.initdone
      22   0.0%   0.0%   401608  76.0% main.sleepy
      22   0.0%   0.0%   400454  75.8% main.runQuery
     587   0.1%   0.1%   393491  74.5% github%2ecom/Philio/GoMySQL..getResult
    1155   0.2%   0.3%   370480  70.1% github%2ecom/Philio/GoMySQL..readPacket
      26   0.0%   0.3%   356381  67.5% github%2ecom/Philio/GoMySQL..StoreResult
     136   0.0%   0.4%   356135  67.4% github%2ecom/Philio/GoMySQL..getAllRows
     138   0.0%   0.4%   356026  67.4% github%2ecom/Philio/GoMySQL..getRow
     371   0.1%   0.5%   329772  62.4% io.ReadFull
     801   0.2%   0.6%   329470  62.4% io.ReadAtLeast
    1627   0.3%   0.9%   328820  62.2% net..Read
  317879  60.2%  61.1%   317879  60.2% runtime.futex
   23042   4.4%  65.5%   297638  56.3% syscall.Syscall
     524   0.1%  65.6%   293808  55.6% syscall.Read
   15311   2.9%  68.5%   243751  46.1% runtime.entersyscall
     973   0.2%  68.6%   236760  44.8% github%2ecom/Philio/GoMySQL..readNumber
      88   0.0%  68.7%   182475  34.5% futexwakeup
     316   0.1%  68.7%   179999  34.1% schedunlock
     166   0.0%  68.8%   162654  30.8% runtime.unlock
     168   0.0%  68.8%   162546  30.8% futexunlock

Is there a way to hold the mutex for a shorter period of time in this scenario?
With this problem I can't use a Go application for benchmarking, as it consumes more CPU
than MySQL.
rsc commented 12 years ago

Comment 1:

How are you profiling?  Are you using 6prof or the pprof package?
gopherbot commented 12 years ago

Comment 2 by apachephp:

The pprof package,
and inside the application:
                pprof.StartCPUProfile(f)
                defer pprof.StopCPUProfile()
rsc commented 12 years ago

Comment 3:

Can you run pprof --web >x.html and attach the html here?
gopherbot commented 12 years ago

Comment 4 by apachephp:

I can't generate HTML on the server box, so I attached a PDF instead.

Attachments:

  1. 1.pdf (17861 bytes)
rsc commented 12 years ago

Comment 5:

What do you have GOMAXPROCS set to?
Try setting it higher.
How many active connections to the database do you have?
gopherbot commented 12 years ago

Comment 6 by apachephp:

runtime.GOMAXPROCS(16),
because it is a 16-core box.
In this case I have 32 connections to the database.
rsc commented 12 years ago

Comment 7:

Try 32 or 48 anyway; see if it helps.
The problem is that the scheduler is working very
hard to keep at most 16 goroutines running through
user space at a time, so as one goroutine goes to sleep
reading from a network connection it must wake up
another goroutine whose turn it is now to run.
If there were less contention (by having more slots)
then you wouldn't see all this scheduling activity.
Russ
gopherbot commented 12 years ago

Comment 8 by apachephp:

With runtime.GOMAXPROCS(32)
the CPU usage decreased significantly.
I will run more tests, including with more database connections.
gopherbot commented 12 years ago

Comment 9 by apachephp:

No, I was too quick to draw a conclusion.
With both
 runtime.GOMAXPROCS(32)
and
 runtime.GOMAXPROCS(48)
the problem did not go away,
and I still see high CPU usage by the Go application.
rsc commented 12 years ago

Comment 10:

The scheduler needs some work in this regard.
That will happen after Go 1.

Labels changed: added priority-later, removed priority-triage.

Owner changed to builder@golang.org.

Status changed to Accepted.

rsc commented 12 years ago

Comment 11:

Labels changed: added go1.1maybe.

dvyukov commented 11 years ago

Comment 12:

Owner changed to @dvyukov.

rsc commented 11 years ago

Comment 13:

[The time for maybe has passed.]

Labels changed: removed go1.1maybe.

gopherbot commented 11 years ago

Comment 14 by vadim@percona.com:

What is the status of this problem? Will it be fixed in 1.1?
Without a fix I can't use Go for highly concurrent workloads.
bradfitz commented 11 years ago

Comment 15:

Download it and see: https://code.google.com/p/go/downloads/list?can=2&q=go1.1
Much scheduler stuff was changed.
dvyukov commented 11 years ago

Comment 16:

I believe this is fixed by the new scheduler and net poller.
If it's still an issue, please re-open the issue with an updated profile and reproduction
instructions.

Status changed to Fixed.

majimboo commented 9 years ago

I am having a similar problem with the latest stable release of Go. Each query coming in via a TCP packet command uses around 24% CPU (it spikes at the query), then drops back to 0% after the query. That seems really odd.

dvyukov commented 9 years ago

@majimboo please file a new issue with necessary details: what behavior you expect, what behavior you see, go/os/hardware details and reproduction instructions.
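
As a rough aid for collecting the Go/OS/hardware details requested above, a small program along these lines could be run on the affected machine. It is only a sketch: `envReport` is a hypothetical helper, and the output of `go version` and `go env` remains the usual way to report this information.

```go
package main

import (
	"fmt"
	"runtime"
)

// envReport returns a one-line summary of the Go runtime environment.
// (Hypothetical helper for illustration.)
func envReport() string {
	return fmt.Sprintf("go=%s os=%s arch=%s cpus=%d gomaxprocs=%d",
		runtime.Version(), runtime.GOOS, runtime.GOARCH,
		runtime.NumCPU(), runtime.GOMAXPROCS(0))
}

func main() {
	fmt.Println(envReport())
}
```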