dom96 / nimbuild

Nim's build farm
http://build.nim-lang.org

CPU usage spikes after network failures #6

Closed gradha closed 11 years ago

gradha commented 11 years ago

I had the builder running without problems, and after a while it went back to 100% CPU usage. The most recent console log showed the following just before I pressed Ctrl+C:


Got message from hub: { "ping": "1362937403.427957"}
Replying to Ping
We seem to be timing out! PINGing server.
Server has not replied with a pong in 5 seconds.
nodename nor servname provided, or not known
We seem to be timing out! PINGing server.
Socket is not connected
Disconnected from server due to ^^
nodename nor servname provided, or not known
We seem to be timing out! PINGing server.
Socket is not connected
Disconnected from server due to ^^
nodename nor servname provided, or not known
We seem to be timing out! PINGing server.
Socket is not connected
Disconnected from server due to ^^
nodename nor servname provided, or not known
^CTraceback (most recent call last)
builder.nim(858)         builder
asyncio.nim(542)         poll
gc.nim(411)              newSeq
gc.nim(404)              newObj
gc.nim(380)              rawNewObj
gc.nim(883)              collectCT
gc.nim(860)              collectCTBody
gc.nim(819)              CollectZCT
gc.nim(318)              forAllChildren
SIGINT: Interrupted by Ctrl-C.

In a separate test I started the builder, let it connect, and then turned off wifi. At the first ping timeout the console log displayed:

We seem to be timing out! PINGing server.
Server has not replied with a pong in 5 seconds.
nodename nor servname provided, or not known

Immediately after that message appeared, the CPU usage of the builder went to 100%. I pressed Ctrl+C, which produced a stack trace similar to the previous one. It seems odd that in both cases objects were being allocated at the very moment I aborted, doesn't it?

^CTraceback (most recent call last)
builder.nim(858)         builder
asyncio.nim(542)         poll
gc.nim(411)              newSeq
gc.nim                   newObj
SIGINT: Interrupted by Ctrl-C.
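
Both traces stop inside `poll`, which would fit an event loop that keeps waking up on a dead socket, although the logs do not confirm that this is the cause. The following is a minimal Python sketch of that general failure mode, not the builder's Nim code: once the peer is gone, `select()` reports the socket as readable (end of file) on every call, so a loop that never reads or closes it spins without ever blocking. The names `count_wakeups` and `demo` are made up for this illustration.

```python
import select
import socket


def count_wakeups(sock, max_iterations, timeout=1.0):
    """Count consecutive select() calls that report `sock` readable."""
    wakeups = 0
    for _ in range(max_iterations):
        readable, _w, _x = select.select([sock], [], [], timeout)
        if not readable:
            break  # select() really waited for the timeout: no busy loop
        wakeups += 1
        # Deliberately do NOT recv() or close() here. That omission is the
        # hypothetical bug: the EOF condition stays pending forever.
    return wakeups


def demo():
    """Simulate a lost connection with a local socket pair (an assumption)."""
    ours, peer = socket.socketpair()
    peer.close()  # stands in for the network going away
    spins = count_wakeups(ours, max_iterations=1000)
    data = ours.recv(1024)  # an empty read is how EOF is reported
    ours.close()
    return spins, data


if __name__ == "__main__":
    print(demo())
```

A loop like this only calms down once the dead socket is read to EOF and closed, or removed from the set being polled.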

gradha commented 11 years ago

I tried reproducing the issue by disabling wifi after a successful connection, but this time the builder quit. On a second try, however, I was able to reproduce the problem. This time the CPU spiked after the timeout; I enabled wifi again, and when the builder received another ping the CPU went back to 0%:

Started builder: built at 2013-03-08 23:22:38
The hub accepted me!
We seem to be timing out! PINGing server.
Server has not replied with a pong in 5 seconds.
nodename nor servname provided, or not known
We seem to be timing out! PINGing server.
Socket is not connected
Disconnected from server due to ^^
The hub accepted me!
Got message from hub: { "ping": "1362941572.867273"}
Replying to Ping

The Mac OS X monitor also has a brief inspector. I pointed it at the builder and took two screenshots only eight seconds apart:

http://dl.dropbox.com/u/145894/t/builder%202013-03-10%20a%20la%28s%29%2019.51.00.png
http://dl.dropbox.com/u/145894/t/builder%202013-03-10%20a%20la%28s%29%2019.51.08.png

Note how the number of Unix system calls increased greatly during this brief period. When the builder is running at 0% CPU, the number of Unix system calls increases by one every second.
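
One common way to keep the wake-up rate bounded while the network is down is to sleep between reconnect attempts, with a delay that grows up to a cap. The Python sketch below illustrates that idea only; it is not the fix applied in nimbuild, and `backoff_delays`, `reconnect`, and their default values are hypothetical names and numbers chosen for the example.

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield wait times base, base*factor, ... and never more than cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect(connect, max_attempts, sleep=time.sleep, delays=None):
    """Call connect() until it succeeds, sleeping after each failure.

    Returns the number of attempts used; re-raises the last OSError if
    every attempt fails. DNS failures such as "nodename nor servname
    provided, or not known" surface in Python as socket.gaierror, which
    is a subclass of OSError.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    if delays is None:
        delays = backoff_delays()
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except OSError as exc:
            last_error = exc
            if attempt < max_attempts:
                sleep(next(delays))  # block instead of retrying at once
    raise last_error
```

With this shape, a process that cannot resolve or reach the hub makes one attempt per delay interval rather than retrying as fast as the loop can run.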