grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter
Apache License 2.0

Extreme CPU usage from 3.4 > 3.7 #438

Closed (grinapo closed 2 years ago)

grinapo commented 2 years ago

I just noticed that upgrading from 3.4 to 3.7 (from the Debian package) caused this:

# strace -f -c -p 1841856
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 52.15   74.643804         103    717922    717113 read
 35.38   50.635251      335332       151           clock_nanosleep
  5.90    8.440374      562691        15           poll
  5.17    7.393707         128     57544           write
  0.98    1.408249      176031         8           restart_syscall
  0.42    0.597968         114      5234      5015 futex
  0.00    0.001087         217         5           close
  0.00    0.000770         154         5           accept
  0.00    0.000568         113         5           fcntl
  0.00    0.000504         100         5           getpeername
------ ----------- ----------- --------- --------- ----------------
100.00  143.122282         183    780894    722128 total

I'm pretty clueless, but due to the pressure I have to downgrade...

grobian commented 2 years ago

I'm not sure, but if this is the version Debian ships, then that's really awful. From the 3.7.1 release notes:

https://github.com/grobian/carbon-c-relay/releases/tag/v3.7.1

The previous 3.7 release was retracted. On Linux this release had CPU hogging behaviour coming from the new semaphore approach. This is the only fix made in 3.7.1.

grinapo commented 2 years ago

And here's the strace after downgrading, taken over about the same 10 s sample.

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 91.20   47.510397       54297       875           clock_nanosleep
  4.23    2.202359      220235        10           poll
  2.01    1.045229          19     53245           write
  1.91    0.993138          15     62147     61587 read
  0.58    0.302434       12601        24           restart_syscall
  0.08    0.041687          17      2327      2327 recvfrom
  0.00    0.000313          31        10         3 futex
  0.00    0.000139          46         3           close
  0.00    0.000075          25         3           accept
  0.00    0.000052          17         3           fcntl
  0.00    0.000034          11         3           getpeername
  0.00    0.000016           8         2           mprotect
------ ----------- ----------- --------- --------- ----------------
100.00   52.095873         439    118652     63917 total
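
To put the two summaries side by side, here is a minimal Python sketch that parses `strace -c` rows and compares the `read` lines. The helper name `parse_strace_summary` and the variable names are made up for illustration, and only a few rows from each table above are pasted in. Note that `strace -c` does not report which errno the failed calls returned, so reading the failures as a busy loop on non-blocking reads is an assumption, not something the tables prove.

```python
# Hypothetical helper for comparing two `strace -c` summaries.
# The sample rows are copied from the 3.7 and 3.4 tables in this thread.

V37 = """
 52.15   74.643804         103    717922    717113 read
 35.38   50.635251      335332       151           clock_nanosleep
  5.17    7.393707         128     57544           write
100.00  143.122282         183    780894    722128 total
"""

V34 = """
 91.20   47.510397       54297       875           clock_nanosleep
  2.01    1.045229          19     53245           write
  1.91    0.993138          15     62147     61587 read
100.00   52.095873         439    118652     63917 total
"""


def parse_strace_summary(text):
    """Return {syscall: (seconds, calls, errors)} from `strace -c` rows."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields (errors column empty) or 6 fields.
        if len(fields) not in (5, 6):
            continue
        try:
            seconds = float(fields[1])
            calls = int(fields[3])
            errors = int(fields[4]) if len(fields) == 6 else 0
        except ValueError:
            continue  # header or separator line
        name = fields[-1]
        if name == "total":
            continue  # skip the summary row
        stats[name] = (seconds, calls, errors)
    return stats


v37 = parse_strace_summary(V37)
v34 = parse_strace_summary(V34)

for label, stats in (("3.7", v37), ("3.4", v34)):
    seconds, calls, errors = stats["read"]
    print(f"{label}: read calls={calls} failed={errors} "
          f"({errors / calls:.1%}) ok={calls - errors} time={seconds:.2f}s")

print(f"read call ratio 3.7/3.4: {v37['read'][1] / v34['read'][1]:.1f}x")
```

The number of successful reads is in the same range for both versions (809 vs 560), while the failed reads grow by more than a factor of ten, which is where the extra time in `read` goes.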

grinapo commented 2 years ago

@grobian let me check, and if it is, I'll ping the maintainer. Good idea! Thanks!

grinapo commented 2 years ago

Indeed. I'll make some noise, since the current stable contains the wrong version. I believe this is fixed on your side. Thanks again!