crazy-max / docker-rtorrent-rutorrent

rTorrent and ruTorrent Docker image
MIT License

100% CPU usage during tracker timeouts #329

Closed · nolemons closed 5 months ago

nolemons commented 5 months ago

Support guidelines

I've found a bug and checked that ...

Description

When a sufficient number of torrents in my client error with the status Tracker Status: Tracker: [Timeout was reached], I see constant 100% CPU usage (single core) from the rtorrent process in the container.

Expected behaviour

Tracker timeouts should not cause sustained 100% CPU usage.

Actual behaviour

Tracker timeouts cause 100% CPU usage.

Steps to reproduce

  1. Run this docker image with a reasonable number of torrents. For context, I have 10k+ across a few containers, though I have seen this issue with timeouts in only a couple hundred of them.
  2. Cause tracker timeouts, either through genuine tracker downtime or by disrupting the network so that announces time out (see the sketch after this list). In my case my VPN was having issues, but I've seen this in the past with normal tracker downtime.
  3. Observe 100% CPU usage
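
One crude way to disrupt the network (step 2) without waiting for real tracker downtime is to silently drop outbound announce traffic at the firewall. A rough sketch, using a hypothetical tracker host:

```shell
# tracker.example.org is a hypothetical host; substitute a tracker from
# your own torrents. DROP (rather than REJECT) makes announces hang until
# curl's timeout fires, producing "Tracker: [Timeout was reached]".
sudo iptables -A OUTPUT -d tracker.example.org -p tcp --dport 443 -j DROP

# ...wait for a few announce cycles, watch rtorrent's CPU usage...

# Remove the rule afterwards.
sudo iptables -D OUTPUT -d tracker.example.org -p tcp --dport 443 -j DROP
```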

Docker info

I run this container in k8s via crio - if it's relevant, `sudo crictl info` returns:

```json
{
  "status": {
    "conditions": [
      {
        "type": "RuntimeReady",
        "status": true,
        "reason": "",
        "message": ""
      },
      {
        "type": "NetworkReady",
        "status": true,
        "reason": "",
        "message": ""
      }
    ]
  },
  "config": {
    "sandboxImage": "registry.k8s.io/pause:3.9"
  }
}
```

Docker Compose config

_No response_

Logs

```text
There doesn't seem to be anything of note in the container logs or in the rtorrent/rutorrent logs.
```

Additional info

This looks quite similar to an rtorrent bug caused by a regression in curl, specifically https://github.com/rakshasa/rtorrent/issues/951 / https://github.com/rakshasa/rtorrent/issues/580 / https://github.com/rakshasa/rtorrent/issues/457. However, that regression was in a very old version of curl: the container currently ships curl v8.5.0, while the issue dates back to around v7.55.0.
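
(For reference, the curl version inside a running container can be checked with something like the following; the container name here is just a placeholder, and under k8s the equivalent would be `kubectl exec`.)

```shell
# "rtorrent-rutorrent" is a placeholder container name; adjust to your setup.
docker exec rtorrent-rutorrent curl --version
```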

I straced the process and got the following output:

```text
$ sudo strace -c -p 3290243
strace: Process 3290243 attached
^Cstrace: Process 3290243 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.06    1.394528           2    570092           epoll_pwait
  3.42    0.051288       17096         3           futex
  1.83    0.027467           3      9076           statfs
  0.45    0.006774           8       825       825 connect
  0.31    0.004694           2      1998           epoll_ctl
  0.22    0.003364           4       825           socket
  0.18    0.002629           4       630           close
  0.17    0.002493           1      1281           fcntl
  0.16    0.002380           1      1226           setsockopt
  0.06    0.000968           2       424           getsockname
  0.06    0.000849           2       424           poll
  0.05    0.000705          11        64           getdents64
  0.01    0.000129           4        32           open
  0.01    0.000106           6        16           mmap
  0.00    0.000054          18         3           munmap
  0.00    0.000020           2         8         8 stat
  0.00    0.000013           4         3           recvfrom
------ ----------- ----------- --------- --------- ----------------
100.00    1.498461           2    586930       833 total
```
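
To confirm the loop is actually spinning rather than blocking, per-call timings can be captured with something like this (`-T` appends the time spent in each call):

```shell
# A busy loop shows a rapid flood of epoll_pwait calls, each returning
# almost immediately instead of blocking for its timeout.
sudo strace -p 3290243 -e trace=epoll_pwait -T 2>&1 | head -n 20
```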

This does look pretty similar, although it's a different syscall (epoll_pwait instead of epoll_wait). I've poked around for recent bug reports here but have been unable to find anything. I haven't tried messing around with curl yet, but will likely try to see if this is somehow a similar issue. Frankly, I'm not the most familiar with this stuff, so I wouldn't really know what I'm doing; I mostly wanted to file this issue to see if anyone else has run into it and found a resolution.

I did just find https://github.com/rakshasa/rtorrent/issues/1208, which claims there may be an issue in recent versions of curl here. I'm going to try changing versions around and see if I'm able to resolve this. Edit: nope, v7.84.0, which is mentioned in that thread, doesn't seem to resolve it. That report looks to be Gentoo-specific anyway, so it was a shot in the dark.

stickz commented 5 months ago

Upgrade to docker edge to resolve this problem.

nolemons commented 4 months ago

@stickz Sorry, I should have followed up earlier here. I didn't realize at first what you meant by "edge", but I see now it's an image tag. I've just gone ahead and set it, but I'm still getting the same issue.
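
For anyone following along, on plain Docker that amounts to pulling the `edge` tag (in my k8s setup it's just the image field on the pod spec):

```shell
# Image name per this project's published Docker image.
docker pull crazymax/rtorrent-rutorrent:edge
```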

Out of curiosity, was there a particular change in edge that you were expecting to fix this? If so, I'd be curious to learn more and see if there's anything I can dig into there.

bakugo commented 3 months ago

Can confirm, still happens on the latest edge. Please reopen.

stickz commented 3 months ago

@nolemons This docker image was just switched to the rTorrent stickz fork: https://github.com/stickz/rtorrent

Could you retest with 4.3.5-3.2 and file an issue report there if the problem persists? We can then discuss next steps.
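
(For a plain Docker setup, that would mean pinning the release tag, assuming 4.3.5-3.2 is published as an image tag like other releases:)

```shell
# Assumes 4.3.5-3.2 is published as an image tag; check the repo's
# releases/tags if the pull fails.
docker pull crazymax/rtorrent-rutorrent:4.3.5-3.2
```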