
qBittorrent does not respect cgroup memory limits, resulting in non-oom related kernel panic #15463

Open Rid opened 3 years ago

Rid commented 3 years ago

Bug report

What is the problem

qBittorrent does not respect cgroup memory limits, resulting in the process constantly being OOM-killed.

Detailed steps to reproduce the problem

  1. Run qBittorrent in a Docker container with memory limits (a sketch of such a setup follows this list)
  2. Do some activity to fill RAM
  3. qBittorrent is OOM-killed
  4. After this repeats for some time, it causes a kernel panic in ZFS for unknown reasons (see https://github.com/openzfs/zfs/issues/12543)
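
For illustration, a minimal sketch of step 1 using the Docker SDK for Python; the image name and limit values are placeholders, not taken from this report:

    # pip install docker -- image and limits below are made-up examples
    import docker

    client = docker.from_env()
    client.containers.run(
        "my-qbittorrent-image",   # placeholder for whatever image is in use
        detach=True,
        mem_limit="6g",           # the cgroup memory limit being exceeded
        memswap_limit="6g",       # memory+swap cap; equal to mem_limit means no extra swap
    )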

What is the expected behavior

qBittorrent should respect cgroup memory limits so that it does not get stuck in an endless OOM-kill loop.

Extra info (if any)

Kernel logs showing the issue:

[381793.143939] qbittorrent-nox invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[381793.143943] CPU: 24 PID: 621243 Comm: qbittorrent-nox Kdump: loaded Tainted: P           OE     5.8.0-63-generic #71~20.04.1-Ubuntu
[381793.143944] Hardware name: Dell Inc. PowerEdge R730xd/0WCJNT, BIOS 2.13.0 05/14/2021
[381793.143945] Call Trace:
[381793.143957]  dump_stack+0x74/0x92
[381793.143963]  dump_header+0x4f/0x1eb
[381793.143965]  oom_kill_process.cold+0xb/0x10
[381793.143969]  out_of_memory.part.0+0x1df/0x430
[381793.143970]  out_of_memory+0x6d/0xd0
[381793.143975]  mem_cgroup_out_of_memory+0xbd/0xe0
[381793.143977]  try_charge+0x7ce/0x830
[381793.143980]  mem_cgroup_charge+0x88/0x210
[381793.143985]  do_anonymous_page+0x110/0x3b0
[381793.143998]  __handle_mm_fault+0x8da/0x930
[381793.144001]  handle_mm_fault+0xca/0x200
[381793.144008]  do_user_addr_fault+0x1e2/0x440
[381793.144011]  exc_page_fault+0x86/0x1b0
[381793.144016]  ? asm_exc_page_fault+0x8/0x30
[381793.144017]  asm_exc_page_fault+0x1e/0x30
[381793.144020] RIP: 0033:0xd7c276
[381793.144025] Code: Unable to access opcode bytes at RIP 0xd7c24c.
[381793.144026] RSP: 002b:00007f13f1d96bc0 EFLAGS: 00010202
[381793.144028] RAX: 0000000000000631 RBX: 00007f13e4000020 RCX: 00007f12588579d0
[381793.144029] RDX: 00007f12588539d0 RSI: 0000000000004014 RDI: 000000000186e4a0
[381793.144030] RBP: 0000000000004015 R08: 00007f1258000000 R09: 0000000000854000
[381793.144031] R10: 0000000000004030 R11: 0000000000000206 R12: 0000000000000640
[381793.144031] R13: 0000000000001000 R14: 00007f12588539c0 R15: 0000000000004030
[381793.144036] memory: usage 5871500kB, limit 5859372kB, failcnt 240884
[381793.144037] swap: usage 9128kB, limit 5859372kB, failcnt 0
[381793.144038] Memory cgroup stats for /system.slice/docker-8681a4e5824d2034d2dccba0817fd22fd076ac97478c8a5e8b918209d46f49b8.scope:
[381793.144051] anon 5924954112
                file 0
                kernel_stack 552960
                slab 43663360
                sock 21815296
                shmem 0
                file_mapped 675840
                file_dirty 0
                file_writeback 2568192
                anon_thp 0
                inactive_anon 1935790080
                active_anon 3989020672
                inactive_file 0
                active_file 0
                unevictable 0
                slab_reclaimable 8441856
                slab_unreclaimable 35221504
                pgfault 644854122
                pgmajfault 360327
                workingset_refault 248259
                workingset_activate 202521
                workingset_restore 199584
                workingset_nodereclaim 0
                pgrefill 3856985
                pgscan 55721762
                pgsteal 926100
                pgactivate 1714086
                pgdeactivate 3215594
                pglazyfree 0
                pglazyfreed 0
                thp_fault_alloc 0
                thp_collapse_alloc 0
[381793.144052] Tasks state (memory values in pages):
[381793.144052] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[381793.144055] [ 189327] 14695000 189327      278        0    28672        7             0 docker-init
[381793.144057] [ 192061] 14695000 192061       51        0    32768        4             0 s6-svscan
[381793.144059] [ 195551] 14695000 195551       51        0    32768        3             0 s6-supervise
[381793.144061] [ 202227] 14695000 202227       51        3    32768        2             0 s6-supervise
[381793.144062] [ 202228] 14695000 202228       51        0    32768        3             0 s6-supervise
[381793.144063] [ 202231] 14695000 202231      956       23    49152       37             0 cron
[381793.144065] [ 620814] 14696000 620814  1673992  1447025 12185600     2227             0 qbittorrent-nox
[381793.144067] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=docker-8681a4e5824d2034d2dccba0817fd22fd076ac97478c8a5e8b918209d46f49b8.scope,mems_allowed=0-1,oom_memcg=/system.slice/docker-8681a4e5824d2034d2dccba0817fd22fd076ac97478c8a5e8b918209d46f49b8.scope,task_memcg=/system.slice/docker-8681a4e5824d2034d2dccba0817fd22fd076ac97478c8a5e8b918209d46f49b8.scope,task=qbittorrent-nox,pid=620814,uid=14696000
[381793.144113] Memory cgroup out of memory: Killed process 620814 (qbittorrent-nox) total-vm:6695968kB, anon-rss:5788100kB, file-rss:0kB, shmem-rss:0kB, UID:14696000 pgtables:11900kB oom_score_adj:0
[381793.737318] oom_reaper: reaped process 620814 (qbittorrent-nox), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[388192.407750] qbittorrent-nox invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[388192.407754] CPU: 43 PID: 953856 Comm: qbittorrent-nox Kdump: loaded Tainted: P           OE     5.8.0-63-generic #71~20.04.1-Ubuntu
[388192.407755] Hardware name: Dell Inc. PowerEdge R730xd/0WCJNT, BIOS 2.13.0 05/14/2021
[388192.407755] Call Trace:
[388192.407764]  dump_stack+0x74/0x92
[388192.407768]  dump_header+0x4f/0x1eb
[388192.407770]  oom_kill_process.cold+0xb/0x10
[388192.407774]  out_of_memory.part.0+0x1df/0x430
[388192.407775]  out_of_memory+0x6d/0xd0
[388192.407779]  mem_cgroup_out_of_memory+0xbd/0xe0
[388192.407781]  try_charge+0x7ce/0x830
[388192.407784]  mem_cgroup_charge+0x88/0x210
[388192.407787]  do_anonymous_page+0x110/0x3b0
[388192.407790]  __handle_mm_fault+0x8da/0x930
[388192.407795]  ? __switch_to_xtra+0x119/0x510
[388192.407797]  handle_mm_fault+0xca/0x200
[388192.407802]  do_user_addr_fault+0x1e2/0x440
[388192.407805]  exc_page_fault+0x86/0x1b0
[388192.407808]  asm_exc_page_fault+0x1e/0x30
[388192.407812] RIP: 0010:copy_user_enhanced_fast_string+0xe/0x30
[388192.407815] Code: 89 d1 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 31 c0 0f 01 ca c3 0f 1f 80 00 00 00 00 0f 01 cb 83 fa 40 0f 82 70 ff ff ff 89 d1 <f3> a4 31 c0 0f 01 ca c3 66 2e 0f 1f 84 00 00 00 00 00 89 d1 f3 a4
[388192.407816] RSP: 0018:ffffac8a996cfb20 EFLAGS: 00050202
[388192.407817] RAX: 00007f3c93856140 RBX: ffffac8a996cfe08 RCX: 0000000000001140
[388192.407818] RDX: 0000000000002732 RSI: ffff9c526557d5f2 RDI: 00007f3c93855000
[388192.407819] RBP: ffffac8a996cfb28 R08: ffff9c526557c000 R09: 0000000000004000
[388192.407820] R10: ffff9c5a6ec6f640 R11: 0000000040000000 R12: 0000000000004000
[388192.407820] R13: 0000000000004000 R14: ffff9c3bcb806b20 R15: 0000000000002732
[388192.407830]  ? copyout+0x26/0x30
[388192.407832]  _copy_to_iter+0xa0/0x460
[388192.407835]  ? _cond_resched+0x19/0x30
[388192.407837]  ? mutex_lock+0x13/0x40
[388192.407913]  uiomove_iter+0x6f/0xf0 [zfs]
[388192.407967]  uiomove+0x25/0x30 [zfs]
[388192.408003]  dmu_read_uio_dnode+0xa5/0xf0 [zfs]
[388192.408039]  dmu_read_uio_dbuf+0x47/0x60 [zfs]
[388192.408094]  zfs_read+0x136/0x3b0 [zfs]
[388192.408146]  zpl_iter_read+0xd8/0x180 [zfs]
[388192.408150]  do_iter_readv_writev+0x18b/0x1b0
[388192.408152]  do_iter_read+0xe2/0x1a0
[388192.408154]  vfs_readv+0x6e/0xb0
[388192.408158]  ? __secure_computing+0x42/0xe0
[388192.408160]  do_preadv+0x93/0xd0
[388192.408162]  __x64_sys_preadv+0x21/0x30
[388192.408165]  do_syscall_64+0x49/0xc0
[388192.408167]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[388192.408169] RIP: 0033:0xdd46ba
[388192.408174] Code: Unable to access opcode bytes at RIP 0xdd4690.
[388192.408175] RSP: 002b:00007f3e12ffa230 EFLAGS: 00000246 ORIG_RAX: 0000000000000127
[388192.408176] RAX: ffffffffffffffda RBX: 00000000000000ef RCX: 0000000000dd46ba
[388192.408177] RDX: 0000000000000046 RSI: 00007f3e12ffa270 RDI: 00000000000000ef
[388192.408178] RBP: 00007f3e12ffa270 R08: 0000000000000000 R09: 0000000000000000
[388192.408179] R10: 00000000ece46732 R11: 0000000000000246 R12: 0000000000000046
[388192.408180] R13: 00000000ece46732 R14: 00000000ece46732 R15: 0000000000000000
[388192.408182] memory: usage 5859372kB, limit 5859372kB, failcnt 242280
[388192.408183] swap: usage 1524kB, limit 5859372kB, failcnt 0
[388192.408184] Memory cgroup stats for /system.slice/docker-8681a4e5824d2034d2dccba0817fd22fd076ac97478c8a5e8b918209d46f49b8.scope:
[388192.408197] anon 5931982848
                file 405504
                kernel_stack 405504
                slab 44507136
                sock 3346432
                shmem 0
                file_mapped 1081344
                file_dirty 0
                file_writeback 2568192
                anon_thp 0
                inactive_anon 1545781248
                active_anon 4384509952
                inactive_file 1093632
                active_file 0
                unevictable 0
                slab_reclaimable 8712192
                slab_unreclaimable 35794944
                pgfault 800376060
                pgmajfault 366432
                workingset_refault 254133
                workingset_activate 204072
                workingset_restore 200937
                workingset_nodereclaim 0
                pgrefill 4268647
                pgscan 55746979
                pgsteal 932463
                pgactivate 1740057
                pgdeactivate 3616509
                pglazyfree 0
                pglazyfreed 0
                thp_fault_alloc 0
                thp_collapse_alloc 0
[388192.408198] Tasks state (memory values in pages):
[388192.408198] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[388192.408202] [ 189327] 14695000 189327      278        0    28672        7             0 docker-init
[388192.408204] [ 192061] 14695000 192061       51        0    32768        4             0 s6-svscan
[388192.408206] [ 195551] 14695000 195551       51        0    32768        3             0 s6-supervise
[388192.408207] [ 202227] 14695000 202227       51        0    32768        4             0 s6-supervise
[388192.408208] [ 202228] 14695000 202228       51        0    32768        3             0 s6-supervise
[388192.408210] [ 202231] 14695000 202231      956       23    49152       37             0 cron
[388192.408212] [ 944961] 14696000 944961  1648597  1448695 11849728      325             0 qbittorrent-nox
[388192.408213] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=docker-8681a4e5824d2034d2dccba0817fd22fd076ac97478c8a5e8b918209d46f49b8.scope,mems_allowed=0-1,oom_memcg=/system.slice/docker-8681a4e5824d2034d2dccba0817fd22fd076ac97478c8a5e8b918209d46f49b8.scope,task_memcg=/system.slice/docker-8681a4e5824d2034d2dccba0817fd22fd076ac97478c8a5e8b918209d46f49b8.scope,task=qbittorrent-nox,pid=944961,uid=14696000
[388192.408254] Memory cgroup out of memory: Killed process 944961 (qbittorrent-nox) total-vm:6594388kB, anon-rss:5794780kB, file-rss:0kB, shmem-rss:0kB, UID:14696000 pgtables:11572kB oom_score_adj:0
[388192.415215] usercopy: Kernel memory exposure attempt detected from SLUB object 'zio_data_buf_16384' (offset 13794, size 18974)!
[388192.421454] ------------[ cut here ]------------
[388192.421457] kernel BUG at mm/usercopy.c:99!
[388192.424487] invalid opcode: 0000 [#1] SMP PTI
[388192.427467] CPU: 23 PID: 953856 Comm: qbittorrent-nox Kdump: loaded Tainted: P           OE     5.8.0-63-generic #71~20.04.1-Ubuntu
[388192.433431] Hardware name: Dell Inc. PowerEdge R730xd/0WCJNT, BIOS 2.13.0 05/14/2021
[388192.436367] RIP: 0010:usercopy_abort+0x7b/0x7d
[388192.439246] Code: 4c 0f 45 de 51 4c 89 d1 48 c7 c2 9d 47 7e 8b 57 48 c7 c6 e0 f4 7c 8b 48 c7 c7 68 48 7e 8b 48 0f 45 f2 4c 89 da e8 23 7c ff ff <0f> 0b 4c 89 e1 49 89 d8 44 89 ea 31 f6 48 29 c1 48 c7 c7 df 47 7e
[388192.445082] RSP: 0018:ffffac8a996cfb50 EFLAGS: 00010246
[388192.447929] RAX: 0000000000000073 RBX: 0000000000004a1e RCX: 0000000000000000
[388192.450734] RDX: 0000000000000000 RSI: ffff9c5a7fad8cd0 RDI: ffff9c5a7fad8cd0
[388192.453854] RBP: ffffac8a996cfb68 R08: ffff9c5a7fad8cd0 R09: ffffac89ccc41020
[388192.456636] R10: ffff9c4a71531680 R11: 0000000000000001 R12: ffff9c3e3af035e2
[388192.459360] R13: 0000000000000001 R14: ffff9c3e3af08000 R15: 0000000000000000
[388192.461896] FS:  00007f3e12ffd700(0000) GS:ffff9c5a7fac0000(0000) knlGS:0000000000000000
[388192.464385] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[388192.466789] CR2: 00007f3c93857000 CR3: 0000000727adc002 CR4: 00000000003606e0
[388192.469380] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[388192.471956] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[388192.474533] usercopy: Kernel memory exposure attempt detected from SLUB object 'zio_data_buf_16384' (offset 3440, size 29328)!
[388192.475437] Call Trace:
[388192.475450]  __check_heap_object+0xe6/0x120
[388192.475456]  __check_object_size+0x13f/0x150
[388192.475577]  uiomove_iter+0x61/0xf0 [zfs]
[388192.475671]  uiomove+0x25/0x30 [zfs]
[388192.485162] ------------[ cut here ]------------
[388192.487899]  dmu_read_uio_dnode+0xa5/0xf0 [zfs]
[388192.487973]  dmu_read_uio_dbuf+0x47/0x60 [zfs]
[388192.492489] kernel BUG at mm/usercopy.c:99!
[388192.494648]  zfs_read+0x136/0x3b0 [zfs]
[388192.506585]  zpl_iter_read+0xd8/0x180 [zfs]
[388192.507824]  do_iter_readv_writev+0x18b/0x1b0
[388192.508929]  do_iter_read+0xe2/0x1a0
[388192.510053]  vfs_readv+0x6e/0xb0
[388192.511141]  ? __secure_computing+0x42/0xe0
[388192.512184]  do_preadv+0x93/0xd0
[388192.513236]  __x64_sys_preadv+0x21/0x30
[388192.514236]  do_syscall_64+0x49/0xc0
[388192.515223]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[388192.516206] RIP: 0033:0xdd46ba
[388192.517176] Code: Unable to access opcode bytes at RIP 0xdd4690.
[388192.518125] RSP: 002b:00007f3e12ffa230 EFLAGS: 00000246 ORIG_RAX: 0000000000000127
[388192.519066] RAX: ffffffffffffffda RBX: 00000000000000ef RCX: 0000000000dd46ba
[388192.519995] RDX: 0000000000000046 RSI: 00007f3e12ffa270 RDI: 00000000000000ef
[388192.520891] RBP: 00007f3e12ffa270 R08: 0000000000000000 R09: 0000000000000000
[388192.521761] R10: 00000000ece46732 R11: 0000000000000246 R12: 0000000000000046
[388192.522600] R13: 00000000ece46732 R14: 00000000ece46732 R15: 0000000000000000
[388192.523437] Modules linked in: wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha libblake2s_generic xt_multiport act_mirred cls_u32 sch_ingress sch_hfsc veth nf_conntrack_netlink nfnetlink xfrm_user bridge stp llc sch_fq_codel aufs overlay xt_MASQUERADE xt_nat xt_addrtype iptable_nat nf_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_state xt_conntrack binfmt_misc iptable_filter bpfilter intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate zfs(POE) mgag200 zunicode(POE) drm_kms_helper zzstd(OE) cec zlua(OE) rc_core zavl(POE) icp(POE) i2c_algo_bit zcommon(POE) fb_sys_fops znvpair(POE) syscopyarea spl(OE) sysfillrect sysimgblt mei_me mei dcdbas ipmi_ssif mxm_wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter mac_hid tcp_bbr sch_fq
[388192.523476]  nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ifb drm ip_tables x_tables autofs4 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear raid10 raid1 ses enclosure scsi_transport_sas ixgbe xfrm_algo ahci lpc_ich dca crc32_pclmul tg3 megaraid_sas mdio libahci wmi

Rid commented 3 years ago

I'm currently testing disabling the OOM killer for the container. I'm not sure how qBittorrent will handle malloc failing, but I'll reply here with the result.
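
For reference, a minimal sketch of that setup via the Docker SDK for Python; image and limit are placeholders, and Docker warns if oom_kill_disable is used without an explicit memory limit:

    import docker

    docker.from_env().containers.run(
        "my-qbittorrent-image",   # placeholder image
        detach=True,
        mem_limit="6g",           # keep a hard limit alongside the disabled killer
        oom_kill_disable=True,    # allocations now stall or fail instead of OOM kills
    )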

CordySmith commented 3 years ago

A random question from a non-developer:

My understanding of out-of-memory conditions in Linux is that a process requesting memory aggressively enough can trigger a kernel panic even in the presence of the OOM killer, if there is also I/O activity (disk/network): I/O will often trigger kmalloc calls that panic on failure, and the OOM killer can take a few milliseconds to respond to an oversized process. The trick is to ensure that the kernel always has enough free memory that kmalloc is unlikely to fail. This can be done by setting hard limits on individual process size using rlimits, i.e. RLIMIT_AS.

Setting RLIMIT_AS to some value lower than the total amount of memory available to the VM will cause qBittorrent's calls to malloc to fail, triggering whatever error handling qBittorrent has in place for that (best case, it would likely exit immediately). With a reasonable amount of elbow room (say 1 GB), the kernel should be able to clean up whatever the resulting I/O operations are without panicking.
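
A sketch of that idea as a small launcher, using Python's resource module; the 6 GiB value is an arbitrary example:

    # Cap the address space (RLIMIT_AS) before exec'ing qbittorrent-nox, so that
    # malloc/mmap fail with ENOMEM rather than the cgroup OOM killer firing.
    import os
    import resource

    LIMIT = 6 * 1024**3  # 6 GiB, example value
    resource.setrlimit(resource.RLIMIT_AS, (LIMIT, LIMIT))
    os.execvp("qbittorrent-nox", ["qbittorrent-nox"])  # rlimits survive exec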

With respect to the issue: how should qBittorrent handle approaching a cgroup limit? Refuse to add more torrents? Remove existing torrents? Exit immediately? Ultimately, to do work, qBittorrent needs to allocate memory. It can be configured to use less memory, but it's very hard to predict up front how much memory a given action will require. At some point qBittorrent would have to say "I'm too close to a cgroup limit, I refuse to do that", even though the action was probably safe. Reducing the expected behavior to absurdity, one solution would be for qBittorrent to refuse to start in the presence of cgroup limits.

Rid commented 2 years ago

@CordySmith In Docker, cgroups v2 limits are set using the systemd driver, such that they govern all processes within a systemd slice under one set of parameters.

There is no kernel memory controller in cgroups v2, so it's not possible to limit kernel memory separately.

It is possible to set memory.high in cgroups v2 without memory.max, which effectively disables the OOM killer: when usage goes over the high boundary, the processes are throttled and put under heavy reclaim pressure. That could be another possible solution, with memory.max set to a high value as a failsafe.
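
A sketch of that layout, writing the cgroup v2 interface files directly; the scope name is a placeholder for the real container id, and the values are examples:

    from pathlib import Path

    cg = Path("/sys/fs/cgroup/system.slice/docker-<container-id>.scope")  # placeholder
    (cg / "memory.high").write_text("5G")  # throttle + heavy reclaim above 5 GiB
    (cg / "memory.max").write_text("8G")   # high failsafe; memcg OOM only past this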

Currently we're moving back to cgroups v1 and disabling the OOM killer, which will hopefully solve the issue until we or someone else can make a PR against Docker.

Rid commented 2 years ago

The OOM killer was disabled, and the kernel panic still persists:

[74173.465416] usercopy: Kernel memory exposure attempt detected from SLUB object 'zio_buf_comb_16384' (offset 15632, size 17136)!
[74173.465516] ------------[ cut here ]------------
[74173.465518] kernel BUG at mm/usercopy.c:99!
[74173.465555] invalid opcode: 0000 [#1] SMP PTI
[74173.465587] CPU: 0 PID: 1601931 Comm: qbittorrent-nox Kdump: loaded Tainted: P           OE     5.8.0-63-generic #71~20.04.1-Ubuntu
[74173.465655] Hardware name: Dell Inc. PowerEdge R730xd/0H21J3, BIOS 2.11.0 11/02/2019
[74173.465707] RIP: 0010:usercopy_abort+0x7b/0x7d
[74173.465738] Code: 4c 0f 45 de 51 4c 89 d1 48 c7 c2 9d 47 7e 85 57 48 c7 c6 e0 f4 7c 85 48 c7 c7 68 48 7e 85 48 0f 45 f2 4c 89 da e8 23 7c ff ff <0f> 0b 4c 89 e1 49 89 d8 44 89 ea 31 f6 48 29 c1 48 c7 c7 df 47 7e
[74173.465864] RSP: 0018:ffffb0e06dce7b50 EFLAGS: 00010246
[74173.465903] RAX: 0000000000000073 RBX: 00000000000042f0 RCX: 0000000000000000
[74173.465957] RDX: 0000000000000000 RSI: ffff8d533f818cd0 RDI: ffff8d533f818cd0
[74173.466006] RBP: ffffb0e06dce7b68 R08: ffff8d533f818cd0 R09: ffffb0e0095d4020
[74173.466048] R10: ffff8d53301cda30 R11: 0000000000000001 R12: ffff8d5fa1ae3d10
[74173.466097] R13: 0000000000000001 R14: ffff8d5fa1ae8000 R15: 0000000000000000
[74173.466147] FS:  00007f66fcff9700(0000) GS:ffff8d533f800000(0000) knlGS:0000000000000000
[74173.466202] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[74173.466255] CR2: 00007f66e0fb5000 CR3: 0000000e491d6001 CR4: 00000000001606f0
[74173.466310] Call Trace:
[74173.466336]  __check_heap_object+0xe6/0x120
[74173.466370]  __check_object_size+0x13f/0x150
[74173.466527]  zfs_uiomove_iter+0x61/0xf0 [zfs]
[74173.466649]  zfs_uiomove+0x25/0x30 [zfs]
[74173.466766]  dmu_read_uio_dnode+0xa5/0xf0 [zfs]
[74173.466887]  ? zfs_rangelock_enter_impl+0x271/0x5c0 [zfs]
[74173.466982]  dmu_read_uio_dbuf+0x47/0x60 [zfs]
[74173.467105]  zfs_read+0x136/0x3a0 [zfs]
[74173.467227]  zpl_iter_read+0xd8/0x180 [zfs]
[74173.467268]  do_iter_readv_writev+0x18b/0x1b0
[74173.467320]  do_iter_read+0xe2/0x1a0
[74173.467349]  vfs_readv+0x6e/0xb0
[74173.467377]  ? __secure_computing+0x42/0xe0
[74173.469480]  do_preadv+0x93/0xd0
[74173.471495]  __x64_sys_preadv+0x21/0x30
[74173.473578]  do_syscall_64+0x49/0xc0
[74173.475551]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[74173.477889] RIP: 0033:0xdd46ba
[74173.479915] Code: Unable to access opcode bytes at RIP 0xdd4690.
[74173.481937] RSP: 002b:00007f66fcff53b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000127
[74173.483834] RAX: ffffffffffffffda RBX: 0000000000000938 RCX: 0000000000dd46ba
[74173.485876] RDX: 0000000000000080 RSI: 00007f66fcff53f0 RDI: 0000000000000938
[74173.487623] RBP: 00007f66fcff53f0 R08: 0000000000000000 R09: 0000000000000000
[74173.489277] R10: 00000000005dfb90 R11: 0000000000000246 R12: 0000000000000080
[74173.490939] R13: 00000000005dfb90 R14: 00000000005dfb90 R15: 0000000000000000
[74173.492584] Modules linked in: wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha libblake2s_generic xt_multiport act_mirred cls_u32 sch_ingress sch_hfsc veth nf_conntrack_netlink nfnetlink xfrm_user sch_fq_codel bridge stp llc aufs overlay xt_MASQUERADE xt_nat binfmt_misc xt_addrtype iptable_nat nf_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_state xt_conntrack iptable_filter bpfilter zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate mgag200 drm_kms_helper cec rc_core i2c_algo_bit fb_sys_fops syscopyarea sysfillrect mei_me mxm_wmi dcdbas sysimgblt mei ipmi_si ipmi_devintf ipmi_msghandler mac_hid acpi_power_meter tcp_bbr sch_fq
[74173.492634]  nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ifb drm ip_tables x_tables autofs4 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear raid10 raid1 ses enclosure scsi_transport_sas ixgbe ahci xfrm_algo dca lpc_ich libahci crc32_pclmul tg3 mdio megaraid_sas wmi

It looks like qBittorrent is triggering a bug in the kernel; I'm not sure whether this is ZFS-specific or not.

luzpaz commented 10 months ago

Still relevant?

jonboy345 commented 7 months ago

I'm still having these issues. Yes, still a problem.

WillGunn commented 7 months ago

I'm also having this issue on 4.6.3. With the memory limit under Advanced settings in qBittorrent set to 1024, the container's memory usage grows well beyond that (screenshot attached). For the above container, I'm using these memory flags: --memory=10g --memory-reservation=4g

WillGunn commented 7 months ago

Potentially related, some of my fastresume files have odd sections around pieces and piece priority, for example: (screenshot attached)

Some of these files were originally being seeded on a Windows client before I moved them to a Linux Docker container. I wrote a tool to parse the bencoded fastresume files and bulk-change the save_path and qBt-savePath fields, leaving everything else alone, so I didn't need to manually change the destination in the client and do a recheck.
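
A minimal sketch of that approach in Python (not the actual tool; the path prefixes are made-up example values):

    # Bulk-rewrite save paths in *.fastresume files via a small bencode round-trip.
    from pathlib import Path

    def bdecode(data, i=0):
        """Decode one bencoded value at offset i; return (value, next offset)."""
        c = data[i:i+1]
        if c == b"i":                              # integer: i<digits>e
            j = data.index(b"e", i)
            return int(data[i+1:j]), j + 1
        if c in (b"l", b"d"):                      # list or dict
            items, i = [], i + 1
            while data[i:i+1] != b"e":
                v, i = bdecode(data, i)
                items.append(v)
            i += 1
            return (items, i) if c == b"l" else (dict(zip(items[::2], items[1::2])), i)
        j = data.index(b":", i)                    # byte string: <len>:<bytes>
        n = int(data[i:j])
        return data[j+1:j+1+n], j + 1 + n

    def bencode(v):
        if isinstance(v, int):
            return b"i%de" % v
        if isinstance(v, bytes):
            return b"%d:%s" % (len(v), v)
        if isinstance(v, list):
            return b"l" + b"".join(map(bencode, v)) + b"e"
        return b"d" + b"".join(bencode(k) + bencode(x)
                               for k, x in sorted(v.items())) + b"e"

    OLD, NEW = b"D:\\seeds", b"/downloads"         # example values only
    for f in Path(".").glob("*.fastresume"):
        d, _ = bdecode(f.read_bytes())
        for key in (b"save_path", b"qBt-savePath"):
            if key in d:
                d[key] = d[key].replace(OLD, NEW)
        f.write_bytes(bencode(d))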

glassez commented 7 months ago

Potentially related, some of my fastresume files have odd sections around pieces and piece priority, for example

Do you have "first and last piece priority" enabled for these torrents?

WillGunn commented 7 months ago

Potentially related, some of my fastresume files have odd sections around pieces and piece priority, for example

Do you have "first and last piece priority" enabled for these torrents?

Potentially, but through the Web UI I am unable to see that setting on completed torrents.