Open rubiksdot opened 6 months ago
Gah. This is what I get for doing this before I finish my coffee, on a Sunday.
Duplicate of #605
Are you sure? This is from rebalance and the other was from openat syscalls - I know nothing about the internals ;)
I got this:
[31523.251432] ------------[ cut here ]------------
[31523.251448] btree trans held srcu lock (delaying memory reclaim) for 309 seconds
[31523.251530] WARNING: CPU: 2 PID: 25059 at fs/bcachefs/btree_iter.c:2871 bch2_trans_srcu_unlock+0x140/0x168 [bcachefs]
[31523.251727] Modules linked in: bcachefs lz4hc_compress lz4_compress xor xor_neon raid6_pq nft_chain_nat xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype overlay ledtrig_netdev r8169 cfg80211 rfkill raspberrypi_cpufreq 8021q garp mrp clk_raspberrypi reset_raspberrypi raspberrypi_hwmon bcm2711_thermal broadcom bcm_phy_lib pcie_brcmstb crct10dif_ce genet nvmem_rmem nls_iso8859_1 nls_cp437 uio_pdrv_genirq uio xt_conntrack ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog xt_tcpudp nft_compat nf_tables nfnetlink sch_fq_codel xt_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter veth tap macvlan bridge stp llc drm fuse backlight ip_tables x_tables dm_mod dax
[31523.251872] CPU: 2 PID: 25059 Comm: bch-rebalance/7 Not tainted 6.9.1 #1-NixOS
[31523.251879] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[31523.251883] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[31523.251890] pc : bch2_trans_srcu_unlock+0x140/0x168 [bcachefs]
[31523.252005] lr : bch2_trans_srcu_unlock+0x140/0x168 [bcachefs]
[31523.252107] sp : ffff800080b6b5f0
[31523.252110] x29: ffff800080b6b5f0 x28: 0000000000000000 x27: ffff09345d9281f8
[31523.252120] x26: ffff800080b6bb40 x25: ffff800080b6be30 x24: 0000000000000001
[31523.252129] x23: dead000000000100 x22: dead000000000122 x21: ffff800080b6be20
[31523.252139] x20: ffff09362acc0000 x19: ffff09356d360000 x18: ffffffffffffffff
[31523.252147] x17: 0000000000000020 x16: ffffb7337f0a9860 x15: ffff800080b6b220
[31523.252156] x14: ffffb733830455ad x13: ffffb733830455a1 x12: ffffb73382a53340
[31523.252165] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffb7337f145758
[31523.252174] x8 : c0000000ffffefff x7 : ffffb733829fb170 x6 : 0000000000000001
[31523.252182] x5 : ffff09363ee90d88 x4 : 0000000000000000 x3 : 0000000000000027
[31523.252191] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff09354117b540
[31523.252200] Call trace:
[31523.252204] bch2_trans_srcu_unlock+0x140/0x168 [bcachefs]
[31523.252308] bch2_trans_unlock_long+0x28/0x40 [bcachefs]
[31523.252409] bch2_moving_ctxt_do_pending_writes+0x6c/0x248 [bcachefs]
[31523.252515] bch2_data_update_init+0x458/0xea0 [bcachefs]
[31523.252617] bch2_move_extent+0x3f4/0x9c0 [bcachefs]
[31523.252721] do_rebalance_extent+0x230/0x608 [bcachefs]
[31523.252826] do_rebalance+0x268/0x730 [bcachefs]
[31523.252929] bch2_rebalance_thread+0x70/0xc0 [bcachefs]
[31523.253032] kthread+0xec/0xf8
[31523.253046] ret_from_fork+0x10/0x20
[31523.253054] ---[ end trace 0000000000000000 ]---
I have an 11-day copy (using cp) going on from a compressed btrfs filesystem to its replacement, a (background) compressing bcachefs filesystem. Both source and destination filesystems are on a spinning_rust.mdraid6.lvm stack. Kernel is 6.7.0.
Recently noticed the following in dmesg:
They happen repeatedly over the time period, with varying lengths for the number of seconds the lock is held:
Biggest (related) question I have is: will this result in data loss? I probably have another 10 days to the copy left, so... :)
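In case it helps anyone comparing how long the lock is held across occurrences, here is a rough sketch of a script that tallies the hold times from dmesg output. It only assumes the warning text looks like the "btree trans held srcu lock ... for N seconds" line in the trace; the function names and the script name are made up for illustration and are not part of bcachefs or any of its tools.

```python
import re
import sys

# Matches the warning line as it appears in the trace in this thread.
# The leading "[seconds.micros]" dmesg timestamp is optional.
SRCU_RE = re.compile(
    r"(?:\[\s*(?P<ts>\d+\.\d+)\]\s*)?"
    r"btree trans held srcu lock .*? for (?P<secs>\d+) seconds"
)


def srcu_hold_times(lines):
    """Return (timestamp_or_None, seconds_held) for each matching line."""
    found = []
    for line in lines:
        m = SRCU_RE.search(line)
        if m:
            ts = float(m.group("ts")) if m.group("ts") else None
            found.append((ts, int(m.group("secs"))))
    return found


def summarize(times):
    """Return count/min/max/mean of the hold times, or None if there are none."""
    secs = [s for _, s in times]
    if not secs:
        return None
    return {
        "count": len(secs),
        "min": min(secs),
        "max": max(secs),
        "mean": sum(secs) / len(secs),
    }


if __name__ == "__main__":
    # Reads kernel log text from stdin.
    print(summarize(srcu_hold_times(sys.stdin)))
```

Saved as, say, srcu_times.py (hypothetical name), it could be run as `dmesg | python3 srcu_times.py`. It only counts the warnings; it says nothing about whether they are harmful.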