Closed dpendolino closed 11 months ago
Looks similar or identical to https://github.com/openzfs/zfs/issues/15466 and https://github.com/openzfs/zfs/issues/15485
I've tried to assert all the overflows I could guess at in replay in https://github.com/openzfs/zfs/pull/15517 . It would be good if somebody could try reproducing it with a debug ZFS build, and maybe with my patch.
@amotin I'm happy to repro with a debug build if someone can point me to how to install debug modules on Arch Linux. FYI I get similar behavior when building against zfs-2.2.1-staging as well.
@amotin
[ 87.222736] VERIFY3(lrc->lrc_reclen >= offsetof(lr_clone_range_t, lr_bps[lr->lr_nbps])) failed (200 >= 6123802872946712264)
[ 87.222740] PANIC at zil.c:628:zil_claim_clone_range()
[ 87.222741] Showing stack for process 4413
[ 87.222742] CPU: 5 PID: 4413 Comm: dmu_objset_find Tainted: G U OE 6.6.1-zen1-1-zen #1 a7d7fb502c6beb735ecb7063553d02465ffd614d
[ 87.222745] Hardware name: Dell Inc. Latitude E5470/06DNG5, BIOS 1.34.3 11/20/2022
[ 87.222746] Call Trace:
[ 87.222748] <TASK>
[ 87.222749] dump_stack_lvl+0x47/0x60
[ 87.222756] spl_panic+0x100/0x120 [spl 78c1a0ca9bb93336388f1ff6b9e2a94e93196bf8]
[ 87.222812] zil_claim_log_record+0x26b/0x2a0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.223125] zil_parse+0x66a/0xaf0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.223398] ? __pfx_zil_claim_log_record+0x10/0x10 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.223679] ? __pfx_zil_claim_log_block+0x10/0x10 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.223947] ? dnode_create+0x1b3/0x320 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.224227] ? __cv_init+0x6f/0x150 [spl 78c1a0ca9bb93336388f1ff6b9e2a94e93196bf8]
[ 87.224241] ? rrw_exit+0xe1/0x2f0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.224525] ? spa_config_exit+0xd6/0x1c0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.224876] zil_check_log_chain+0x119/0x1f0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.225140] dmu_objset_find_dp_impl+0x159/0x550 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.225492] dmu_objset_find_dp_cb+0x29/0x40 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
[ 87.225906] taskq_thread+0x30b/0x7a0 [spl 78c1a0ca9bb93336388f1ff6b9e2a94e93196bf8]
[ 87.225931] ? __pfx_default_wake_function+0x10/0x10
[ 87.225937] ? __pfx_taskq_thread+0x10/0x10 [spl 78c1a0ca9bb93336388f1ff6b9e2a94e93196bf8]
[ 87.225950] kthread+0xe5/0x120
[ 87.225952] ? __pfx_kthread+0x10/0x10
[ 87.225954] ret_from_fork+0x31/0x50
[ 87.225957] ? __pfx_kthread+0x10/0x10
[ 87.225959] ret_from_fork_asm+0x1b/0x30
[ 87.225963] </TASK>
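For readers trying to interpret the VERIFY3 above, here is a minimal sketch of the kind of bounds check it expresses: the record length must cover the fixed header plus `lr_nbps` block pointers. The struct below is an illustrative stand-in, not the OpenZFS definition; it assumes a 72-byte fixed part and 128-byte block pointers, and the `demo_` names are hypothetical. Under that assumed layout the reported reclen of 200 has room for exactly one block pointer (72 + 128), while the right-hand value 6123802872946712264 would correspond to 47842209944896189 pointers, which suggests the count field was read as garbage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins only; NOT the real OpenZFS structures. */
typedef struct {
	uint64_t words[16];		/* assumed 128-byte block pointer */
} demo_blkptr_t;

typedef struct {
	uint64_t hdr[4];		/* assumed common header: type, reclen, txg, seq */
	uint64_t foid, offset, length, blksz;
	uint64_t nbps;			/* number of block pointers that follow */
	demo_blkptr_t bps[];		/* flexible array of nbps entries */
} demo_clone_range_t;

static_assert(sizeof (demo_blkptr_t) == 128, "assumed block pointer size");

/*
 * Return true if a record of reclen bytes can hold nbps block pointers.
 * Dividing instead of multiplying avoids overflow when nbps is garbage.
 */
static bool
demo_clone_record_fits(uint64_t reclen, uint64_t nbps)
{
	const uint64_t base = offsetof(demo_clone_range_t, bps);

	if (reclen < base)
		return (false);
	return (nbps <= (reclen - base) / sizeof (demo_blkptr_t));
}
```

With these assumptions, `demo_clone_record_fits(200, 1)` holds and any larger count is rejected, which is the situation the assertion is reporting.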
$ cat /proc/spl/kstat/zfs/dbgmsg
1699903552 ffff9248456e4200 spa.c:6467:spa_tryimport(): spa_tryimport: importing rpool
1699903552 ffff9248456e4200 spa_misc.c:427:spa_load_note(): spa_load($import, config trusted): LOADING
1699903552 ffff924846c94200 vdev.c:162:vdev_dbgmsg(): disk vdev '/dev/sda2': probe done, cant_read=0 cant_write=1
1699903552 ffff9248456e4200 vdev.c:162:vdev_dbgmsg(): disk vdev '/dev/sda2': best uberblock found for spa $import. txg 11024569
1699903552 ffff9248456e4200 spa_misc.c:427:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=11024569
1699903552 ffff9248456e4200 vdev.c:2501:vdev_copy_path_impl(): vdev_copy_path: vdev 3387836239050004014: path changed from '/dev/nvme0n1p2' to '/dev/sda2'
1699903552 ffff924846c94200 vdev.c:162:vdev_dbgmsg(): disk vdev '/dev/sda2': probe done, cant_read=0 cant_write=1
1699903552 ffff9248456e4200 dprintf: brt.c:875:brt_vdevs_expand(): BRT VDEVs expanded from 0 to 1.
1699903552 ffff9248456e4200 dprintf: brt.c:724:brt_vdev_realloc(): BRT VDEV 0 initiated.
1699903552 ffff9248456e4200 dprintf: brt.c:783:brt_vdev_load(): MOS BRT VDEV com.fudosecurity:brt:vdev:0 loaded: mos_brtvdev=1427, mos_entries=1426
1699903552 ffff9248456e4200 spa.c:8709:spa_async_request(): spa=$import async request task=2048
1699903552 ffff9248456e4200 spa_misc.c:427:spa_load_note(): spa_load($import, config trusted): LOADED
1699903552 ffff9248456e4200 spa_misc.c:427:spa_load_note(): spa_load($import, config trusted): UNLOADING
1699903552 ffff9248456e4200 dprintf: brt.c:806:brt_vdev_dealloc(): BRT VDEV 0 deallocated.
1699903552 ffff9248456e4200 metaslab.c:1679:spa_set_allocator(): spa allocator: dynamic
1699903552 ffff9248456e4200 spa.c:6319:spa_import(): spa_import: importing rpool
1699903552 ffff9248456e4200 spa_misc.c:427:spa_load_note(): spa_load(rpool, config trusted): LOADING
1699903552 ffff924846e00000 vdev.c:162:vdev_dbgmsg(): disk vdev '/dev/sda2': probe done, cant_read=0 cant_write=1
1699903552 ffff9248456e4200 vdev.c:162:vdev_dbgmsg(): disk vdev '/dev/sda2': best uberblock found for spa rpool. txg 11024569
1699903552 ffff9248456e4200 spa_misc.c:427:spa_load_note(): spa_load(rpool, config untrusted): using uberblock with txg=11024569
1699903552 ffff9248456e4200 vdev.c:2501:vdev_copy_path_impl(): vdev_copy_path: vdev 3387836239050004014: path changed from '/dev/nvme0n1p2' to '/dev/sda2'
1699903552 ffff924846e00000 vdev.c:162:vdev_dbgmsg(): disk vdev '/dev/sda2': probe done, cant_read=0 cant_write=0
1699903552 ffff9248456e4200 spa_misc.c:427:spa_load_note(): spa_load(rpool, config trusted): read 31 log space maps (31 total blocks - blksz = 131072 bytes) in 11 ms
1699903552 ffff9248456e4200 dprintf: brt.c:875:brt_vdevs_expand(): BRT VDEVs expanded from 0 to 1.
1699903552 ffff9248456e4200 dprintf: brt.c:724:brt_vdev_realloc(): BRT VDEV 0 initiated.
1699903552 ffff9248456e4200 dprintf: brt.c:783:brt_vdev_load(): MOS BRT VDEV com.fudosecurity:brt:vdev:0 loaded: mos_brtvdev=1427, mos_entries=1426
1699903553 ffff924846ca0000 metaslab.c:2487:metaslab_load_impl(): metaslab_load: txg 0, spa rpool, vdev_id 0, ms_id 214, smp_length 304800, unflushed_allocs 143360, unflushed_frees 143360, freed 0, defer 0 + 0, unloaded time 87087 ms, loading_time 6 ms, ms_max_size 8254824448, max size error 8254693376, old_weight 800000000000001, new_weight 800000000000001
1699903556 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83250, spa tank, vdev_id 0, ms_id 107, unflushed_allocs 6248960, unflushed_frees 2672128, appended 3176 bytes
1699903560 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83251, spa tank, vdev_id 0, ms_id 101, unflushed_allocs 414208, unflushed_frees 330240, appended 824 bytes
1699903565 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83252, spa tank, vdev_id 0, ms_id 15, unflushed_allocs 4703232, unflushed_frees 2959360, appended 2256 bytes
1699903570 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83253, spa tank, vdev_id 0, ms_id 111, unflushed_allocs 372224, unflushed_frees 1364480, appended 1360 bytes
1699903575 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83254, spa tank, vdev_id 0, ms_id 72, unflushed_allocs 2178560, unflushed_frees 1269760, appended 1416 bytes
1699903576 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83255, spa tank, vdev_id 0, ms_id 83, unflushed_allocs 519680, unflushed_frees 424960, appended 1592 bytes
1699903581 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83256, spa tank, vdev_id 0, ms_id 17, unflushed_allocs 0, unflushed_frees 8704, appended 24 bytes
1699903586 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83257, spa tank, vdev_id 0, ms_id 87, unflushed_allocs 3875328, unflushed_frees 1471488, appended 2016 bytes
1699903591 ffff924d4c8dc200 metaslab.c:3981:metaslab_flush(): flushing: txg 83258, spa tank, vdev_id 0, ms_id 0, unflushed_allocs 662016, unflushed_frees 643584, appended 544 bytes
@dpendolino I did the following using zfs-dkms-git from the AUR:
Create /etc/sysconfig/zfs to enable a debug build for dkms:
$ cat /etc/sysconfig/zfs
ZFS_DKMS_ENABLE_DEBUG=y
Download the patch from https://github.com/openzfs/zfs/pull/15517 into the AUR zfs-dkms-git package directory:
$ wget https://github.com/openzfs/zfs/pull/15517.patch
Modify PKGBUILD like this:
$ git diff
diff --git a/PKGBUILD b/PKGBUILD
index fb2ee51..c543757 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -10,7 +10,7 @@
#
pkgname=zfs-dkms-git
-pkgver=2.2.99.r63.g8e20e0ff39
+pkgver=2.2.99.r206.g786641dcf9
pkgrel=1
epoch=2
pkgdesc='Kernel modules for the Zettabyte File System.'
@@ -23,11 +23,14 @@ provides=("ZFS-MODULE=${pkgver}" "SPL-MODULE=${pkgver}" "${pkgname%-git}=${pkgve
conflicts=("${pkgname%-git}" 'spl-dkms')
replaces=('spl-dkms-git')
source=("git+https://github.com/openzfs/zfs.git"
- "0001-only-build-the-module-in-dkms.conf.patch")
+ "0001-only-build-the-module-in-dkms.conf.patch"
+ "15517.patch")
sha256sums=('SKIP'
- '539f325e56443554f9b87baff33948b91a280ec1daadcb0c636b105252fcd0f5')
+ '539f325e56443554f9b87baff33948b91a280ec1daadcb0c636b105252fcd0f5'
+ 'b6e536d07aefebe471627d4a93460032d8721c9005b8de287e57e9213faa4003')
b2sums=('SKIP'
- 'a8ab5da81d214e7801f0f8cdf77c076c714a3f17292df15ca35fcf7aef2c4d505348797e3b1da7590ea303ff488490ddba49e6f9e3f8a0bcc975894d51d97c2b')
+ 'a8ab5da81d214e7801f0f8cdf77c076c714a3f17292df15ca35fcf7aef2c4d505348797e3b1da7590ea303ff488490ddba49e6f9e3f8a0bcc975894d51d97c2b'
+ '6d734e2d5f95c43433471e051cbbab63def068f09c0ea9260e27e647da963a58610dff9f56247de35394e4f3c38763560f1ff8c9dd8d3706eb47d9a51a5bfccf')
pkgver() {
cd zfs
@@ -39,7 +42,8 @@ prepare() {
cd zfs
patch -p1 -i ../0001-only-build-the-module-in-dkms.conf.patch
-
+ patch -p1 -i ../15517.patch
+ sed -i "s/CDDL/GPL/g" META
# remove unneeded sections from module build
sed -ri "/AC_CONFIG_FILES/,/]\)/{
/AC_CONFIG_FILES/n
Create a snapshot before installing, and also clone zfs-utils-git from the AUR. Run makepkg -C for both, then install both at the same time with pacman -U:
pacman -U /home/user/zfs-dkms-git/zfs-dkms-git-2:2.2.99.r206.g786641dcf9-1-any.pkg.tar.zst \
  /home/user/zfs-utils-git/zfs-utils-git/zfs-utils-git-2:2.2.99.r206.g786641dcf9-1-x86_64.pkg.tar.zst
@amotin I also tried to trigger https://github.com/openzfs/zfs/issues/15485:
Nov 13 20:47:06 kernel: VERIFY(arc_released(db->db_buf)) failed
Nov 13 20:47:06 kernel: PANIC at dbuf.c:2135:dbuf_redirty()
Nov 13 20:47:06 kernel: Showing stack for process 7759
Nov 13 20:47:06 kernel: CPU: 6 PID: 7759 Comm: genvdso Tainted: G U OE 6.6.1-zen1-1-zen #1 a7d7fb502c6beb735ecb7063553d02465ffd614d
Nov 13 20:47:06 kernel: Hardware name: Dell Inc. Latitude E5470/06DNG5, BIOS 1.34.3 11/20/2022
Nov 13 20:47:06 kernel: Call Trace:
Nov 13 20:47:06 kernel: <TASK>
Nov 13 20:47:06 kernel: dump_stack_lvl+0x47/0x60
Nov 13 20:47:06 kernel: spl_panic+0x100/0x120 [spl 78c1a0ca9bb93336388f1ff6b9e2a94e93196bf8]
Nov 13 20:47:06 kernel: ? bplist_append+0x13e/0x1a0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: dbuf_redirty+0xbb/0xc0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: dbuf_dirty+0x101f/0x17e0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: ? dbuf_noread+0x176/0x400 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: dmu_write_impl+0xbf/0x150 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: dmu_write+0xcf/0x190 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: zfs_putpage+0x4a8/0x900 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: zpl_putfolio+0x89/0x1c0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: write_cache_pages+0xf2/0x390
Nov 13 20:47:06 kernel: ? __pfx_zpl_putfolio+0x10/0x10 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: zpl_writepages+0xb2/0x1d0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: ? __pfx_zpl_writepages+0x10/0x10 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: do_writepages+0x89/0x630
Nov 13 20:47:06 kernel: filemap_write_and_wait_range+0x109/0x160
Nov 13 20:47:06 kernel: zpl_fsync+0xb3/0x1d0 [zfs 643967e62c268d70848fe1e8056e02692ef1ed88]
Nov 13 20:47:06 kernel: __x64_sys_msync+0x1db/0x400
Nov 13 20:47:06 kernel: do_syscall_64+0x5d/0x90
Nov 13 20:47:06 kernel: ? exc_page_fault+0x7f/0x180
Nov 13 20:47:06 kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Nov 13 20:47:06 kernel: RIP: 0033:0x7f9076e93174
Nov 13 20:47:06 kernel: Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d f5 31 0d 00 00 74 13 b8 1a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 89 54 24 1c 48 89
Nov 13 20:47:06 kernel: RSP: 002b:00007ffdff0f3b58 EFLAGS: 00000202 ORIG_RAX: 000000000000001a
Nov 13 20:47:06 kernel: RAX: ffffffffffffffda RBX: 00007f9076fc8000 RCX: 00007f9076e93174
Nov 13 20:47:06 kernel: RDX: 0000000000000004 RSI: 0000000000000e00 RDI: 00007f9076fc7000
Nov 13 20:47:06 kernel: RBP: 00007f9076fc7000 R08: 0000000000000003 R09: 0000000000000000
Nov 13 20:47:06 kernel: R10: 00007f9076d97958 R11: 0000000000000202 R12: 00007ffdff0f4a06
Nov 13 20:47:06 kernel: R13: 00007ffdff0f3d40 R14: 00007f9077007000 R15: 0000000000000e00
Nov 13 20:47:06 kernel: </TASK>
It happens when running the OpenWrt kernel build, at this step:
Entering directory '/home/mt/Projects/weimarnetz/openwrt/build_dir/target-mips_24kc_musl/linux-ath79_generic/linux-5.15.137'
GENVDSO arch/mips/vdso/vdso-image.c
I'm not so familiar with Linux kernel builds, but building OpenWrt from scratch as in https://github.com/openzfs/zfs/issues/15485#issuecomment-1804233854 triggers it reliably here. The failed import also happened after a crash caused by this (using an older zfs git from 2023.10.26.r8843.g043c6ee3b6).
Hope this somehow helps.
The source of genvdso.c that triggers the problem looks like it's mmap() related:
wn/openwrt - [main] » cat ./build_dir/target-mips_24kc_musl/linux-ath79_generic/linux-5.15.137/arch/mips/vdso/genvdso.c
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2015 Imagination Technologies
* Author: Alex Smith <alex.smith@imgtec.com>
*/
/*
* This tool is used to generate the real VDSO images from the raw image. It
* first patches up the MIPS ABI flags and GNU attributes sections defined in
* elf.S to have the correct name and type. It then generates a C source file
* to be compiled into the kernel containing the VDSO image data and a
* mips_vdso_image struct for it, including symbol offsets extracted from the
* image.
*
* We need to be passed both a stripped and unstripped VDSO image. The stripped
* image is compiled into the kernel, but we must also patch up the unstripped
* image's ABI flags sections so that it can be installed and used for
* debugging.
*/
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <byteswap.h>
#include <elf.h>
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
/* Define these in case the system elf.h is not new enough to have them. */
#ifndef SHT_GNU_ATTRIBUTES
# define SHT_GNU_ATTRIBUTES 0x6ffffff5
#endif
#ifndef SHT_MIPS_ABIFLAGS
# define SHT_MIPS_ABIFLAGS 0x7000002a
#endif
enum {
ABI_O32 = (1 << 0),
ABI_N32 = (1 << 1),
ABI_N64 = (1 << 2),
ABI_ALL = ABI_O32 | ABI_N32 | ABI_N64,
};
/* Symbols the kernel requires offsets for. */
static struct {
const char *name;
const char *offset_name;
unsigned int abis;
} vdso_symbols[] = {
{ "__vdso_sigreturn", "off_sigreturn", ABI_O32 },
{ "__vdso_rt_sigreturn", "off_rt_sigreturn", ABI_ALL },
{}
};
static const char *program_name;
static const char *vdso_name;
static unsigned char elf_class;
static unsigned int elf_abi;
static bool need_swap;
static FILE *out_file;
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define HOST_ORDER ELFDATA2LSB
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define HOST_ORDER ELFDATA2MSB
#endif
#define BUILD_SWAP(bits) \
static uint##bits##_t swap_uint##bits(uint##bits##_t val) \
{ \
return need_swap ? bswap_##bits(val) : val; \
}
BUILD_SWAP(16)
BUILD_SWAP(32)
BUILD_SWAP(64)
#define __FUNC(name, bits) name##bits
#define _FUNC(name, bits) __FUNC(name, bits)
#define FUNC(name) _FUNC(name, ELF_BITS)
#define __ELF(x, bits) Elf##bits##_##x
#define _ELF(x, bits) __ELF(x, bits)
#define ELF(x) _ELF(x, ELF_BITS)
/*
* Include genvdso.h twice with ELF_BITS defined differently to get functions
* for both ELF32 and ELF64.
*/
#define ELF_BITS 64
#include "genvdso.h"
#undef ELF_BITS
#define ELF_BITS 32
#include "genvdso.h"
#undef ELF_BITS
static void *map_vdso(const char *path, size_t *_size)
{
int fd;
struct stat stat;
void *addr;
const Elf32_Ehdr *ehdr;
fd = open(path, O_RDWR);
if (fd < 0) {
fprintf(stderr, "%s: Failed to open '%s': %s\n", program_name,
path, strerror(errno));
return NULL;
}
if (fstat(fd, &stat) != 0) {
fprintf(stderr, "%s: Failed to stat '%s': %s\n", program_name,
path, strerror(errno));
close(fd);
return NULL;
}
addr = mmap(NULL, stat.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
0);
if (addr == MAP_FAILED) {
fprintf(stderr, "%s: Failed to map '%s': %s\n", program_name,
path, strerror(errno));
close(fd);
return NULL;
}
/* ELF32/64 header formats are the same for the bits we're checking. */
ehdr = addr;
if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0) {
fprintf(stderr, "%s: '%s' is not an ELF file\n", program_name,
path);
close(fd);
return NULL;
}
elf_class = ehdr->e_ident[EI_CLASS];
switch (elf_class) {
case ELFCLASS32:
case ELFCLASS64:
break;
default:
fprintf(stderr, "%s: '%s' has invalid ELF class\n",
program_name, path);
close(fd);
return NULL;
}
switch (ehdr->e_ident[EI_DATA]) {
case ELFDATA2LSB:
case ELFDATA2MSB:
need_swap = ehdr->e_ident[EI_DATA] != HOST_ORDER;
break;
default:
fprintf(stderr, "%s: '%s' has invalid ELF data order\n",
program_name, path);
close(fd);
return NULL;
}
if (swap_uint16(ehdr->e_machine) != EM_MIPS) {
fprintf(stderr,
"%s: '%s' has invalid ELF machine (expected EM_MIPS)\n",
program_name, path);
close(fd);
return NULL;
} else if (swap_uint16(ehdr->e_type) != ET_DYN) {
fprintf(stderr,
"%s: '%s' has invalid ELF type (expected ET_DYN)\n",
program_name, path);
close(fd);
return NULL;
}
*_size = stat.st_size;
close(fd);
return addr;
}
static bool patch_vdso(const char *path, void *vdso)
{
if (elf_class == ELFCLASS64)
return patch_vdso64(path, vdso);
else
return patch_vdso32(path, vdso);
}
static bool get_symbols(const char *path, void *vdso)
{
if (elf_class == ELFCLASS64)
return get_symbols64(path, vdso);
else
return get_symbols32(path, vdso);
}
int main(int argc, char **argv)
{
const char *dbg_vdso_path, *vdso_path, *out_path;
void *dbg_vdso, *vdso;
size_t dbg_vdso_size, vdso_size, i;
program_name = argv[0];
if (argc < 4 || argc > 5) {
fprintf(stderr,
"Usage: %s <debug VDSO> <stripped VDSO> <output file> [<name>]\n",
program_name);
return EXIT_FAILURE;
}
dbg_vdso_path = argv[1];
vdso_path = argv[2];
out_path = argv[3];
vdso_name = (argc > 4) ? argv[4] : "";
dbg_vdso = map_vdso(dbg_vdso_path, &dbg_vdso_size);
if (!dbg_vdso)
return EXIT_FAILURE;
vdso = map_vdso(vdso_path, &vdso_size);
if (!vdso)
return EXIT_FAILURE;
/* Patch both the VDSOs' ABI flags sections. */
if (!patch_vdso(dbg_vdso_path, dbg_vdso))
return EXIT_FAILURE;
if (!patch_vdso(vdso_path, vdso))
return EXIT_FAILURE;
if (msync(dbg_vdso, dbg_vdso_size, MS_SYNC) != 0) {
fprintf(stderr, "%s: Failed to sync '%s': %s\n", program_name,
dbg_vdso_path, strerror(errno));
return EXIT_FAILURE;
} else if (msync(vdso, vdso_size, MS_SYNC) != 0) {
fprintf(stderr, "%s: Failed to sync '%s': %s\n", program_name,
vdso_path, strerror(errno));
return EXIT_FAILURE;
}
out_file = fopen(out_path, "w");
if (!out_file) {
fprintf(stderr, "%s: Failed to open '%s': %s\n", program_name,
out_path, strerror(errno));
return EXIT_FAILURE;
}
fprintf(out_file, "/* Automatically generated - do not edit */\n");
fprintf(out_file, "#include <linux/linkage.h>\n");
fprintf(out_file, "#include <linux/mm.h>\n");
fprintf(out_file, "#include <asm/vdso.h>\n");
fprintf(out_file, "static int vdso_mremap(\n");
fprintf(out_file, " const struct vm_special_mapping *sm,\n");
fprintf(out_file, " struct vm_area_struct *new_vma)\n");
fprintf(out_file, "{\n");
fprintf(out_file, " current->mm->context.vdso =\n");
fprintf(out_file, " (void *)(new_vma->vm_start);\n");
fprintf(out_file, " return 0;\n");
fprintf(out_file, "}\n");
/* Write out the stripped VDSO data. */
fprintf(out_file,
"static unsigned char vdso_data[PAGE_ALIGN(%zu)] __page_aligned_data = {\n\t",
vdso_size);
for (i = 0; i < vdso_size; i++) {
if (!(i % 10))
fprintf(out_file, "\n\t");
fprintf(out_file, "0x%02x, ", ((unsigned char *)vdso)[i]);
}
fprintf(out_file, "\n};\n");
/* Preallocate a page array. */
fprintf(out_file,
"static struct page *vdso_pages[PAGE_ALIGN(%zu) / PAGE_SIZE];\n",
vdso_size);
fprintf(out_file, "struct mips_vdso_image vdso_image%s%s = {\n",
(vdso_name[0]) ? "_" : "", vdso_name);
fprintf(out_file, "\t.data = vdso_data,\n");
fprintf(out_file, "\t.size = PAGE_ALIGN(%zu),\n", vdso_size);
fprintf(out_file, "\t.mapping = {\n");
fprintf(out_file, "\t\t.name = \"[vdso]\",\n");
fprintf(out_file, "\t\t.pages = vdso_pages,\n");
fprintf(out_file, "\t\t.mremap = vdso_mremap,\n");
fprintf(out_file, "\t},\n");
/* Calculate and write symbol offsets to <output file> */
if (!get_symbols(dbg_vdso_path, dbg_vdso)) {
unlink(out_path);
fclose(out_file);
return EXIT_FAILURE;
}
fprintf(out_file, "};\n");
fclose(out_file);
return EXIT_SUCCESS;
}
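The part of genvdso.c that lines up with the dbuf_redirty() trace (msync, then zpl_fsync, then zfs_putpage) is the pattern of mapping a file MAP_SHARED, patching bytes through the mapping, and calling msync(MS_SYNC). Below is a minimal sketch of just that I/O pattern, in case it helps someone build a smaller test case. It is not a confirmed reproducer: `demo_mmap_patch_and_sync` is a hypothetical helper, and whether the target file needs to contain cloned blocks first is an open question in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of genvdso.c): map a file MAP_SHARED,
 * change one byte through the mapping, then force writeback with
 * msync(MS_SYNC). Returns 0 on success, -1 on any failure.
 */
static int
demo_mmap_patch_and_sync(const char *path, size_t offset, unsigned char value)
{
	struct stat st;
	unsigned char *addr;
	size_t len;
	int fd, rc;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0) {
		fprintf(stderr, "open '%s': %s\n", path, strerror(errno));
		return (-1);
	}
	/* Reject empty files and offsets past the end of the file. */
	if (fstat(fd, &st) != 0 || st.st_size <= 0 ||
	    offset >= (size_t)st.st_size) {
		close(fd);
		return (-1);
	}
	len = (size_t)st.st_size;

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close, as in map_vdso() */
	if (addr == MAP_FAILED) {
		fprintf(stderr, "mmap '%s': %s\n", path, strerror(errno));
		return (-1);
	}

	addr[offset] = value;			/* dirty a page via the mapping */
	rc = msync(addr, len, MS_SYNC);		/* synchronous writeback */
	if (rc != 0)
		fprintf(stderr, "msync '%s': %s\n", path, strerror(errno));
	munmap(addr, len);
	return (rc == 0 ? 0 : -1);
}
```

Calling this on a scratch file that lives on the affected dataset exercises the same mmap write plus msync path as the GENVDSO step, without needing a full OpenWrt build.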
This happened directly after triggering the issue and rebooting the machine due to the hang (also zfs debug build):
@mtippmann Your last panic looks like it can be a different flavor of the earlier one. I've extended my assertions patch to catch that scenario also. Though my patch only catches the consequences after reboot, I am still not sure what happens before. Is it coincidence to see block cloning involved there or it is the cause/trigger?
@amotin that was super helpful, thanks!
dpendolino@archlinux ~> zfs version
zfs-2.2.99-206_g786641dcf9
zfs-kmod-2.2.99-206_g786641dcf9
I tried the pool import from a separate install and here is the output from dmesg: dmesg.txt
> @mtippmann Your last panic looks like it can be a different flavor of the earlier one. I've extended my assertions patch to catch that scenario also. Though my patch only catches the consequences after reboot, I am still not sure what happens before. Is it coincidence to see block cloning involved there or it is the cause/trigger?
I'll rebuild and post the crash on next reboot - as for block cloning it's used but not intentionally by me.
tank bcloneused 363M -
tank bclonesaved 377M -
tank bcloneratio 2.03x
Arch has coreutils 9.4. I think cp --reflink=auto is the default starting from 9.0? All reports of this bug have been on Gentoo or Arch with recent enough versions of coreutils; Ubuntu 22.04 is still on 8.x. The pool was created with the zfs 2.2 feature set, so block cloning is active.
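To illustrate why block cloning can be in use without anyone asking for it, here is a rough sketch of what a "reflink=auto" style copy does: try to clone the file with the Linux FICLONE ioctl and silently fall back to an ordinary read/write copy if the filesystem refuses. This is only a sketch of the idea, not the coreutils implementation; `demo_copy_reflink_auto` is a hypothetical name, and the real cp has more fallbacks (for example copy_file_range) that are omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>	/* FICLONE */

/*
 * Hypothetical helper: copy src to dst, cloning when possible.
 * Returns 1 if the file was cloned, 0 if it was copied by the
 * read/write fallback, and -1 on error.
 */
static int
demo_copy_reflink_auto(const char *src, const char *dst)
{
	char buf[65536];
	ssize_t n = 0;
	int in, out, result;

	assert(src != NULL && dst != NULL);

	in = open(src, O_RDONLY);
	if (in < 0) {
		fprintf(stderr, "open '%s': %s\n", src, strerror(errno));
		return (-1);
	}
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (out < 0) {
		fprintf(stderr, "open '%s': %s\n", dst, strerror(errno));
		close(in);
		return (-1);
	}

	if (ioctl(out, FICLONE, in) == 0) {
		result = 1;	/* filesystem shared the blocks */
	} else {
		/* Clone not supported here: fall back to a plain copy. */
		result = 0;
		while (result == 0 && (n = read(in, buf, sizeof (buf))) > 0) {
			ssize_t off = 0;
			while (off < n) {
				ssize_t w = write(out, buf + off,
				    (size_t)(n - off));
				if (w < 0) {
					result = -1;
					break;
				}
				off += w;
			}
		}
		if (n < 0)
			result = -1;
	}

	close(in);
	if (close(out) != 0)
		result = -1;
	return (result);
}
```

On a pool with block cloning active, a return value of 1 from a helper like this would mean the copy went through the cloning path, which is the kind of unintentional use described above.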
It happened the first time on an encrypted pool, also running git at the time: building OpenWrt triggered the import panic after reboot. From a quick look at genvdso.c, it may be related to mmap(); something is off there.
I then recreated the pool without encryption. I can still trigger the oops when building OpenWrt, but import still seems to work fine (except now with the debug build).
I know that's not super helpful - I don't think I've hit that fixed bug regarding cloning on non-encrypted / encrypted datasets - so this here seems to be something else.
This appeared in dmesg after running the compile again
After reboot this appeared
I have to restore from backup as it is no longer possible to import the pool.
Correction: read-only import still works fine.
Last screenshots are with the updated patch from https://github.com/openzfs/zfs/pull/15517.
I'm happy to keep testing new patches if folks think we're close, but if not, then I may just need to rebuild in order to get my laptop up and running again.
On a pool with current git zfs and block cloning disabled the issue can't be triggered by me anymore. So I guess it's related to block cloning.
@mtippmann is it possible to disable block cloning on a pool that can't be imported read/write? I assume not, but it would be really nice to find a way to not have to rebuild.
@dpendolino The pool corruption has already happened; according to the panics you provided with my assertions patch, the ZIL really is corrupted. We should diagnose what is going on before the reboot: what causes the original panic, and probably some memory corruption that we later see as a corrupted ZIL.
@amotin gotcha, then let me know anything else you need from me, and I'll rebuild later tonight.
> @mtippmann is it possible to disable block cloning on a pool that can't be imported read/write? I assume not, but it would be really nice to find a way to not have to rebuild.
I don't know. The new pool without block cloning was created with zpool create -o compatibility=openzfs-2.1-linux. Unfortunately it was my work notebook that crashed, so I had to restore from the broken read-only pool and recreate the pool without block cloning. I may be able to test further on another machine, though, if I find the time.
I think I've found the cause of crash during the encrypted pool import: https://github.com/openzfs/zfs/pull/15543 -- encryption for block clone ZIL records was not done correctly. It does not explain the original crash you see during the build, that is likely a different issue.
PS: It will not fix already corrupted pools, only prevent new corruptions.
> I think I've found the cause of crash during the encrypted pool import: #15543 -- encryption for block clone ZIL records was not done correctly. It does not explain the original crash you see during the build, that is likely a different issue.
> PS: It will not fix already corrupted pools, only prevent new corruptions.
Tested https://github.com/openzfs/zfs/pull/15566 and https://github.com/openzfs/zfs/pull/15543 with zfs git on a pool just upgraded to all features in current git. Import still works (so #15543 seems to work), but it still crashes when building OpenWrt, during vdso generation.
@amotin unfortunately this was not a debug build; if it's useful I can rerun with debug enabled.
@robn this also only happens when block cloning is active, not without it. Might be interesting.
[mt@futro2 ~]$ zpool version
zfs-2.2.99-217_ga94860a6de
zfs-kmod-2.2.99-217_ga94860a6de
Nov 23 16:09:59 futro2 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Nov 23 16:09:59 futro2 kernel: #PF: supervisor read access in kernel mode
Nov 23 16:09:59 futro2 kernel: #PF: error_code(0x0000) - not-present page
Nov 23 16:09:59 futro2 kernel: PGD 15d7d8067 P4D 15d7d8067 PUD 1689a1067 PMD 0
Nov 23 16:09:59 futro2 kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Nov 23 16:09:59 futro2 kernel: CPU: 0 PID: 303 Comm: dp_sync_taskq Tainted: P U OE 6.6.1-arch1-1 #1 be166a630cd909acf8820643140e9106c6ea80e6
Nov 23 16:09:59 futro2 kernel: Hardware name: FUJITSU FUTRO S740/D3544-A1, BIOS V5.0.0.13 R1.13.0 for D3544-A1x 09/23/2022
Nov 23 16:09:59 futro2 kernel: RIP: 0010:arc_write+0x6c/0x450 [zfs]
Nov 23 16:09:59 futro2 kernel: Code: 7a 30 48 89 b5 50 ff ff ff 49 8b 72 40 4d 8b 5a 20 48 89 95 60 ff ff ff 4d 8b 42 28 41 8b 12 48 89 8d 58 ff ff ff 45 8b 72 38 <49> 8b 1c 24 89 bd 4c ff ff ff 48 89 b5 40 ff ff ff 65 48 8b 0c 25
Nov 23 16:09:59 futro2 kernel: RSP: 0000:ffffbef00152b9d0 EFLAGS: 00010282
Nov 23 16:09:59 futro2 kernel: RAX: ffffbef00152bb50 RBX: ffff98c86a3d7a08 RCX: ffff98c839e83650
Nov 23 16:09:59 futro2 kernel: RDX: 0000000000000001 RSI: ffffbef00152bb30 RDI: 0000000000000003
Nov 23 16:09:59 futro2 kernel: RBP: ffffbef00152baa0 R08: ffff98c86a3d7a08 R09: 0000000000000000
Nov 23 16:09:59 futro2 kernel: R10: ffffbef00152bab0 R11: ffffffffc06bcd30 R12: 0000000000000000
Nov 23 16:09:59 futro2 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Nov 23 16:09:59 futro2 kernel: FS: 0000000000000000(0000) GS:ffff98c87fc00000(0000) knlGS:0000000000000000
Nov 23 16:09:59 futro2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 23 16:09:59 futro2 kernel: CR2: 0000000000000000 CR3: 0000000159b3a000 CR4: 0000000000350ef0
Nov 23 16:09:59 futro2 kernel: Call Trace:
Nov 23 16:09:59 futro2 kernel: <TASK>
Nov 23 16:09:59 futro2 kernel: ? __die+0x23/0x70
Nov 23 16:09:59 futro2 kernel: ? page_fault_oops+0x171/0x4e0
Nov 23 16:09:59 futro2 kernel: ? zio_add_child_first+0x112/0x130 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? exc_page_fault+0x7f/0x180
Nov 23 16:09:59 futro2 kernel: ? asm_exc_page_fault+0x26/0x30
Nov 23 16:09:59 futro2 kernel: ? __pfx_dbuf_write_done+0x10/0x10 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? arc_write+0x6c/0x450 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? __pfx_dbuf_write_done+0x10/0x10 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? __pfx_dbuf_write_ready+0x10/0x10 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: dbuf_write+0x3d1/0x5d0 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? __pfx_dbuf_write_ready+0x10/0x10 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? __pfx_dbuf_write_done+0x10/0x10 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? __slab_free+0xf1/0x330
Nov 23 16:09:59 futro2 kernel: dbuf_sync_leaf+0x139/0x710 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? zpl_get_file_info+0x87/0x240 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? dbuf_rele_and_unlock+0xfa/0x500 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: dbuf_sync_list+0xc3/0x120 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: dnode_sync+0x413/0xae0 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? dnode_multilist_index_func+0x98/0xb0 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: sync_dnodes_task+0x89/0x180 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: taskq_thread+0x2c0/0x4e0 [spl e4ee0a961924a9370241e4c0dece4cb3c96731d0]
Nov 23 16:09:59 futro2 kernel: ? __pfx_default_wake_function+0x10/0x10
Nov 23 16:09:59 futro2 kernel: ? __pfx_sync_meta_dnode_task+0x10/0x10 [zfs 850e15125487c3a229178dc9ba52850ced0e1c95]
Nov 23 16:09:59 futro2 kernel: ? __pfx_taskq_thread+0x10/0x10 [spl e4ee0a961924a9370241e4c0dece4cb3c96731d0]
Nov 23 16:09:59 futro2 kernel: kthread+0xe5/0x120
Nov 23 16:09:59 futro2 kernel: ? __pfx_kthread+0x10/0x10
Nov 23 16:09:59 futro2 kernel: ret_from_fork+0x31/0x50
Nov 23 16:09:59 futro2 kernel: ? __pfx_kthread+0x10/0x10
Nov 23 16:09:59 futro2 kernel: ret_from_fork_asm+0x1b/0x30
Nov 23 16:09:59 futro2 kernel: </TASK>
Nov 23 16:09:59 futro2 kernel: Modules linked in: snd_sof_pci_intel_apl snd_sof_intel_hda_common soundwire_intel snd_sof_intel_hda_mlink soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils intel_pmc_bxt soundwire_generic_allocation intel_telemetry_pltdrv soundwire_bus intel_punit_ipc intel_telemetry_core snd_soc_avs x86_pkg_temp_thermal intel_powerclamp snd_soc_hda_codec coretemp snd_soc_skl kvm_intel snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc kvm snd_soc_sst_dsp snd_soc_acpi_intel_match snd_hda_codec_hdmi snd_soc_acpi irqbypass snd_soc_core snd_hda_codec_realtek crct10dif_pclmul crc32_pclmul snd_compress snd_hda_codec_generic crc32c_intel ac97_bus ledtrig_audio snd_pcm_dmaengine snd_hda_intel polyval_generic spi_pxa2xx_platform snd_intel_dspcfg dw_dmac gf128mul ghash_clmulni_intel sha512_ssse3 snd_intel_sdw_acpi aesni_intel crypto_simd mei_hdcp mei_pxp intel_rapl_msr ee1004 snd_hda_codec cryptd rapl snd_hda_core intel_cstate pcspkr processor_thermal_device_pci_legacy snd_hwdep
Nov 23 16:09:59 futro2 kernel: processor_thermal_device r8169 snd_pcm processor_thermal_rfim realtek processor_thermal_mbox processor_thermal_rapl mdio_devres snd_timer intel_lpss_pci i2c_i801 intel_lpss i2c_smbus libphy idma64 snd intel_rapl_common mei_me mei soundcore intel_soc_dts_iosf cfg80211 fujitsu_laptop sparse_keymap int3400_thermal int3403_thermal acpi_thermal_rel int3406_thermal int340x_thermal_zone rfkill mac_hid crypto_user loop fuse dm_mod ip_tables x_tables i915 i2c_algo_bit drm_buddy ttm intel_gtt xhci_pci drm_display_helper xhci_pci_renesas cec video wmi usbhid zfs(POE) spl(OE)
Nov 23 16:09:59 futro2 kernel: CR2: 0000000000000000
Nov 23 16:09:59 futro2 kernel: ---[ end trace 0000000000000000 ]---
Nov 23 16:09:59 futro2 kernel: RIP: 0010:arc_write+0x6c/0x450 [zfs]
Nov 23 16:09:59 futro2 kernel: Code: 7a 30 48 89 b5 50 ff ff ff 49 8b 72 40 4d 8b 5a 20 48 89 95 60 ff ff ff 4d 8b 42 28 41 8b 12 48 89 8d 58 ff ff ff 45 8b 72 38 <49> 8b 1c 24 89 bd 4c ff ff ff 48 89 b5 40 ff ff ff 65 48 8b 0c 25
Nov 23 16:09:59 futro2 kernel: RSP: 0000:ffffbef00152b9d0 EFLAGS: 00010282
Nov 23 16:09:59 futro2 kernel: RAX: ffffbef00152bb50 RBX: ffff98c86a3d7a08 RCX: ffff98c839e83650
Nov 23 16:09:59 futro2 kernel: RDX: 0000000000000001 RSI: ffffbef00152bb30 RDI: 0000000000000003
Nov 23 16:09:59 futro2 kernel: RBP: ffffbef00152baa0 R08: ffff98c86a3d7a08 R09: 0000000000000000
Nov 23 16:09:59 futro2 kernel: R10: ffffbef00152bab0 R11: ffffffffc06bcd30 R12: 0000000000000000
Nov 23 16:09:59 futro2 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Nov 23 16:09:59 futro2 kernel: FS: 0000000000000000(0000) GS:ffff98c87fc00000(0000) knlGS:0000000000000000
Nov 23 16:09:59 futro2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 23 16:09:59 futro2 kernel: CR2: 0000000000000000 CR3: 0000000159b3a000 CR4: 0000000000350ef0
Nov 23 16:09:59 futro2 kernel: note: dp_sync_taskq[303] exited with irqs disabled
Setting zfs_dmu_offset_next_sync=0 avoids the oops even if block cloning is active.
System information
Describe the problem you're observing
This is an encrypted single disk root pool that will no longer boot. Any attempts to import the pool on a live environment causes the following page fault:
dmesg.txt
Describe how to reproduce the problem
Include any warning/errors/backtraces from the system logs
With the readonly flag set, the pool will import and the data is there.