Open arahnale opened 7 months ago
This reminds me of #15745, could you possibly try a memtest for a bit?
I don't think this is a hardware problem: it started appearing on all servers after we upgraded from ZFS 2.1.13. The kernels are 5.4 and 5.15, and the servers use ECC RAM. There was no problem on 2.1.13.
Module options used at load time:
options zfs zfs_autoimport_disable=0
options zfs zfs_nocacheflush=1
options zfs zfs_prefetch_disable=1
options zfs zfs_dmu_offset_next_sync=0
options zfs zfs_arc_max=32212254720
options zfs zfs_arc_meta_limit_percent=100
options zfs zfs_arc_dnode_limit_percent=75
It might not be, but could you try it while I try reproducing it locally at the moment?
For now, I will try to take the load off the server. We noticed that the problem reproduces when the ARC maximum size is increased:
echo 107374182400 > /sys/module/zfs/parameters/zfs_arc_max
I had an automation script running that increased the size of zfs_arc_max depending on the needs of the file system.
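A rough sketch of what such an automation step might look like is below. This is not the script that was running on these servers: the function names `gib_to_bytes` and `set_arc_max` and the overridable `ARC_MAX_PARAM` variable are assumptions made for illustration. Only the sysfs path and the 100 GiB value are taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper, NOT the reporter's actual automation script.
# ARC_MAX_PARAM can be overridden so the function can be tried on a scratch file.
ARC_MAX_PARAM="${ARC_MAX_PARAM:-/sys/module/zfs/parameters/zfs_arc_max}"

# Convert whole GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Write a new ARC limit, given in GiB, to the module parameter file.
set_arc_max() {
    gib_to_bytes "$1" > "$ARC_MAX_PARAM"
}
```

With the default path, `set_arc_max 100` (run as root) writes 107374182400, the same value as the `echo` above; `gib_to_bytes 30` gives 32212254720, the value in the module options.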
I will try to reproduce this problem.
I removed zfs_arc_max from the settings, and the ARC then grew to 50% of the server's total memory. When the server ran out of memory, the kernel panicked again.
I'm still writing a reproduction test.
Can you tell us anything about the settings on the dataset(s) involved, or the pool configuration?
Because that crash seems to just be in trying to memset a wild pointer, which is not the most informative for why it might have one of those in the first place.
I don't think Ubuntu has prebuilt kASAN kernels, but that would probably be what I'd like to test against if you somehow have one lying around...
I use a mirrored zpool:
zpool status
  pool: zpool
 state: ONLINE
config:

    NAME                                             STATE     READ WRITE CKSUM
    zpool                                            ONLINE       0     0     0
      mirror-0                                       ONLINE       0     0     0
        nvme-INTEL_SSDPE2KX080T8_BTLJ122300VB8P0HGN  ONLINE       0     0     0
        nvme-INTEL_SSDPE2KX080T8_BTLJ1223012X8P0HGN  ONLINE       0     0     0
The zpool was created as a mirror with this command:
zpool create -o ashift=12 -o autotrim=off zpool mirror $NVME1 $NVME2 -f
Created 3 datasets
zfs set logbias=throughput zpool
zfs set sync=standard zpool
zfs set atime=off zpool
zfs set primarycache=all zpool
zfs set secondarycache=all zpool
zfs create zpool/home
zfs create zpool/mysql
zfs create zpool/reserved
zfs set recordsize=32K zpool/home
zfs set recordsize=16K zpool/mysql
zfs set reservation=100G zpool/reserved
zfs set sync=standard zpool/home
zfs set sync=standard zpool/mysql
zfs set sync=standard zpool
zfs set primarycache=metadata zpool/mysql
zfs set secondarycache=none zpool/mysql
I haven't been able to synthetically reproduce the core crash yet.
My kernel has CONFIG_KASAN_STACK enabled:
grep CONFIG_KASAN /boot/config-5.4.0-162-generic
# CONFIG_KASAN is not set
CONFIG_KASAN_STACK=1
Is this suitable for debugging?
I think those are heap crashes, so that won't help here.
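A side note on the config output quoted above: the `# CONFIG_KASAN is not set` line means KASAN itself is disabled in that kernel, so as far as I can tell the `CONFIG_KASAN_STACK=1` value has no effect there. A minimal sketch of that check follows. The function name `kasan_enabled` is made up for illustration, and the usage line assumes the usual `/boot/config-<release>` layout.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given kernel config file
# has KASAN itself switched on (CONFIG_KASAN=y).
kasan_enabled() {
    grep -q '^CONFIG_KASAN=y$' "$1"
}

# Example usage (prints nothing unless uncommented):
# if kasan_enabled "/boot/config-$(uname -r)"; then echo "KASAN on"; else echo "KASAN off"; fi
```

Run against the config shown above, the check fails, because only the `CONFIG_KASAN_STACK` line is present as a set option.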
I was able to reproduce the kernel crash with ZFS 2.2.2. I wrote a program that leaks memory:
#include <stdio.h>
#include <stdlib.h>   /* realloc, exit */
#include <string.h>   /* strlen, memcpy */

typedef struct {
    char *str;
    unsigned long long len;
} str_t;

/* Build str1 by appending str2 repeatedly until str1 holds at least n bytes.
 * Nothing is ever freed: the point of the program is to consume memory. */
void _mymalloc(str_t *str1, const str_t *str2, unsigned long long n) {
    /* long int size_1gb = 1024 * 1024 * 1024; // size in bytes (1 gigabyte) */
    str1->str = NULL;
    str1->len = 0;
    printf("Start _malloc %llu\n", n);
    while (str1->len < n) {
        /* Room for the current contents, one more copy of str2, and the NUL. */
        char *p = realloc(str1->str, str1->len + str2->len + 1);
        if (p == NULL) {
            perror("realloc");
            exit(EXIT_FAILURE);
        }
        str1->str = p;
        /* memcpy at a known offset avoids strcat on an uninitialized buffer. */
        memcpy(str1->str + str1->len, str2->str, str2->len);
        str1->len += str2->len;
        str1->str[str1->len] = '\0';
    }
}

int main(void) {
    printf("Start\n");
    str_t temp;
    str_t temp1, temp2, temp3, temp4;
    temp.str = "abracadabra";
    temp.len = strlen(temp.str);
    /* ULL keeps the size arithmetic out of int, where 1024 * 1024 * 4024 overflows. */
    _mymalloc(&temp1, &temp, 1024ULL);
    _mymalloc(&temp2, &temp1, 1024ULL * 1024 * 10);
    _mymalloc(&temp3, &temp2, 1024ULL * 1024 * 1024);
    _mymalloc(&temp4, &temp3, 1024ULL * 1024 * 4024);
    return 0;
}
and then started a load on the dataset:
while true; do dd if=/dev/urandom count=10M bs=1 | bzip2 -1 > /home/random.bin ; rm -f /home/random.bin ; done
I am ready to provide a server for testing if you agree to help with testing.
This problem still appears on the latest versions of ZFS, 2.1.15 and 2.2.4. The only thing that stopped the kernel panics for us was rolling back to 2.1.4. Please look into the problem; the latest versions are NOT STABLE. The changes that broke the ZFS code base were made between releases 2.1.4 and 2.1.15, and the problems then carried over into the 2.2 code base. We kindly ask you to check all commits from 2.1.4 to 2.1.15, because this bug will only make things worse in the future.
System information
Describe the problem you're observing
A kernel panic occurs randomly, about one to three hours after the system starts working.
Describe how to reproduce the problem
The problem could not be reproduced on purpose.
Include any warning/errors/backtraces from the system logs
Kernel log
```
Jan 18 00:29:20 vh422 kernel: python3[18546]: segfault at 0 ip 0000000000000000 sp 00007ffd34d1afd8 error 14 in python3.6[400000+3af000]
Jan 18 00:29:20 vh422 kernel: Code: Bad RIP value.
Jan 18 00:29:20 vh422 kernel: nginx[64996]: segfault at 8 ip 000055d3696706ab sp 00007ffcca139e20 error 4 in nginx[55d3694c5000+f94000]
Jan 18 00:29:20 vh422 kernel: Code: 89 8e 00 e9 52 ff ff ff 48 8d 15 a8 89 8e 00 e9 46 ff ff ff e8 b6 24 fc ff 66 0f 1f 44 00 00 48 83 ec 08 4c 8b 16 48 8b 46 08 <48> 8b 77 08 48 8b 3f 45 31 c9 51 52 45 31 c0 44 89 d1 48 89 c2 e8
Jan 18 00:29:20 vh422 kernel: BUG: unable to handle page fault for address: ffff92bc826df000
Jan 18 00:29:20 vh422 kernel: #PF: supervisor write access in kernel mode
Jan 18 00:29:20 vh422 kernel: #PF: error_code(0x0002) - not-present page
Jan 18 00:29:20 vh422 kernel: PGD 3f50801067 P4D 3f50801067 PUD 27d903a063 PMD 27c26de063
Jan 18 00:29:20 vh422 kernel: BAD
Jan 18 00:29:20 vh422 kernel: Oops: 0002 [#1] SMP NOPTI
Jan 18 00:29:20 vh422 kernel: CPU: 41 PID: 22347 Comm: apache2 Tainted: P OE 5.4.0-162-generic #179~18.04.1-Ubuntu
Jan 18 00:29:20 vh422 kernel: Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 3.8a 10/28/2022
Jan 18 00:29:20 vh422 kernel: RIP: 0010:memset_erms+0x9/0x10
Jan 18 00:29:20 vh422 kernel: Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1
```
Full log in the attachment kernel.log