happyfish100 / fastdfs

FastDFS is an open source, high performance distributed file system (DFS). Its major functions include file storing, file syncing and file accessing, and it is designed for high capacity and load balancing. Wechat/Weixin public account (Chinese Language): fastdfs
GNU General Public License v3.0

storage exits abnormally, kernel: dio-p00-r[3][30232]: segfault at 0 ip (null) sp 00007f494735fd78 error 14 in fdfs_storaged[400000+58000] #659

Open heshc opened 11 months ago

heshc commented 11 months ago

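Running the latest versions (fastdfs 6.9.5, libfastcommon-1.0.69, libserverframe-1.1.29), the storage exits abnormally after running for a while, and the storage log shows nothing abnormal. The server's system log contains one kernel error:
kernel: dio-p00-r[3][30232]: segfault at 0 ip (null) sp 00007f494735fd78 error 14 in fdfs_storaged[400000+58000]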

heshc commented 11 months ago

The same thing happened again during operation: the storage exited abnormally. Could you please help track down the cause?
Sep 27 15:15:29 localhost kernel: dio-p00-r[0][31316]: segfault at 0 ip (null) sp 00007f02bf838eb8 error 14 in fdfs_storaged[400000+58000]
Sep 27 15:15:29 localhost abrt-hook-ccpp: Process 31305 (fdfs_storaged) of user 0 killed by SIGSEGV - dumping core
Sep 27 15:15:29 localhost systemd-logind: Removed session 3990.
Sep 27 15:15:29 localhost abrt-server: Executable '/usr/bin/fdfs_storaged' doesn't belong to any package and ProcessUnpackaged is set to 'no'
Sep 27 15:15:29 localhost abrt-server: 'post-create' on '/var/spool/abrt/ccpp-2023-09-27-15:15:29-31305' exited with 1
Sep 27 15:15:29 localhost abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2023-09-27-15:15:29-31305'
Sep 27 15:19:21 localhost kernel: [resguard_linux INFO filter.c:1646]: Trace chroot, /usr/sbin/sshd|0 -> /var/empty/sshd
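
For readers decoding the kernel segfault line above, here is a minimal sketch. It assumes the x86 page-fault error-code bit layout and that the kernel prints the `error` field in hexadecimal. The function name `decode_segfault` and the flag labels are illustrative and are not part of FastDFS.

```python
import re

# Assumed x86 page-fault error-code bits: (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read or fetch access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>\(null\)|[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Parse a kernel 'segfault at ...' line; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumption: the field is hexadecimal
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if err & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    ip_text = m.group("ip")
    return {
        "addr": int(m.group("addr"), 16),
        "ip": 0 if ip_text == "(null)" else int(ip_text, 16),
        "error": err,
        "flags": flags,
    }


line = ("kernel: dio-p00-r[3][30232]: segfault at 0 ip (null) "
        "sp 00007f494735fd78 error 14 in fdfs_storaged[400000+58000]")
print(decode_segfault(line))
```

Under these assumptions, `error 14` reads as 0x14: a user-mode instruction fetch from a page that is not present. Combined with `ip (null)`, that is consistent with a call through a null function pointer, though only a backtrace can confirm where it happens.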

happyfish100 commented 11 months ago

Could you enable core dumps, so that the call stack can be inspected?

heshc commented 10 months ago

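After enabling core dumps, storaged exited abnormally again but no core file was produced. To check whether the core dump configuration works, I ran a test script, and it produced a core file normally. Does storaged need any other configuration?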
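
One thing that may be worth checking when no core file appears, sketched under assumptions: the earlier system log shows `abrt-hook-ccpp` handling the crash and `abrt-server` deleting the problem directory because the executable does not belong to a package, so the core may be piped to a handler instead of being written to disk. The helper names below are hypothetical; the files read are the standard Linux `/proc` entries, and the limit that matters is the one of the running storaged process, not of the shell that ran the test script.

```python
from pathlib import Path


def describe_core_pattern(pattern):
    """Say whether kernel.core_pattern writes a file or pipes to a handler."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        parts = pattern[1:].split()
        handler = parts[0] if parts else ""
        return "piped to handler: " + handler
    return "written to file pattern: " + pattern


def core_soft_limit(limits_text):
    """Extract the soft 'Max core file size' value from /proc/<pid>/limits text."""
    prefix = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            fields = line[len(prefix):].split()
            return fields[0] if fields else None
    return None


def report(pid):
    """Print the core settings that apply to the process with the given pid."""
    pattern = Path("/proc/sys/kernel/core_pattern").read_text()
    limits = Path(f"/proc/{pid}/limits").read_text()
    print(describe_core_pattern(pattern))
    print("core soft limit:", core_soft_limit(limits))
```

Calling `report(pid)` with the pid of the running `fdfs_storaged` process prints both settings. A soft limit of `0`, or a pattern piped to a handler that discards the dump, would each explain a missing core file; neither is confirmed as the cause in this thread.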


lystormenvoy commented 10 months ago

I ran into a similar problem:

[root@ava tmp]# gdb /usr/bin/fdfs_storaged core.fdfs_storaged.0.e5cecc6ddb184239bbfa09709e26832e.3349798.1696971935000000
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/fdfs_storaged...done.
[New LWP 455363]
[New LWP 455340]
[New LWP 455343]
[New LWP 455341]
[New LWP 455342]
[New LWP 455348]
[New LWP 455344]
[New LWP 455345]
[New LWP 455358]
[New LWP 455350]
[New LWP 455359]
[New LWP 455347]
[New LWP 455361]
[New LWP 455349]
[New LWP 455351]
[New LWP 455353]
[New LWP 455354]
[New LWP 455355]
[New LWP 455337]
[New LWP 455362]
[New LWP 455357]
[New LWP 455346]
[New LWP 455356]
[New LWP 455360]
[New LWP 455352]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/fdfs_storaged /etc/fdfs/storage.conf restart'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x000000000042272a in dio_thread_entrance (arg=0x10e00c8) at storage_dio.c:765
#2  0x00007f4259666ea5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4258f129fd in clone () from /lib64/libc.so.6
(gdb) q

happyfish100 commented 10 months ago

The latest v6.10 already fixes this problem. That version will be released soon, so please stay tuned.