cloudwu / skynet

A lightweight online game framework
MIT License

Crash when freeing OpenSSL memory at process exit #1517

Closed yxhtx closed 2 years ago

yxhtx commented 2 years ago

Linux version 3.10.0-327.el7.x86_64 gcc (GCC) 6.3.0 glibc 2.17

After using tlshelper to load tls.so, the process crashes on exit:

0 atomic_load_p (mo=atomic_memory_order_relaxed, a=0x0) at include/jemalloc/internal/atomic.h:62

1 rtree_leaf_elm_bits_read (dependent=true, elm=0x0, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:175

2 rtree_leaf_elm_szind_read (dependent=true, elm=0x0, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:227

3 rtree_szind_read (dependent=true, key=14, rtree_ctx=, rtree=, tsdn=) at include/jemalloc/internal/rtree.h:434

4 arena_salloc (ptr=0xe, tsdn=) at include/jemalloc/internal/arena_inlines_b.h:191

5 isalloc (ptr=0xe, tsdn=) at include/jemalloc/internal/jemalloc_internal_inlines_c.h:38

6 je_malloc_usable_size (ptr=ptr@entry=0xe) at src/jemalloc.c:3740

7 0x00000000004118d1 in clean_prefix (ptr=0xe <Address 0xe out of bounds>) at skynet-src/malloc_hook.c:129

8 free (ptr=0xe) at skynet-src/malloc_hook.c:242

9 0x00007f1e360ccb41 in OPENSSL_sk_pop_free () from /usr/local/openssl/lib/libcrypto.so.1.1

10 0x00007f1e363faf99 in ssl_library_stop () from /usr/local/openssl/lib/libssl.so.1.1

11 0x00007f1e3605bfb2 in OPENSSL_cleanup () from /usr/local/openssl/lib/libcrypto.so.1.1

12 0x00007f1e70290e69 in __run_exit_handlers () from /lib64/libc.so.6

13 0x00007f1e70290eb5 in exit () from /lib64/libc.so.6

14 0x00007f1e70279b1c in __libc_start_main () from /lib64/libc.so.6

15 0x000000000040923a in _start ()

I read the description in #731 and also went through the OpenSSL code, but I could not find anything resembling the __libc_memalign code there.

cloudwu commented 2 years ago

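To determine whether this is a problem similar to #731, you can turn jemalloc off. This is done by defining a macro; on macosx, for example, it is already turned off: https://github.com/cloudwu/skynet/blob/master/platform.mk#L37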

Here free is being called on 0xe, which looks like NULL plus a small offset.
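
As an illustration of how a pointer value such as 0xe can come from "NULL plus a small offset", the sketch below computes the address of a struct member relative to a base address of zero. The `record` layout and the `payload_address` helper are hypothetical, chosen only so that the member sits at offset 14; they are not taken from OpenSSL or skynet, and the thread does not establish that this is what actually happened in this crash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen only so that `payload` sits at offset 14 (0xe).
 * It is not taken from OpenSSL or skynet. */
struct record {
    char header[14];
    char payload[2];
};

/* Address that `&r->payload` would correspond to for a given base address.
 * Integer arithmetic is used so that no invalid pointer is ever formed
 * or dereferenced. */
static uintptr_t payload_address(uintptr_t base) {
    return base + offsetof(struct record, payload);
}
```

Calling `payload_address(0)` models the case where the base pointer is NULL: the result is a small non-zero value that passes a plain `ptr != NULL` check but is not a valid heap address, which is the kind of value free received in the backtrace.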

yxhtx commented 2 years ago

0 __GI___libc_free (mem=0xe) at malloc.c:3111

1 0x00007efe2a0246b1 in OPENSSL_sk_pop_free () from /lib64/libcrypto.so.1.1

2 0x00007efe2a353b89 in ssl_library_stop () from /lib64/libssl.so.1.1

3 0x00007efe29fb4f62 in OPENSSL_cleanup () from /lib64/libcrypto.so.1.1

4 0x00007efe3d7ef4a1 in run_exit_handlers (status=0, listp=0x7efe3db6a738 <exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108

5 0x00007efe3d7ef58a in __GI_exit (status=) at exit.c:139

6 0x00007efe3d7da015 in __libc_start_main (main=0x407580, argc=3, argv=0x7fff7d979b48, init=, fini=, rtld_fini=<optimized out>, stack_end=0x7fff7d979b38) at ../csu/libc-start.c:342

7 0x0000000000407bea in _start () at ../sysdeps/x86_64/start.S:120

I upgraded glibc to 2.29 and turned jemalloc off as well, but it still crashes. Are there any other ideas I could try?

cloudwu commented 2 years ago

Could it be related to this issue? https://github.com/cloudwu/skynet/issues/1314

@lvzixun

lvzixun commented 2 years ago

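Now, once openssl has been required, ltls_init_destructor is no longer called proactively by default. https://github.com/cloudwu/skynet/blob/master/lualib-src/ltls.c#L419-L433 Do you require any other lib on your side that uses openssl? Could some other lib be calling the OpenSSL cleanup interface?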
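
The concern raised here is that library-wide cleanup may run more than once: the backtraces show OPENSSL_cleanup being invoked from the exit handlers, so any additional explicit cleanup elsewhere would touch state that has already been released. A minimal sketch of that hazard and of an idempotent guard follows. All names (`lib_init`, `lib_cleanup`, `g_table`, `g_cleanup_calls`) are hypothetical stand-ins, not OpenSSL or skynet APIs, and this is not the actual fix made in ltls.c.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a library's global state; not OpenSSL's real data. */
static int *g_table = NULL;
static int g_cleanup_calls = 0;   /* counts cleanups that actually freed memory */

/* Allocate the global state once; repeated calls are harmless. */
static void lib_init(void) {
    if (g_table == NULL)
        g_table = calloc(16, sizeof *g_table);
}

/* Idempotent cleanup: free once, then reset the pointer so that a second
 * call (for example one explicit call plus one from an exit handler)
 * becomes a no-op instead of freeing stale memory. */
static void lib_cleanup(void) {
    if (g_table == NULL)
        return;
    free(g_table);
    g_table = NULL;
    g_cleanup_calls++;
}
```

With this guard, calling `lib_init()` followed by two calls to `lib_cleanup()` frees the table exactly once. Without the NULL check and reset, the second call would pass an already freed pointer to free, which is undefined behavior and can surface as a crash deep inside the allocator, as in the traces above.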

yxhtx commented 2 years ago

My code is at version 1.4.0. After syncing ltls.c from master the problem is gone. Thanks.