idealvin / coost

A tiny boost library in C++11.

Hook not taking effect #351

Open eVen-p opened 9 months ago

eVen-p commented 9 months ago

Hi

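On Linux I've run into what looks like the hook not taking effect: some system APIs block the thread, and after hooking they still block the thread (rather than blocking only the coroutine). I wrote a small demo to verify this and found that the hook works when the demo links against the static library libco.a, but does not appear to take effect when it links against the shared library libco.so.

Here is the demo code: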

// test.cpp
#include "co/co.h"
#include "co/os.h"
#include <iostream>

void f() {
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd == -1) {
        std::cout << "invalid socket" << std::endl;
        return;
    }        

    struct sockaddr_in serverAddr;
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(5064); // some port
    serverAddr.sin_addr.s_addr = inet_addr("xx.xx.xx.xx"); // some IP address

    // blocking API 1
    if (connect(sockfd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) == -1) {
        std::cout << "connect failed!" << std::endl;
        return;
    }

    std::cout << "connected" << std::endl;

    char buffer[1024];
    ssize_t numBytes = recv(sockfd, buffer, 1023, 0); // blocking API 2 (recv returns ssize_t, so -1 is representable)
    if (numBytes == -1) {
        std::cout << "recv failed" << std::endl;
    } else {
        std::cout << "good" << std::endl;
    }

    close(sockfd);
}

int main() {
    co::wait_group wp(1);

    for(int i = 0; i < os::cpunum(); i++) {
        go(f);
    }

    go([&wp] {
        // if every thread is blocked, this never runs
        std::cout << "hi" << std::endl;
        wp.done();
    });

    std::cout << "start to wait" << std::endl;

    wp.wait();
    std::cout << "ok" << std::endl;

    co::sleep(1000);

    return 0;
}

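The end result: linked against libco.a, the program exits normally; linked against the shared library libco.so, it blocks and never exits.

I tried both 3.0.0 and 3.0.1 and saw the same behavior. Both libco.a and libco.so were built with xmake.

I also tried building 3.0.0 with cmake, and there the program fails to exit with either the static or the shared library. Am I using the library incorrectly?

Here is the CMakeLists.txt for my demo: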

cmake_minimum_required(VERSION 3.13)
project(test)

# aux_source_directory(<dir> <variable>): collect all source files in the given directory and store the result in the given variable.
aux_source_directory(${CMAKE_CURRENT_SOURCE_DIR}/../src SOURCE_SRC)

set(EXECUTABLE_OUTPUT_PATH ../out)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -ldl -lpthread")

include_directories(${CMAKE_CURRENT_SOURCE_DIR}/../include/)

MESSAGE(STATUS "\n\n==== Starting ${PROJECT_NAME} CMAKE Build ====")

add_executable(${PROJECT_NAME} ${SOURCE_SRC})

target_link_libraries(${PROJECT_NAME}
                            ${CMAKE_CURRENT_SOURCE_DIR}/../lib/libco.so
                            )

install(TARGETS ${PROJECT_NAME} DESTINATION bin)

Demo directory layout:

├── build
│   └── CMakeLists.txt
├── include
│   └── co...
├── src
│   └── test.cpp
├── lib
│   └── libco.a
└── out

System: CentOS 7, gcc 4.8.5

idealvin commented 9 months ago

@eVen-p For the shared library, put libco.so first when linking.

eVen-p commented 9 months ago

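> @eVen-p For the shared library, put libco.so first when linking.

@idealvin I tried skipping cmake and compiling the demo directly on the command line, making sure libco comes first among the -l options: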

g++ -std=c++11 test.cpp -o ../out/test -I/my_path_to_libco/include/ -L/my_path_to_libco/lib -lco -ldl -lpthread

I then ran the demo, and the result was still the same.

idealvin commented 9 months ago

    co::wait_group wp(os::cpunum());      // initial value should match the number of coroutines that call done()

    for(int i = 0; i < os::cpunum(); i++) {
        go(f);
    }

    go([wp] {   // -------------- capture by value
        // if every thread is blocked, this never runs
        std::cout << "hi" << std::endl;
        wp.done();
    });

    std::cout << "start to wait" << std::endl;

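    wp.wait();

eVen-p commented 9 months ago

Thanks for the correction. I've changed wp to be captured by value.

As for the wait_group initial value: the original value of 1 already matches the number of done() calls. The go([wp] ...) below calls it only once, and the go calls inside the for loop don't touch the wait_group.

I tried both changes and the demo still behaves the same; it still looks like the thread is what gets blocked. I also added prints to the functions in hook.cc: the demo's two calls, connect and recv, never reach them, so they are not being hooked.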

idealvin commented 9 months ago

Is the peer sending any data? If not, "recv" will just hang.

Also, start the program with -co_hook_log to print hook-related logs and check whether the hook is working.

eVen-p commented 9 months ago

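> Is the peer sending any data? If not, "recv" will just hang.

The peer is not sending data. The idea behind the demo is to make this API block: if the thread is what gets blocked, main stays stuck in wp.wait() and cannot exit; if only the coroutine is blocked, main can exit normally. Right now the program exits normally with the static library libco.a, but with the shared library libco.so it never exits and just hangs.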
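
To make the thread-versus-coroutine distinction above concrete, here is a minimal sketch of the OS-level behavior a coroutine-aware recv can build on. It is not coost's code: it assumes a POSIX system, and the helper name recv_would_block is made up for illustration. On a non-blocking socket with no pending data, recv returns -1 with errno set to EAGAIN or EWOULDBLOCK instead of parking the thread, which is what gives a scheduler the chance to suspend only the calling coroutine.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not part of coost): returns true if recv() on an
// empty, non-blocking socket reports "would block" instead of blocking.
bool recv_would_block() {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1) return false;

    // Switch one end of the pair to non-blocking mode.
    int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }

    char buf[16];
    // The peer never sends anything, so there is nothing to read.
    ssize_t n = recv(fds[0], buf, sizeof(buf), 0);
    int err = errno;  // save errno before close() can overwrite it
    bool would_block = (n == -1) && (err == EAGAIN || err == EWOULDBLOCK);

    close(fds[0]);
    close(fds[1]);
    return would_block;
}
```

If the hook is bypassed and the socket stays in blocking mode, the same recv call simply does not return until data arrives, which matches the hang described in the demo.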

idealvin commented 9 months ago

It won't block the thread.

Do you have the hook logs?

eVen-p commented 9 months ago

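I tried a different system: on macOS everything works and the demo behaves as expected.

As for the hook logs, I ran with -co_hook_log but saw no extra output in the terminal. I've already added flag::parse(argc, argv); at the top of main. I also manually changed DEF_bool(co_hook_log, false, ... to true and still saw no additional output.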

idealvin commented 9 months ago

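> I tried a different system: on macOS everything works and the demo behaves as expected.
>
> As for the hook logs, I ran with -co_hook_log but saw no extra output in the terminal. I've already added flag::parse(argc, argv); at the top of main. I also manually changed DEF_bool(co_hook_log, false, ... to true and still saw no additional output.

The hook mechanism on macOS differs from the one on Linux.

Logs are written to a file by default; add the -cout flag to print them to the terminal.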
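
For readers unfamiliar with how a function such as connect can be hooked on Linux, the sketch below shows generic symbol interposition with dlsym(RTLD_NEXT, ...). It is an illustration under assumptions (Linux with glibc; the counter g_hook_calls and the accessor hook_calls are invented names) and is not claimed to be coost's hook.cc. The point is that the wrapper only runs if its definition is found before libc's in the dynamic linker's symbol search order.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for RTLD_NEXT on glibc
#endif
#include <dlfcn.h>
#include <sys/socket.h>

// Counts how often the wrapper ran (illustration only).
static int g_hook_calls = 0;

// Pointer type matching the libc connect() signature.
typedef int (*connect_fn)(int, const struct sockaddr*, socklen_t);

// A definition named "connect" with C linkage. Calls bind to it instead of
// libc's version when it comes first in the symbol search order.
extern "C" int connect(int fd, const struct sockaddr* addr, socklen_t len) {
    // Look up the next definition of "connect" after this object,
    // which is normally the one in libc.
    static connect_fn real_connect =
        reinterpret_cast<connect_fn>(dlsym(RTLD_NEXT, "connect"));
    ++g_hook_calls;
    if (!real_connect) return -1;
    // A coroutine library would add its scheduling logic around this call.
    return real_connect(fd, addr, len);
}

// Accessor so callers can check whether the wrapper was reached.
int hook_calls() { return g_hook_calls; }
```

On glibc versions older than 2.34 this needs -ldl at link time. If another library that also defines connect is searched first, the wrapper is silently skipped.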

eVen-p commented 9 months ago

On Linux with the static library, the demo prints the following (os::cpunum() == 2):

start to wait
hi
ok
connected
connected
D0104 09:15:58.877 2413 hook.cc:209] hook socket, sock: 9, non_block: false
D0104 09:15:58.877 2414 hook.cc:209] hook socket, sock: 10, non_block: false
D0104 09:15:58.877 2414 hook.cc:471] hook connect, fd: 10, r: 0
D0104 09:15:58.877 2413 hook.cc:471] hook connect, fd: 9, r: 0
D0104 09:15:59.890 2412 hook.cc:906] hook nanosleep, ms: 1000, r: 0

When linked against the shared library, it prints:

start to wait
connected
connected
// blocks here
^C

idealvin commented 9 months ago

@eVen-p

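Sorry for the late reply, I've been busy lately.

It looks like the symbols are not being exported from the shared library. Try changing the definition here to the following:

#define _hook(f) __coapi f

eVen-p commented 9 months ago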

I tried it with the shared library; the output is:

[root@localhost build]# ../out/test -co_hook_log -cout
start to wait
connected
connected
D0105 09:28:17.811 3393 hook.cc:209] hook socket, sock: 9, non_block: false
D0105 09:28:17.811 3392 hook.cc:209] hook socket, sock: 10, non_block: false
^C

The socket hook log now shows up, but none of the later logs do, and the program still blocks.

I also checked the symbol table:

[root@localhost out]# nm test | grep socket
                 U socket
[root@localhost out]# nm test | grep connect
                 U connect@@GLIBC_2.2.5
[root@localhost out]# nm test | grep recv
                 U recv@@GLIBC_2.2.5

idealvin commented 9 months ago

connect and recv are bound to the system versions, so they were not hooked.

Check libco.so's symbol table to see whether it contains connect...
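
As a complement to nm, which only shows link-time information, the following sketch asks the dynamic linker at run time which loaded object supplies a symbol. It assumes Linux with glibc; provider_of is a made-up helper name, and the result is a rough check rather than an exact replay of how each call site was bound.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for RTLD_DEFAULT and dladdr on glibc
#endif
#include <dlfcn.h>
#include <string>

// Hypothetical helper: returns the path of the object whose definition of
// `symbol` is found first in the default lookup order, or "" if the symbol
// or its owner cannot be determined.
std::string provider_of(const char* symbol) {
    // RTLD_DEFAULT searches the loaded objects in the default order.
    void* addr = dlsym(RTLD_DEFAULT, symbol);
    if (!addr) return "";

    // dladdr maps the address back to the object that contains it.
    Dl_info info;
    if (dladdr(addr, &info) == 0 || !info.dli_fname) return "";
    return info.dli_fname;
}
```

Printing provider_of("connect") and provider_of("socket") from the demo's main would show whether each name comes from libco.so or from a system library. On glibc older than 2.34, link with -ldl.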

eVen-p commented 9 months ago

For the shared library:

[root@localhost lib]# nm libco.so | grep connect
0000000000025a50 T connect
00000000002a1dc0 b _sys_connect
0000000000041990 T _ZN2co7connectEiPKvii
000000000006a0e0 t _ZN3rpc10ClientImpl7connectEv
0000000000066c40 t _ZN3rpc10ServerImpl13on_connectionEN3tcp10ConnectionE
0000000000075e40 T _ZN3ssl7connectEPvi
0000000000074290 t _ZN3tcp10ServerImpl17on_ssl_connectionEi
0000000000074130 t _ZN3tcp10ServerImpl17on_tcp_connectionEi
00000000000727f0 T _ZN3tcp6Client10disconnectEv
0000000000072860 T _ZN3tcp6Client7connectEi
00000000000717e0 T _ZN3tcp6Server13on_connectionEOSt8functionIFvNS_10ConnectionEEE
0000000000079840 t _ZN4http10ServerImpl13on_connectionEN3tcp10ConnectionE
[root@localhost lib]# 
[root@localhost lib]# nm libco.so | grep recv
00000000002a2130 b FLG_http_recv_timeout
00000000002a20fc b FLG_rpc_recv_timeout
000000000002a8f0 T recv
000000000002ac80 T recvfrom
000000000002b040 T recvmsg
00000000002a1da0 b _sys_recv
00000000002a1d98 b _sys_recvfrom
00000000002a1d90 b _sys_recvmsg
0000000000041df0 T _ZN2co4recvEiPvii
00000000000421d0 T _ZN2co5recvnEiPvii
00000000000422d0 T _ZN2co8recvfromEiPviS0_Pii
0000000000075e50 T _ZN3ssl4recvEPvS0_ii
0000000000075e60 T _ZN3ssl5recvnEPvS0_ii
000000000006c0e0 T _ZN3tcp10Connection4recvEPvii
000000000006c0f0 T _ZN3tcp10Connection5recvnEPvii
0000000000071a60 T _ZN3tcp6Client4recvEPvii
0000000000071a80 T _ZN3tcp6Client5recvnEPvii
0000000000075140 t _ZN3tcp7SSLConn4recvEPvii
0000000000075150 t _ZN3tcp7SSLConn5recvnEPvii
0000000000075070 t _ZN3tcp7TcpConn4recvEPvii
0000000000075080 t _ZN3tcp7TcpConn5recvnEPvii
[root@localhost lib]# 
[root@localhost lib]# nm libco.so | grep socket
0000000000022600 T socket
0000000000022d90 T socketpair
00000000002a1e20 b _sys_socket
00000000002a1e18 b _sys_socketpair
000000000003ff00 T _ZN2co6socketEiii
0000000000075170 t _ZN3tcp7SSLConn6socketEv
0000000000074ff0 t _ZN3tcp7TcpConn6socketEv
000000000006c0d0 T _ZNK3tcp10Connection6socketEv

And the static library's symbol table:

[root@localhost lib]# nm libco.a | grep connect
0000000000003390 T connect
00000000000000c0 B _sys_connect
                 U _ZN2co7connectEiPKvii
                 U _sys_connect
0000000000001ae0 T _ZN2co7connectEiPKvii
0000000000003d10 T _ZN3rpc10ClientImpl7connectEv
0000000000000760 T _ZN3rpc10ServerImpl13on_connectionEN3tcp10ConnectionE
                 U _ZN3tcp6Client10disconnectEv
                 U _ZN3tcp6Client7connectEi
                 U _ZN3tcp6Server13on_connectionEOSt8functionIFvNS_10ConnectionEEE
                 U _ZN2co7connectEiPKvii
                 U _ZN3ssl7connectEPvi
0000000000008050 T _ZN3tcp10ServerImpl17on_ssl_connectionEi
0000000000007ef0 T _ZN3tcp10ServerImpl17on_tcp_connectionEi
00000000000065a0 T _ZN3tcp6Client10disconnectEv
0000000000006610 T _ZN3tcp6Client7connectEi
0000000000005590 T _ZN3tcp6Server13on_connectionEOSt8functionIFvNS_10ConnectionEEE
00000000000006a0 T _ZN3ssl7connectEPvi
                 U _ZN3tcp6Server13on_connectionEOSt8functionIFvNS_10ConnectionEEE
0000000000003870 T _ZN4http10ServerImpl13on_connectionEN3tcp10ConnectionE
[root@localhost lib]# 
[root@localhost lib]# nm libco.a | grep recv
0000000000008000 T recv
0000000000008380 T recvfrom
0000000000008730 T recvmsg
00000000000000a0 B _sys_recv
0000000000000098 B _sys_recvfrom
0000000000000090 B _sys_recvmsg
                 U _sys_recv
                 U _sys_recvfrom
0000000000001f40 T _ZN2co4recvEiPvii
0000000000002320 T _ZN2co5recvnEiPvii
00000000000023f0 T _ZN2co8recvfromEiPviS0_Pii
0000000000000014 B FLG_rpc_recv_timeout
                 U _ZN3tcp10Connection4recvEPvii
                 U _ZN3tcp10Connection5recvnEPvii
                 U _ZN3tcp6Client5recvnEPvii
                 U _ZN2co4recvEiPvii
                 U _ZN2co5recvnEiPvii
                 U _ZN3ssl4recvEPvS0_ii
                 U _ZN3ssl5recvnEPvS0_ii
0000000000000070 T _ZN3tcp10Connection4recvEPvii
0000000000000080 T _ZN3tcp10Connection5recvnEPvii
0000000000005810 T _ZN3tcp6Client4recvEPvii
0000000000005830 T _ZN3tcp6Client5recvnEPvii
0000000000000000 W _ZN3tcp7SSLConn4recvEPvii
0000000000000000 W _ZN3tcp7SSLConn5recvnEPvii
0000000000000000 W _ZN3tcp7TcpConn4recvEPvii
0000000000000000 W _ZN3tcp7TcpConn5recvnEPvii
00000000000006b0 T _ZN3ssl4recvEPvS0_ii
00000000000006c0 T _ZN3ssl5recvnEPvS0_ii
0000000000000010 B FLG_http_recv_timeout
                 U _ZN3tcp10Connection4recvEPvii
                 U _ZN3tcp10Connection5recvnEPvii
[root@localhost lib]# 
[root@localhost lib]# nm libco.a | grep socket
0000000000000000 T socket
00000000000007c0 T socketpair
0000000000000120 B _sys_socket
0000000000000118 B _sys_socketpair
                 U _sys_socket
0000000000000080 T _ZN2co6socketEiii
                 U _ZNK3tcp10Connection6socketEv
                 U _ZN2co6socketEiii
0000000000000000 W _ZN3tcp7SSLConn6socketEv
0000000000000000 W _ZN3tcp7TcpConn6socketEv
0000000000000060 T _ZNK3tcp10Connection6socketEv
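                 U _ZNK3tcp10Connection6socketEv

idealvin commented 8 months ago

libco.so already exports functions like connect and recv, so this is a linking problem. Try building with xmake; xmake -v prints the detailed build flags so you can check the link order...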
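
Besides reading the link command, the order can also be checked from inside the running program. The sketch below lists the loaded objects via dl_iterate_phdr. It assumes Linux with glibc, and loaded_objects is an invented helper name. For libraries loaded at startup, the reported order normally mirrors the symbol search order, so libco.so would need to appear before any system library that defines the same names. One unconfirmed guess worth checking this way: on glibc releases before 2.34, libpthread.so.0 also exports connect and recv but not socket, which would be consistent with only the socket hook log appearing if libpthread is searched ahead of libco.so.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for dl_iterate_phdr on glibc
#endif
#include <link.h>
#include <cstddef>
#include <string>
#include <vector>

// Callback for dl_iterate_phdr: appends each loaded object's name.
static int collect_name(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    // The main program is reported first, usually with an empty name.
    names->push_back(info->dlpi_name ? info->dlpi_name : "");
    return 0;  // returning 0 continues the iteration
}

// Hypothetical helper: names of the loaded objects, in the order the
// dynamic loader reports them.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_name, &names);
    return names;
}
```

Printing the returned list at the top of the demo's main gives a quick view of where libco.so sits relative to libpthread.so.0 and libc.so.6.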