ofiwg / libfabric

Open Fabric Interfaces
http://libfabric.org/

prov/rxm: segfault in fi_rdm_stress server at prov/rxm/src/rxm_msg.c:314 #7993

Open ldorau opened 2 years ago

ldorau commented 2 years ago

Describe the bug

The server of fi_rdm_stress segfaults at prov/rxm/src/rxm_msg.c:314: https://github.com/ofiwg/libfabric/blob/main/prov/rxm/src/rxm_msg.c#L314

rxm_mr_msg_mr[i] = ((struct rxm_mr *) desc[i])->msg_mr;

for i == 0 because desc[i] == 0x0.

To Reproduce

Steps to reproduce the behavior:

1) Start the server:

$  ./fi_rdm_stress -p verbs -s 192.168.1.4

2) Start the client:

$  ./fi_rdm_stress -p verbs -u ../test_configs/rdm_stress/stress.json  192.168.1.4

Expected behavior

The server of fi_rdm_stress does not segfault, but runs correctly.

Output

$ cgdb --args ./fi_rdm_stress -p verbs -s 192.168.1.4
[...]
Thread 3 "fi_rdm_stress" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeae96700 (LWP 145200)]
0x00007ffff7adc13f in rxm_alloc_rndv_buf (rxm_ep=0x6c09a0, rxm_conn=0x7fff5c0045b8, context=0x7ffff7ef0010, count=1 '\001', iov=0x7fffeae95dc0, desc=0x7fffeae95d80, data_len=1000032, data=0,
 flags=16777216, tag=0, op=0 '\000', iface=FI_HMEM_SYSTEM, device=0, rndv_buf=0x7fffeae95d10) at prov/rxm/src/rxm_msg.c:314
(gdb) p i
$1 = 0
(gdb) p desc[i]
$2 = (void *) 0x0

Environment:

provider: verbs

                 MR local: MSG - 1, RxM - 1
                 Completions per progress: MSG - 1
                 Buffered min: 88
                 Min multi recv size: 16384
                 inject size: 256
                 Protocol limits: Eager: 16384, SAR: 131072

Debugging information

1) mr_mode is FI_MR_LOCAL: https://github.com/pmem/libfabric/blob/main/fabtests/functional/rdm_stress.c#L1253

opts.mr_mode = ... | FI_MR_LOCAL | ... ;

so: 2) rxm_ep->rdm_mr_local is true: https://github.com/pmem/libfabric/blob/main/prov/rxm/src/rxm_ep.c#L1235

rxm_ep->rdm_mr_local = ofi_mr_local(rxm_ep->rxm_info);

but: 3) desc[0] == NULL, because fi_send() is called with desc == NULL in handle_hello() https://github.com/ofiwg/libfabric/blob/main/fabtests/functional/rdm_stress.c#L1006

ret = fi_send(ep, &resp->hdr, sizeof(resp->hdr), NULL, addr, resp);
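One way to make this call consistent with FI_MR_LOCAL would be to register the response buffer and pass its descriptor. The following is an untested sketch, assuming a `domain` handle and a place to store the registration (struct rpc_resp already has an mr field) are available in scope; it uses the standard fi_mr_reg()/fi_mr_desc() calls:

```
/* Untested sketch: register the response buffer when FI_MR_LOCAL
 * is in effect, then hand its descriptor to fi_send(). */
struct fid_mr *mr = NULL;
void *desc = NULL;

if (mr_local_required) {
	ret = fi_mr_reg(domain, &resp->hdr, sizeof(resp->hdr),
			FI_SEND, 0, 0, 0, &mr, NULL);
	if (ret)
		return ret;
	desc = fi_mr_desc(mr);
}

ret = fi_send(ep, &resp->hdr, sizeof(resp->hdr), desc, addr, resp);
/* ... fi_close(&mr->fid) once the send completes ... */
```

This matches the direction shefty proposes further down in the thread.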

4) This causes a segfault (NULL pointer dereference) at https://github.com/pmem/libfabric/blob/main/prov/rxm/src/rxm_msg.c#L305-L314

NOTICE

Removing FI_MR_LOCAL from mr_mode at: https://github.com/pmem/libfabric/blob/main/fabtests/functional/rdm_stress.c#L1253 makes this bug disappear; instead, only an assertion fails in the client:

fi_rdm_stress: prov/util/src/util_mem_monitor.c:160: ofi_monitor_cleanup: Assertion `dlist_empty(&monitor->list)' failed.

Additional information

libfabric:145723:1663043419:ofi_rxm:verbs:core:ofi_check_ep_type():667<info> unsupported endpoint type
libfabric:145723:1663043419:ofi_rxm:verbs:core:ofi_check_ep_type():668<info> Supported: FI_EP_DGRAM
libfabric:145723:1663043419:ofi_rxm:verbs:core:ofi_check_ep_type():668<info> Requested: FI_EP_MSG
libfabric:145723:1663043419:ofi_rxm:core:core:ofi_layering_ok():1027<info> Provider ofi_rxm is excluded
libfabric:145723:1663043419:ofi_rxm:core:core:ofi_layering_ok():1038<info> Need core provider, skipping ofi_rxd
libfabric:145723:1663043419:ofi_rxm:core:core:ofi_layering_ok():1038<info> Need core provider, skipping ofi_mrail
libfabric:145723:1663043419::ofi_rxm:core:fi_param_get_():279<info> variable sar_limit=<not set>
libfabric:145723:1663043419::ofi_rxm:core:rxm_ep_settings_init():1270<info> Settings:
                 MR local: MSG - 1, RxM - 1
                 Completions per progress: MSG - 1
                 Buffered min: 88
                 Min multi recv size: 16384
                 inject size: 256
                 Protocol limits: Eager: 16384, SAR: 131072
libfabric:145723:1663043419::verbs:ep_ctrl:vrb_pep_listen():525<info> listening on: fi_sockaddr_in://192.168.1.4:9228
ldorau commented 2 years ago

@grom72 @osalyk @haichangsi @Patryk717

ldorau commented 2 years ago

Hi @shefty , could you give a hint as to how this should be fixed? For example, is FI_MR_LOCAL required in mr_mode at: https://github.com/pmem/libfabric/blob/main/fabtests/functional/rdm_stress.c#L1253 ?

ldorau commented 2 years ago

AFAIK, msg_mr can be created only by the .regv == vrb_mr_regv or .regattr == vrb_mr_regattr hooks, but neither of them is called in the rdm_stress test, so the usage of FI_MR_LOCAL seems suspicious to me (or this test simply lacks one of these calls).

shefty commented 1 year ago

I missed that I was tagged on this way back when.

The rdm_stress test is not coded to handle FI_MR_LOCAL correctly. At least one missing piece is in start_rpc(): after the resp buffer is allocated, the resp data needs to be registered if FI_MR_LOCAL is specified. The struct rpc_resp already has an mr field for this purpose, which is closed in complete_rpc().

I'd consider a set of changes along these lines:

static uint64_t rpc_resp_reg_flags[cmd_last] = {
    0,
    0,
    FI_SEND,
    0,
    FI_SEND,
    FI_READ,
    FI_WRITE,
};

static void start_rpc(...)
{
    ...
    resp = calloc(...)

    if (need FI_MR_LOCAL && rpc_resp_reg_flags[req->cmd])
        fi_mr_reg(...)
    ...
}