SymbioticLab / Infiniswap

Infiniswap enables unmodified applications to efficiently use disaggregated memory.

Failure with kernel 4.11.0-13-generic and MLNX OFED 4.1 #19

Open blakecaldwell opened 4 years ago

blakecaldwell commented 4 years ago

Hello, I'm trying to get Infiniswap working on kernel 4.11 to see whether improvements in the Linux block layer since 3.13 bring a performance benefit, but I'm getting the error messages below on the bd client. Are these the components the README refers to when it mentions 4.11 kernel support?

A nearly identical setup worked with 14.04 and kernel 4.4 (same OFED). The module loads fine and nbdxadm establishes the connection. However, mkswap /dev/infiniswap0 produces the log messages below, and when the swap device is used, all I/O goes to disk.

Client logs:

[  610.783911] rdma_resolve_addr - rdma_resolve_route successful
[  610.783916] IS_setup_qp: enabling unsafe global rkey
[  610.783983] created pd ffff8f4a5902e280
[  610.785580] created cq ffff8f4a578b6a00
[  610.786266] created qp ffff8f4a5853f000
[  610.786267] IS: IS_setup_buffers called on cb ffff8f425bb5e000
[  610.786267] IS: size of IS_rdma_info 584
[  610.786270] IS: cb->mem=1 
[  610.786271] IS: IS_setup_buffers, in cb->mem==DMA 
[  610.786292] IS: allocated & registered buffers...
[  610.809119] cma_event type 9 cma_id ffff8f4a585df000 (parent)
[  610.809121] ESTABLISHED
[  610.809126] rdma_connect successful
[  610.809188] IS: client receives unknown msg
[  610.809189] IS: recv wc error: -1

I've added debugging lines to confirm that the received message size is 584, but the type received is 0. You can see below that the daemon thinks the type sent is 4.

Daemon output:

listening on port 9400.
rdma_session_init, get free_mem 61
rdma_session_init, allocated mem 48
free_mem, is called, last 13 GB, weight: 0.700000, 0.300000
received connection request.
connection build
send_free_mem_size , 48
message size = 584
RDMA sending type 4
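
For anyone debugging a similar mismatch (daemon reports sending type 4, client reads type 0 from a 584-byte message), one generic check is to scan the raw receive buffer for the expected type value. That can help tell "the field sits at a different offset than the reader expects" apart from "the value is not in the buffer at all". The sketch below is only an illustration and does not diagnose this issue: the 4-byte little-endian encoding, the offsets, and the names build_message, decode_type and find_value are all assumptions, not Infiniswap's actual IS_rdma_info layout or API.

```python
import struct

MSG_SIZE = 584    # message size reported in both the client and daemon logs
TYPE_FMT = "<i"   # ASSUMPTION: type is a 4-byte little-endian signed int


def build_message(msg_type, type_offset=0):
    """Return a zero-filled MSG_SIZE buffer with msg_type written at type_offset.

    Hypothetical stand-in for the sender; the real message layout is not
    known from the logs.
    """
    buf = bytearray(MSG_SIZE)
    struct.pack_into(TYPE_FMT, buf, type_offset, msg_type)
    return bytes(buf)


def decode_type(buf, type_offset=0):
    """Read the type field from buf at the offset the reader assumes."""
    if len(buf) != MSG_SIZE:
        raise ValueError(f"expected {MSG_SIZE} bytes, got {len(buf)}")
    return struct.unpack_from(TYPE_FMT, buf, type_offset)[0]


def find_value(buf, value):
    """Return every 4-byte-aligned offset at which value appears in buf."""
    return [
        off
        for off in range(0, len(buf) - 3, 4)
        if struct.unpack_from(TYPE_FMT, buf, off)[0] == value
    ]


# Hypothetical scenario: the sender writes the type at offset 576,
# while the reader looks for it at offset 0.
sent = build_message(4, type_offset=576)
print(decode_type(sent, 0))   # the reader sees 0 at its assumed offset
print(find_value(sent, 4))    # the value 4 is present, but at offset 576
```

If a scan like this finds the expected value at an unexpected offset, the two sides may disagree on the struct layout. If it finds nothing, the buffer may be getting read before it is written, or a different buffer is being read.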

ATCP commented 3 years ago

I encountered the same problem. Have you found a solution?

Ke