microsoft / demikernel

Kernel-Bypass LibOS Architecture
https://aka.ms/demikernel
MIT License

[inetstack] ARP Should Support Concurrent Requests for Resolution of Same IPv4 Address #385

Open ArchangelSDY opened 1 year ago

ArchangelSDY commented 1 year ago

Description

Demikernel panics when two connections to the same host are opened concurrently.

How to Reproduce

  1. Machine A (10.180.0.10) runs an ordinary Nginx server on port 80.
  2. Machine B (10.180.0.5) runs the following client code, which opens two concurrent connections to 10.180.0.10:80:
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <demi/libos.h>
#include <demi/sga.h>
#include <demi/wait.h>
#include <string.h>
#include <errno.h>
#include <stdint.h> /* uint32_t */
#include <time.h>   /* struct timespec */

#include <arpa/inet.h>
#include <sys/socket.h>

#define MAX_QTS 10240
static demi_qtoken_t    *qts;
static uint32_t        nqts;

void build_sockaddr(const char *const ip_str, const char *const port_str, struct sockaddr_in *const addr)
{
    int port = -1;

    sscanf(port_str, "%d", &port);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);
    assert(inet_pton(AF_INET, ip_str, &addr->sin_addr) == 1);
}

void add_event(demi_qtoken_t qt)
{
    assert(nqts < MAX_QTS);
    qts[nqts] = qt;
    nqts++;
}

/* Swap-remove: overwrite slot i with the last pending token and shrink the array. */
void remove_event(int i)
{
    assert(nqts > 0);
    nqts--;
    qts[i] = qts[nqts];
}

void do_connect(const struct sockaddr_in *saddr)
{
    int sockqd;
    demi_qtoken_t qt;
    assert(demi_socket(&sockqd, AF_INET, SOCK_STREAM, 0) == 0);
    assert(demi_connect(&qt, sockqd, (const struct sockaddr *)saddr, sizeof(struct sockaddr_in)) == 0);
    add_event(qt);
}

int main(int argc, char *const argv[])
{
    struct sockaddr_in saddr = {0};
    demi_qresult_t qr = {0};
    int err = 0;
    int offset = 0;
    struct timespec timeout = {
        .tv_sec = 1,
        .tv_nsec = 0
    };

    assert(demi_init(argc, argv) == 0);

    qts = malloc(sizeof(demi_qtoken_t) * MAX_QTS);
    if (!qts) {
        printf("fail to alloc qts\n");
        return 1;
    }
    nqts = 0;

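    /* Expect the server IP and port as the two command-line arguments. */
    if (argc < 3) {
        printf("usage: %s <ip> <port>\n", argv[0]);
        return 1;
    }
    build_sockaddr(argv[1], argv[2], &saddr);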

    do_connect(&saddr);
    do_connect(&saddr);

    while (1) {
        err = demi_wait_any(&qr, &offset, qts, nqts, &timeout);

        if (err) {
            assert(err == ETIMEDOUT);
            continue;
        }

        remove_event(offset);

        switch (qr.qr_opcode) {
        case DEMI_OPC_CONNECT:
            printf("connected\n");
            break;

        default:
            printf("unexpected opcode: %d\n", qr.qr_opcode);
        }
    }

    return 0;
}

Command to run:

sudo CONFIG_PATH=/home/azureuser/config.yaml LIBOS=catnip MTU=1500 MSS=1500 RUST_LOG=trace RUST_BACKTRACE=1 bin/multi-connect 10.180.0.10 80

Got panic:

TRACE [demikernel::demikernel::bindings] demi_init()                                                                                                                       
EAL: Detected CPU lcores: 8                                                                                                                                                
EAL: Detected NUMA nodes: 1                                                                                                                                                
EAL: Auto-detected process type: PRIMARY                                                                                                                                   
EAL: Detected shared linkage of DPDK                                                                                                                                       
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket                                                                                                                      
EAL: Selected IOVA mode 'PA'                                                                                                                                               
EAL: VFIO support initialized                                                                                                                                              
EAL: Probe PCI driver: mlx5_pci (15b3:1016) device: ee41:00:02.0 (socket 0)                                                                                                
mlx5_common: Failed to allocate DevX UAR (BF/NC)                                                                                                                           
mlx5_common: Failed to allocate UAR.                                                                                                                                       
mlx5_net: Failed to prepare Tx DevX UAR.                                                                                                                                   
mlx5_net: probe of PCI device ee41:00:02.0 aborted after encountering an error: Operation not permitted                                                                    
mlx5_common: Failed to load driver mlx5_eth                                                                                                                                
EAL: Requested device ee41:00:02.0 cannot be used                               
EAL: Bus (pci) probe failed.                                                         
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3) 
EAL: Probe PCI driver: mlx5_pci (15b3:1016) device: ee41:00:02.0 (socket 0)
mlx5_common: Failed to allocate DevX UAR (BF/NC)                                                                                                                           
mlx5_common: Failed to allocate UAR.                                                                                                                                       
mlx5_net: Failed to prepare Tx DevX UAR.                                                                                                                                   
mlx5_net: probe of PCI device ee41:00:02.0 aborted after encountering an error: Operation not permitted                                                                    
mlx5_common: Failed to load driver mlx5_eth                                                                                                                                
EAL: Driver cannot attach the device (ee41:00:02.0)                                                                                                                        
EAL: Failed to attach device on primary process                                                                                                                            
net_failsafe: sub_device 0 probe failed (No such file or directory)                                                                                                        
tap_nl_dump_ext_ack(): Cannot delete qdisc with handle of zero                                                                                                             
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid                                                                                                         
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid                                                                                                         
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid                                                                                                         
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid                                                                                                         
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid                                                                                                         
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid                                                                                                         
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid              
tap_nl_dump_ext_ack(): Failed to find qdisc with specified classid                   
TELEMETRY: No legacy callbacks, legacy socket not created                            
DPDK reports that 1 ports (interfaces) are available.                                
dev_info: rte_eth_dev_info { device: 0x564bec143120, driver_name: 0x7f5a5a2ca5ef, if_index: 0, min_mtu: 68, max_mtu: 65535, dev_flags: 0x1003a8934, min_rx_bufsize: 0, max_rx_pktlen: 1522, max_lro_pkt_size: 0, max_rx_queues: 16, max_tx_queues: 16, max_mac_addrs: 1, max_hash_mac_addrs: 0, max_vfs: 0, max_vmdq_pools: 0, rx_seg_capa: rte_eth_rxseg_capa { _bitfield_align_1: [], _bitfield_1: __BindgenBitfieldUnit { storage: [0] }, max_nseg: 0, reserved: 0 }, rx_offload_capa: 8206, tx_offload_capa: 32814, rx_queue_offload_capa: 8206, tx_queue_offload_capa: 0, reta_size: 0, hash_key_size: 40, flow_type_rss_offloads: 241596, default_rxconf: rte_eth_rxconf { rx_thresh: rte_eth_thresh { pthresh: 0, hthresh: 0, wthresh: 0 }, rx_free_thresh: 0, rx_drop_en: 0, rx_deferred_start: 0, rx_nseg: 0, share_group: 0, share_qid: 0, offloads: 0, rx_seg: 0x0, rx_mempools: 0x0, rx_nmempool: 0, reserved_64s: [0, 0], reserved_ptrs: [0x0, 0x0] }, default_txconf: rte_eth_txconf { tx_thresh: rte_eth_thresh { pthresh: 0, hthresh: 0, wthresh: 0 }, tx_rs_thresh: 0, tx_free_thresh: 0, tx_deferred_start: 0, offloads: 0, reserved_64s: [0, 0], reserved_ptrs: [0x0, 0x0] }, vmdq_queue_base: 0, vmdq_queue_num: 0, vmdq_pool_base: 0, rx_desc_lim: rte_eth_desc_lim { nb_max: 65535, nb_min: 0, nb_align: 1, nb_seg_max: 65535, nb_mtu_seg_max: 65535 }, tx_desc_lim: rte_eth_desc_lim { nb_max: 65535, nb_min: 0, nb_align: 1, nb_seg_max: 65535, nb_mtu_seg_max: 65535 }, speed_capa: 0, nb_rx_queues: 0, nb_tx_queues: 0, max_rx_mempools: 0, default_rxportconf: rte_eth_dev_portconf { burst_size: 0, ring_size: 0, nb_queues: 0 }, default_txportconf: rte_eth_dev_portconf { burst_size: 0, ring_size: 0, nb_queues: 0 }, dev_capa: 3, switch_info: rte_eth_switch_info { name: 0x0, domain_id: 65535, port_id: 0, rx_domain: 0 }, err_handle_mode: 0, reserved_64s: [0, 0], reserved_ptrs: [0x0, 0x0] }
Port 0 Link Up - speed 10000 Mbps - full duplex 
TRACE [demikernel::demikernel::bindings] demi_socket()                                                                                                                     
TRACE [demikernel::inetstack] socket(): domain=2 type=1 protocol=0                                                                                                         
TRACE [demikernel::demikernel::bindings] demi_connect()                                                                                                                    
TRACE [demikernel::inetstack] connect(): qd=QDesc(1000000) remote=10.180.0.10:80
TRACE [demikernel::inetstack] connect() qt=QToken(4)                                 
TRACE [demikernel::demikernel::bindings] demi_socket()                               
TRACE [demikernel::inetstack] socket(): domain=2 type=1 protocol=0                   
TRACE [demikernel::demikernel::bindings] demi_connect()                                                                                                                    
TRACE [demikernel::inetstack] connect(): qd=QDesc(1000001) remote=10.180.0.10:80                                                                                           
TRACE [demikernel::inetstack] connect() qt=QToken(6)                                                                                                                       
TRACE [demikernel::demikernel::bindings] demi_wait_any() 0x7ffe696d9710 0x7ffe696d96ec 0x564bec1a9aa0 2 0x7ffe696d96f0                                                     
TRACE [demikernel::demikernel::libos] wait_any(): qts=[QToken(4), QToken(6)], timeout=Some(1s)
WARN [demikernel::catnip::runtime::memory::mempool] allocating mbuf from DPDK pool                                                                                         
thread '<unnamed>' panicked at 'Duplicate waiter for 10.180.0.10', src/rust/inetstack/protocols/arp/peer.rs:143:13                                                         
stack backtrace:                                                                                                                                                           
   0: rust_begin_unwind                                                                                                                                                    
             at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:575:5                                                                         
   1: core::panicking::panic_fmt                                                                                                                                           
             at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/panicking.rs:65:14                                                                        
   2: demikernel::inetstack::protocols::arp::peer::ArpPeer::do_wait_link_addr                                                                                              
   3: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                                                                                   
   4: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                                                                                   
   5: demikernel::scheduler::scheduler::Scheduler::poll                         
   6: demikernel::inetstack::InetStack::poll_bg_work                                 
   7: demikernel::demikernel::libos::LibOS::wait_any                                 
   8: demi_wait_any                                                                  
   9: main                                                                                                                                                                 
  10: __libc_start_main                                                                                                                                                    
  11: _start                                                                                                                                                               
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.  

Expected Behavior

No panic.

Related Issues

This is essentially the same as Issue #242.

BrianZill commented 1 year ago

The bug here is that the current ARP code doesn't support multiple concurrent requests to resolve the same IPv4 address.

We should fix this when we fix how the ARP code is structured.
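To illustrate the behavior being asked for, here is a minimal sketch in C of a pending-resolution table that keeps a list of waiters per IPv4 address, so a second request for the same address joins the in-flight resolution instead of triggering a "Duplicate waiter" panic. This is not Demikernel's implementation (the real ARP code is Rust, in src/rust/inetstack/protocols/arp/peer.rs); every name here (`pending_arp`, `arp_wait`, `arp_resolved`, the size limits, the integer waiter ids) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16 /* hypothetical cap on addresses being resolved at once */
#define MAX_WAITERS 8  /* hypothetical cap on waiters per address */

/* One in-flight ARP resolution and everyone waiting on it. */
struct pending_arp {
    bool in_use;
    uint32_t ipv4;            /* address being resolved */
    int waiters[MAX_WAITERS]; /* hypothetical waiter ids, e.g. task handles */
    size_t nwaiters;
};

static struct pending_arp pending[MAX_PENDING];

/* Look up the in-flight resolution for ipv4, or NULL if there is none. */
static struct pending_arp *find_pending(uint32_t ipv4)
{
    for (size_t i = 0; i < MAX_PENDING; i++)
        if (pending[i].in_use && pending[i].ipv4 == ipv4)
            return &pending[i];
    return NULL;
}

/*
 * Register a waiter for ipv4.
 * Returns 1 if this is the first waiter (the caller should send the ARP
 * request), 0 if a request is already in flight and the waiter joined it,
 * or -1 if there is no room left.
 */
int arp_wait(uint32_t ipv4, int waiter_id)
{
    struct pending_arp *p = find_pending(ipv4);

    if (p != NULL) {
        /* Duplicate request for the same address: append, do not fail. */
        if (p->nwaiters == MAX_WAITERS)
            return -1;
        p->waiters[p->nwaiters++] = waiter_id;
        return 0;
    }

    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!pending[i].in_use) {
            pending[i].in_use = true;
            pending[i].ipv4 = ipv4;
            pending[i].waiters[0] = waiter_id;
            pending[i].nwaiters = 1;
            return 1;
        }
    }
    return -1;
}

/*
 * Handle an ARP reply for ipv4: copy up to cap waiter ids into out, release
 * the entry, and return how many ids were copied. Pass cap >= MAX_WAITERS
 * so that no waiter is dropped.
 */
size_t arp_resolved(uint32_t ipv4, int *out, size_t cap)
{
    struct pending_arp *p = find_pending(ipv4);

    assert(out != NULL || cap == 0);
    if (p == NULL)
        return 0;

    size_t n = p->nwaiters < cap ? p->nwaiters : cap;
    for (size_t i = 0; i < n; i++)
        out[i] = p->waiters[i];
    p->in_use = false;
    p->nwaiters = 0;
    return n;
}
```

Mapped onto the reproduction above, the first `demi_connect()` to 10.180.0.10 would correspond to an `arp_wait()` call that returns 1 and sends the request, the second to a call that returns 0, and a single ARP reply would then wake both waiters through `arp_resolved()`.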

ppenna commented 1 year ago

@BrianZill is this a duplicate? https://github.com/demikernel/demikernel/issues/242

BrianZill commented 1 year ago

> @BrianZill is this a duplicate? #242

Yes, it is. This Issue has more information about the problem, however, so I think we should just leave them linked together, unless GitHub has some better way to include this information into the existing Issue.