iqiyi / dpvs

DPVS is a high performance Layer-4 load balancer based on DPDK.

FNAT with two-arm is not working #986

Open emusal opened 1 month ago

emusal commented 1 month ago

I am currently testing my software-based L4 load balancer using DPVS v1.9.6 in Full-NAT with a two-arm configuration. The following configuration was created based on your tutorial documentation.

### add VIP to WAN interface and LIP (with sapool) to LAN interface
./dpip addr add 192.168.250.201/24 dev dpdk0
./dpip addr add 172.30.1.100/24 dev dpdk1 sapool

## route for WAN/LAN access
## add routes for other networks or a default route if needed.
./dpip route add 172.30.1.0/24 dev dpdk1
./dpip route add 192.168.250.0/24 dev dpdk0

# add service <VIP:vport> to forwarding, scheduling mode is RR.
# use ipvsadm --help for more info.
./ipvsadm -A -u 192.168.250.201:67 -s rr --ops

# add six RS for the service, forwarding mode is FNAT (-b)
./ipvsadm -a -u 192.168.250.201:67 -r 172.30.1.11:67 -b
./ipvsadm -a -u 192.168.250.201:67 -r 172.30.1.12:67 -b
./ipvsadm -a -u 192.168.250.201:67 -r 172.30.1.13:67 -b
./ipvsadm -a -u 192.168.250.201:67 -r 172.30.1.14:67 -b
./ipvsadm -a -u 192.168.250.201:67 -r 172.30.1.15:67 -b
./ipvsadm -a -u 192.168.250.201:67 -r 172.30.1.16:67 -b

## add at least one Local-IP (LIP) for FNAT on LAN interface
./ipvsadm --add-laddr -z 172.30.1.100 -u 192.168.250.201:67 -F dpdk1
# ./ipvsadm -Ln
IP Virtual Server version 1.9.6 (size=0)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  192.168.250.201:67 rr ops
  -> 172.30.1.11:67               FullNat 1      0          0         
  -> 172.30.1.12:67               FullNat 1      0          0         
  -> 172.30.1.13:67               FullNat 1      0          0         
  -> 172.30.1.14:67               FullNat 1      0          0         
  -> 172.30.1.15:67               FullNat 1      0          0         
  -> 172.30.1.16:67               FullNat 1      0          0         
# ./ipvsadm -G
VIP:VPORT            TOTAL    SNAT_IP              CONFLICTS  CONNS     
192.168.250.201:67   1        
                              172.30.1.100         0          0     

After applying this configuration, ping tests to 192.168.250.201 and 172.30.1.100 are successful. However, when I send packets from the client to the load balancer, it receives the packets but does not forward them to the backend side.

In another test, I switched the forwarding mode to Masquerade using the -m option of ipvsadm. Masquerade mode forwards traffic correctly, but, as expected, the source address is not translated to the load balancer's local IP address.

The FNAT mode is crucial for my setup.

Could you please advise on how to properly configure the FNAT mode with a two-arm setup?

ywc689 commented 1 month ago

I didn't find any problems in your configuration. You can analyze the problem further with the following suggestions.

  1. check the network connection between the dpvs dpdk1 port and the backends
  2. use ipvsadm -ln --stats to check whether any connections were ever created
  3. disable one-packet scheduling (ops) and try again (a sketch of this is given below)
  4. test tcp forwarding instead
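
For suggestion 3, a minimal sketch of disabling OPS on the existing UDP service, assuming the standard ipvsadm delete/re-add semantics (adjust addresses and ports to your own setup):

# remove the OPS-enabled service and re-create it without --ops
./ipvsadm -D -u 192.168.250.201:67
./ipvsadm -A -u 192.168.250.201:67 -s rr
# re-add the RSs (-b for FNAT) and the LIP if they were removed along with the service
./ipvsadm -a -u 192.168.250.201:67 -r 172.30.1.11:67 -b
./ipvsadm --add-laddr -z 172.30.1.100 -u 192.168.250.201:67 -F dpdk1
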
emusal commented 1 month ago

@ywc689 Thank you for your helpful advice. I have checked each of your suggestions, and the results are as follows.

  1. Checking network connections
    • I think the network connection between the DPVS dpdk1 port and the backends is fine: as shown in the test log below, when I send packets in Masquerade mode, they are successfully forwarded to the backends via the dpdk1 port. (A DPVS-side neighbor-table check is also sketched after the commands below.)
      
      Every 2.0s: ./ipvsadm -Ln --stats                                                 DPVSLB1A: Wed Jul 31 00:44:20 2024

IP Virtual Server version 1.9.6 (size=0)
Prot LocalAddress:Port                 Conns   InPkts  OutPkts   InBytes  OutBytes
  -> RemoteAddress:Port
UDP  172.30.1.4:67                   4858873  4858873        0     1409M         0
  -> 172.30.1.11:67                   485887   485887        0   140907K         0
  -> 172.30.1.12:67                   485887   485887        0   140907K         0
  -> 172.30.1.13:67                   485887   485887        0   140907K         0
  -> 172.30.1.14:67                   485887   485887        0   140907K         0
  -> 172.30.1.15:67                   485887   485887        0   140907K         0
  -> 172.30.1.16:67                   485887   485887        0   140907K         0
  -> 172.30.1.17:67                   485888   485888        0   140907K         0
  -> 172.30.1.18:67                   485888   485888        0   140907K         0
  -> 172.30.1.19:67                   485888   485888        0   140907K         0
  -> 172.30.1.20:67                   485887   485887        0   140907K         0

# use ipvsadm --help for more info.
./ipvsadm -A -u 172.30.1.4:67 -s rr --ops

# add ten RS for the service, forwarding mode is Masq (-m)
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.11:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.12:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.13:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.14:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.15:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.16:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.17:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.18:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.19:67 -m
./ipvsadm -a -u 172.30.1.4:67 -r 172.30.1.20:67 -m
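
As an additional check from the DPVS side, a sketch of listing the neighbor (ARP) table on dpdk1 to confirm the backends' MAC addresses were resolved; this assumes the dpip neigh object is available in this build:

# list neighbor entries learned on the LAN port
./dpip neigh show dev dpdk1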

2. Checking statistics 
 - The statistics are not increasing while I am sending packets to the load balancer.

root@DPVSLB1A:~/dpvs/bin# ./ipvsadm -Ln --stats
IP Virtual Server version 1.9.6 (size=0)
Prot LocalAddress:Port                 Conns   InPkts  OutPkts   InBytes  OutBytes
  -> RemoteAddress:Port
UDP  172.30.1.4:67                         0        0        0         0         0
  -> 172.30.1.11:67                        0        0        0         0         0
  -> 172.30.1.12:67                        0        0        0         0         0
  -> 172.30.1.13:67                        0        0        0         0         0
  -> 172.30.1.14:67                        0        0        0         0         0
  -> 172.30.1.15:67                        0        0        0         0         0
  -> 172.30.1.16:67                        0        0        0         0         0
  -> 172.30.1.17:67                        0        0        0         0         0
  -> 172.30.1.18:67                        0        0        0         0         0
  -> 172.30.1.19:67                        0        0        0         0         0
  -> 172.30.1.20:67                        0        0        0         0         0
root@DPVSLB1A:~/dpvs/bin# ./ipvsadm -Ln
IP Virtual Server version 1.9.6 (size=0)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  172.30.1.4:67 rr ops
  -> 172.30.1.11:67               FullNat 1      0          0
  -> 172.30.1.12:67               FullNat 1      0          0
  -> 172.30.1.13:67               FullNat 1      0          0
  -> 172.30.1.14:67               FullNat 1      0          0
  -> 172.30.1.15:67               FullNat 1      0          0
  -> 172.30.1.16:67               FullNat 1      0          0
  -> 172.30.1.17:67               FullNat 1      0          0
  -> 172.30.1.18:67               FullNat 1      0          0
  -> 172.30.1.19:67               FullNat 1      0          0
  -> 172.30.1.20:67               FullNat 1      0          0

3. Without OPS
 - The result is the same with OPS disabled.

root@DPVSLB1A:~/dpvs/bin# ./ipvsadm -Ln
IP Virtual Server version 1.9.6 (size=0)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  172.30.1.4:67 rr
  -> 172.30.1.11:67               FullNat 1      0          0
  -> 172.30.1.12:67               FullNat 1      0          0
  -> 172.30.1.13:67               FullNat 1      0          0
  -> 172.30.1.14:67               FullNat 1      0          0
  -> 172.30.1.15:67               FullNat 1      0          0
  -> 172.30.1.16:67               FullNat 1      0          0
  -> 172.30.1.17:67               FullNat 1      0          0
  -> 172.30.1.18:67               FullNat 1      0          0
  -> 172.30.1.19:67               FullNat 1      0          0
  -> 172.30.1.20:67               FullNat 1      0          0
root@DPVSLB1A:~/dpvs/bin# ./ipvsadm -Ln --stats
IP Virtual Server version 1.9.6 (size=0)
Prot LocalAddress:Port                 Conns   InPkts  OutPkts   InBytes  OutBytes
  -> RemoteAddress:Port
UDP  172.30.1.4:67                         0        0        0         0         0
  -> 172.30.1.11:67                        0        0        0         0         0
  -> 172.30.1.12:67                        0        0        0         0         0
  -> 172.30.1.13:67                        0        0        0         0         0
  -> 172.30.1.14:67                        0        0        0         0         0
  -> 172.30.1.15:67                        0        0        0         0         0
  -> 172.30.1.16:67                        0        0        0         0         0
  -> 172.30.1.17:67                        0        0        0         0         0
  -> 172.30.1.18:67                        0        0        0         0         0
  -> 172.30.1.19:67                        0        0        0         0         0
  -> 172.30.1.20:67                        0        0        0         0         0


4. Testing TCP forwarding
 - Sorry, I don't understand how to test TCP forwarding on the DPDK port. Could you please explain the testing procedure or provide additional information on how to perform the test?
emusal commented 1 month ago

Additionally, I have attached our dpvs.conf configuration below. Please check if there is anything wrong.

root@DPVSLB1A:~/dpvs/bin# cat /etc/dpvs.conf 
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! This is dpvs default configuration file.
!
! The attribute "<init>" denotes a configuration item for the initialization stage. Items of
! this type are configured once at startup and are not reloadable. If an invalid value is
! configured in the file, dpvs will use its default value.
!
! Note that dpvs configuration file supports the following comment type:
!   * line comment: using '#' or '!'
!   * inline range comment: using '<' and '>', put comment in between
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

! global config
global_defs {
    log_level   DEBUG
    ! log_file    /var/log/dpvs.log
    ! log_async_mode    on
    kni               on
}

! netif config
netif_defs {
    <init> pktpool_size     524287
    <init> pktpool_cache    256

    <init> device dpdk0 {
        rx {
            queue_number        1
            descriptor_number   1024
        !    rss                 none
        }
        tx {
            queue_number        1
            descriptor_number   1024
        }
        ! mtu                   1500
        ! promisc_mode
        ! allmulticast
        kni_name                dpdk0.kni
    }
    <init> device dpdk1 {
        rx {
            queue_number        1
            descriptor_number   1024
        !    rss                 none
        }
        tx {
            queue_number        1
            descriptor_number   1024
        }
        ! mtu                   1500
        ! promisc_mode
        ! allmulticast
        kni_name                dpdk1.kni
    }
}

! worker config (lcores)
worker_defs {
    <init> worker cpu0 {
        type    master
        cpu_id  0
    }

    <init> worker cpu1 {
        type    slave
        cpu_id  1
        port    dpdk0 {
            rx_queue_ids     0
            tx_queue_ids     0
            ! isol_rx_cpu_ids  9
            ! isol_rxq_ring_sz 1048576
        }
        port    dpdk1 {
            rx_queue_ids     0
            tx_queue_ids     0
            ! isol_rx_cpu_ids  10
            ! isol_rxq_ring_sz 1048576
        }
    }
}

! timer config
timer_defs {
    # cpu job loops to schedule dpdk timer management
    schedule_interval    500
}

! dpvs neighbor config
neigh_defs {
    <init> unres_queue_length  128
    timeout                    60
}

! dpvs ipset config
ipset_defs {
    <init> ipset_hash_pool_size 131072
}

! dpvs ipv4 config
ipv4_defs {
    forwarding                 off
    <init> default_ttl         64
    fragment {
        <init> bucket_number   4096
        <init> bucket_entries  16
        <init> max_entries     4096
        <init> ttl             1
    }
}

! dpvs ipv6 config
ipv6_defs {
    disable                     off
    forwarding                  off
    route6 {
        <init> method           hlist
        recycle_time            10
    }
}

! control plane config
ctrl_defs {
    lcore_msg {
        <init> ring_size                4096
        sync_msg_timeout_us             20000
        priority_level                  low
    }
}

! ipvs config
ipvs_defs {
    conn {
        <init> conn_pool_size       2097152
        <init> conn_pool_cache      256
        conn_init_timeout           3
        ! expire_quiescent_template
        ! fast_xmit_close
        ! <init> redirect           off
    }

    udp {
        ! defence_udp_drop
        uoa_mode        opp
        uoa_max_trail   3
        timeout {
            oneway      60
            normal      300
            last        3
        }
    }

    tcp {
        ! defence_tcp_drop
        timeout {
            none        2
            established 90
            syn_sent    3
            syn_recv    30
            fin_wait    7
            time_wait   7
            close       3
            close_wait  7
            last_ack    7
            listen      120
            synack      30
            last        2
        }
        synproxy {
            synack_options {
                mss             1452
                ttl             63
                sack
                ! wscale        0
                ! timestamp
            }
            close_client_window
            ! defer_rs_syn
            rs_syn_max_retry    3
            ack_storm_thresh    10
            max_ack_saved       3
            conn_reuse_state {
                close
                time_wait
                ! fin_wait
                ! close_wait
                ! last_ack
           }
        }
    }
}

! sa_pool config
sa_pool {
    pool_hash_size  16
    flow_enable     on
}
ywc689 commented 1 month ago

As shown in your statistics, no fullnat connections were ever created. I suggest the following checks and tests.

  1. Check whether the sapool on the local IP was configured successfully.

    dpip addr show -s | grep sa_free -B1

  2. Disable UOA. Please change the config udp/uoa_max_trail to 0 in dpvs.conf and try again (see the config sketch below). An alternative solution is to build and install the dpvs uoa kernel module on each fullnat backend. UOA data is inserted into the first uoa_max_trail packets of each connection to deliver the original client IP/port to the backend; the uoa kmod is responsible for parsing the inserted data and removing it from the packet. Issue #940 reported a similar problem.

  3. Try TCP forwarding (see the sketch below).

    • Set up a test http service on a backend server, for example with python's built-in http module: python3 -m http.server 8082.
    • Configure a tcp fullnat forwarding service with that http backend.
    • Test the service with an http client or a web browser, for example with curl: curl http://vip:vport/.
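
For the UOA change in item 2, a sketch of the udp section of dpvs.conf with only uoa_max_trail changed from the configuration posted above:

    udp {
        ! defence_udp_drop
        uoa_mode        opp
        ! set uoa_max_trail to 0 to disable UOA insertion for this test
        uoa_max_trail   0
        timeout {
            oneway      60
            normal      300
            last        3
        }
    }

For item 3, a minimal sketch of a TCP fullnat test: the VIP port 8080 here is an arbitrary choice for illustration, while the backend 172.30.1.11 and LIP 172.30.1.100 are taken from the configuration earlier in this issue.

# on backend 172.30.1.11, start a simple http server
python3 -m http.server 8082

# on dpvs, add a tcp fullnat service on the existing VIP
./ipvsadm -A -t 192.168.250.201:8080 -s rr
./ipvsadm -a -t 192.168.250.201:8080 -r 172.30.1.11:8082 -b
./ipvsadm --add-laddr -z 172.30.1.100 -t 192.168.250.201:8080 -F dpdk1

# from a client on the 192.168.250.0/24 side
curl http://192.168.250.201:8080/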