shenango / caladan

Interference-aware CPU scheduling that enables performance isolation and high CPU utilization for datacenter servers
Apache License 2.0

dp_clients: failed to attach proc #4

Closed: Yi-ran closed this issue 3 years ago

Yi-ran commented 3 years ago

Hi Caladan developers,

I am running into some issues when trying to run Caladan.

Environment

I followed the instructions in Readme.txt. The Intel NIC is bound to the IGB UIO (igb_uio) module.

Run the synthetic application in Step 5 (Server):

/caladan$ sudo ./iokerneld ias nobw
CPU 03| <5> cpu: detected 8 cores, 1 nodes
CPU 03| <5> time: detected 3407 ticks / us
[  0.000604] CPU 06| <5> sched: CPU configuration...
    node 0: [0,4][1,5][2,6][3,7]
[  0.000618] CPU 06| <5> sched: dataplane on 4, control on 0
[  0.026331] CPU 06| <5> control: spawning control thread
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:00:1f.6 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:15b7 net_e1000_em
EAL: PCI device 0000:01:00.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:01:00.1 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:10fb net_ixgbe
EAL: WARNING! Base virtual address hint (0x1100805000 != 0x7f14f0e3b000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
[  3.138086] CPU 04| <5> dpdk: driver: net_ixgbe port 0 MAC: 00 1b 21 88 d2 94
[  3.138098] CPU 04| <5> main: core 4 running dataplane. [Ctrl+C to quit]
/caladan$ ./apps/synthetic/target/release/synthetic 192.168.2.2:5000 --config server.config --mode spawner-server
CPU 06| <5> cpu: detected 8 cores, 1 nodes
CPU 06| <5> time: detected 3407 ticks / us
[  0.000229] CPU 06| <5> loading configuration from 'server.config'
[  0.000256] CPU 06| <5> process pid: 542683
[  0.013920] CPU 06| <5> net: started network stack
[  0.013935] CPU 06| <5> net: using the following configuration:
[  0.013938] CPU 06| <5>   addr:    192.168.2.2
[  0.013942] CPU 06| <5>   netmask: 255.255.255.0
[  0.013945] CPU 06| <5>   gateway: 192.168.2.1
[  0.013949] CPU 06| <5>   mac: 0E:47:46:C3:F3:AD
[  0.014025] CPU 06| <5> thread: created thread 0
[  0.014070] CPU 06| <5> spawning 4 kthreads
[  0.014147] CPU 03| <5> thread: created thread 1
[  0.014163] CPU 02| <5> thread: created thread 2
[  0.014185] CPU 05| <5> thread: created thread 3
192.168.2.2:5000

Run the synthetic application in Step 5 (Client; iokerneld reports "dp_clients: failed to attach proc"):

~/caladan$ sudo ./iokerneld ias nobw
CPU 02| <5> cpu: detected 8 cores, 1 nodes
CPU 02| <5> time: detected 3408 ticks / us
[  0.000274] CPU 02| <5> sched: CPU configuration...
    node 0: [0,4][1,5][2,6][3,7]
[  0.000287] CPU 02| <5> sched: dataplane on 4, control on 0
[  0.025720] CPU 02| <5> control: spawning control thread
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:00:1f.6 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:15b7 net_e1000_em
EAL: PCI device 0000:01:00.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:01:00.1 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:10fb net_ixgbe
EAL: WARNING! Base virtual address hint (0x1100805000 != 0x7fe473b0a000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
[  1.980692] CPU 04| <5> dpdk: driver: net_ixgbe port 0 MAC: 00 1b 21 89 6e ac
[  1.980703] CPU 04| <5> main: core 4 running dataplane. [Ctrl+C to quit]
[258.711498] CPU 04| <2> dp_clients: failed to attach proc.
~/caladan$ ./apps/synthetic/target/release/synthetic 192.168.2.2:5000 --config client.config --mode runtime-client
CPU 03| <5> cpu: detected 8 cores, 1 nodes
CPU 03| <5> time: detected 3408 ticks / us
[  0.000281] CPU 03| <5> loading configuration from 'client.config'
[  0.000311] CPU 03| <5> process pid: 755087
[  0.004069] CPU 03| <5> net: started network stack
[  0.004085] CPU 03| <5> net: using the following configuration:
[  0.004088] CPU 03| <5>   addr:    192.168.2.3
[  0.004104] CPU 03| <5>   netmask: 255.255.255.0
[  0.004109] CPU 03| <5>   gateway: 192.168.2.1
[  0.004113] CPU 03| <5>   mac: BE:E6:DD:BF:9C:EA
[  0.004195] CPU 03| <5> thread: created thread 0
[  0.004240] CPU 03| <5> spawning 4 kthreads
[  0.004314] CPU 06| <5> thread: created thread 1
[  0.004333] CPU 07| <5> thread: created thread 2
[  0.004408] CPU 03| <5> thread: created thread 3

joshuafried commented 3 years ago

Hi Yiran - Can you share the contents of your client.config file? Thanks!

Yi-ran commented 3 years ago

client.config:

host_addr 192.168.2.3
host_netmask 255.255.255.0
host_gateway 192.168.2.1
runtime_kthreads 4
runtime_spinning_kthreads 4

joshuafried commented 3 years ago

Thanks! Try adding these two lines:

runtime_guaranteed_kthreads 4
runtime_priority lc

We will update the example configurations soon.
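
For reference, here is a sketch of the resulting client.config, assuming the two suggested lines are simply appended to the file posted earlier in this thread (the order of the lines is an assumption, not something confirmed here):

```
host_addr 192.168.2.3
host_netmask 255.255.255.0
host_gateway 192.168.2.1
runtime_kthreads 4
runtime_spinning_kthreads 4
runtime_guaranteed_kthreads 4
runtime_priority lc
```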

Yi-ran commented 3 years ago

Thanks. I added these two lines and the error no longer appears. By the way, is the client output correct? I'm not clear on what the output means. Does "zero" correspond to Distribution, "1000" to Target, and so on?

~/caladan$ ./apps/synthetic/target/release/synthetic 192.168.2.2:5000 --config client.config --mode runtime-client
CPU 01| <5> cpu: detected 8 cores, 1 nodes
CPU 01| <5> time: detected 3408 ticks / us
[  0.000237] CPU 01| <5> loading configuration from 'client.config'
[  0.000260] CPU 01| <5> process pid: 755803
[  0.013777] CPU 01| <5> net: started network stack
[  0.013792] CPU 01| <5> net: using the following configuration:
[  0.013794] CPU 01| <5>   addr:    192.168.2.3
[  0.013798] CPU 01| <5>   netmask: 255.255.255.0
[  0.013800] CPU 01| <5>   gateway: 192.168.2.1
[  0.013803] CPU 01| <5>   mac: CA:96:26:97:38:AB
[  0.013872] CPU 01| <5> thread: created thread 0
[  0.013921] CPU 01| <5> spawning 4 kthreads
[  0.013992] CPU 02| <5> thread: created thread 1
[  0.014013] CPU 00| <5> thread: created thread 2
[  0.014040] CPU 01| <5> thread: created thread 3
Distribution, Target, Actual, Dropped, Never Sent, Median, 90th, 99th, 99.9th, 99.99th, Start
zero, 1000, 0, 0, 0, 1617896051
[ 24.512432] CPU 01| <5> init: shutting down -> SUCCESS

joshuafried commented 3 years ago

Great. The client output seems to indicate a connectivity issue. Distribution is the distribution from which the service time of each request is drawn (zero means no synthetic work is performed, and the packet is simply sent back). The target QPS is 1000, and the actual achieved QPS is 0. You may need to make sure that DPDK is selecting the correct NICs/ports and that the IP addresses you are using do not conflict with others on the network. You can recompile everything with debugging enabled (https://github.com/shenango/caladan/blob/068f30e0d1d63ee745b3f03a5e2b7be560222fc2/build/config#L10) to get more verbose output, which may help you trace where packets are not being received or sent.
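
As a rough illustration of reading a result row this way, the following Python sketch labels only the first three fields (Distribution, Target, Actual), since those are the only ones explained above. The helper name `summarize_row` is hypothetical and not part of Caladan, and the remaining fields are left unlabeled because the short row printed here has fewer fields than the header.

```python
def summarize_row(row: str) -> dict:
    """Label the first three comma-separated fields of a synthetic result row.

    Assumption: the row starts with Distribution, Target QPS, Actual QPS,
    as described in this thread. Any further fields are returned unlabeled.
    """
    fields = [field.strip() for field in row.split(",")]
    if len(fields) < 3:
        raise ValueError("expected at least Distribution, Target, Actual")

    target_qps = float(fields[1])
    actual_qps = float(fields[2])
    return {
        "distribution": fields[0],
        "target_qps": target_qps,
        "actual_qps": actual_qps,
        # A nonzero target with zero achieved QPS suggests no replies arrived.
        "no_traffic": target_qps > 0 and actual_qps == 0,
        "unlabeled": fields[3:],
    }


if __name__ == "__main__":
    # Row copied from the client output posted in this thread.
    print(summarize_row("zero, 1000, 0, 0, 0, 1617896051"))
```

A row where `no_traffic` is true matches the situation above: requests were targeted at 1000 QPS but none completed, which points to connectivity rather than to the application itself.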

Yi-ran commented 3 years ago

The synthetic application ran correctly after I fixed the connectivity issue. Thanks a lot!